Normalized to: Lyon, R.
[1]
oai:arXiv.org:2002.12386 [pdf] - 2065446
Imbalance Learning for Variable Star Classification
Submitted: 2020-02-27
The accurate automated classification of variable stars into their respective
sub-types is difficult. Machine learning based solutions often fall foul of the
imbalanced learning problem, which causes poor generalisation performance in
practice, especially on rare variable star sub-types. In previous work, we
attempted to overcome such deficiencies via the development of a hierarchical
machine learning classifier. This 'algorithm-level' approach to tackling
imbalance yielded promising results on Catalina Real-Time Transient Survey (CRTS) data,
outperforming the binary and multi-class classification schemes previously
applied in this area. In this work, we attempt to further improve hierarchical
classification performance by applying 'data-level' approaches to directly
augment the training data so that they better describe under-represented
classes. We apply and report results for three data augmentation methods in
particular: $\textit{R}$andomly $\textit{A}$ugmented $\textit{S}$ampled
$\textit{L}$ight curves from magnitude $\textit{E}$rror ($\texttt{RASLE}$),
augmenting light curves with Gaussian Process modelling ($\texttt{GpFit}$) and
the Synthetic Minority Over-sampling Technique ($\texttt{SMOTE}$). When
combining the 'algorithm-level' (i.e. the hierarchical scheme) together with
the 'data-level' approach, we further improve variable star classification
accuracy by 1-4$\%$. We found that a higher classification rate is obtained
when using $\texttt{GpFit}$ in the hierarchical model. Further improvement of
the metric scores requires a better standard set of correctly identified
variable stars and, perhaps, enhanced features.
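
As a rough illustration of the 'data-level' idea, the interpolation step at the heart of SMOTE can be sketched as below. This is a minimal sketch and not the authors' implementation: the function name `smote_sample`, its parameters, and the toy feature vectors are all hypothetical, and the features are assumed to be plain numeric tuples.

```python
import random

def smote_sample(minority, k=2, n_new=5, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority point and one of its k nearest minority
    neighbours (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, ranked by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nn = rng.choice(neighbours)
        u = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic

# Toy minority class: four 2-D feature vectors (made-up values).
rare_class = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(rare_class, k=2, n_new=5)
```

Each synthetic point lies on the line segment between two existing minority points, so the augmented class stays inside the region its real members already occupy.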
[2]
oai:arXiv.org:1907.08189 [pdf] - 1925160
Comparing Multi-class, Binary and Hierarchical Machine Learning
Classification schemes for variable stars
Submitted: 2019-07-18
Upcoming synoptic surveys are set to generate an unprecedented amount of
data. This requires an automatic framework that can quickly and efficiently
provide classification labels for several new object classification challenges.
Using data describing 11 types of variable stars from the Catalina Real-Time
Transient Survey (CRTS), we illustrate how to capture the most important
information from computed features and describe detailed methods of how to
robustly use Information Theory for feature selection and evaluation. We apply
three Machine Learning (ML) algorithms and demonstrate how to optimize these
classifiers via cross-validation techniques. For the CRTS dataset, we find that
the Random Forest (RF) classifier performs best in terms of balanced-accuracy
and geometric means. We demonstrate substantially improved classification
results by converting the multi-class problem into a binary classification
task, achieving a balanced-accuracy rate of $\sim$99 per cent for the
classification of ${\delta}$-Scuti and Anomalous Cepheids (ACEP). Additionally,
we describe how classification performance can be improved via converting a
'flat-multi-class' problem into a hierarchical taxonomy. We develop a new
hierarchical structure and propose a new set of classification features,
enabling the accurate identification of subtypes of Cepheids, RR Lyrae and
eclipsing binary stars in CRTS data.
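
The two metrics named above can be sketched as follows, assuming their common definitions: balanced accuracy as the mean of the per-class recalls, and the geometric mean as the geometric mean of those same recalls. The abstract does not spell out the exact definitions used, so treat this as an illustration; the function names and the toy labels are hypothetical.

```python
import math
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

def geometric_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls (zero if any class is missed entirely)."""
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls.values()) ** (1 / len(recalls))

# Toy imbalanced example: a classifier that always predicts the common class.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 10
print(balanced_accuracy(y_true, y_pred), geometric_mean(y_true, y_pred))
```

In this toy case plain accuracy would be 0.8, while balanced accuracy is 0.5 and the geometric mean is 0, which is why such metrics are preferred when rare classes matter.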
[3]
oai:arXiv.org:1810.06012 [pdf] - 1766471
A Processing Pipeline for High Volume Pulsar Data Streams
Submitted: 2018-10-14
Pulsar data analysis pipelines have historically been comprised of bespoke
software systems, supporting the off-line analysis of data. However, modern data
acquisition systems are making off-line analyses impractical. They often output
multiple simultaneous high volume data streams, significantly increasing data
capture rates. This leads to the accumulation of large data volumes, which are
prohibitively expensive to retain. Maintaining processing capabilities when
off-line analysis becomes infeasible due to cost requires a shift to on-line
data processing. This paper makes four contributions facilitating this shift
with respect to the search for radio pulsars: i) it characterises, for the
modern era, the key components of a pulsar search science (not signal
processing) pipeline, ii) it examines the feasibility of implementing on-line
pulsar search via existing tools, iii) it identifies and explains problems
preventing an easy transition to on-line search, and finally iv) it provides the
design for a new prototype pipeline capable of overcoming such problems.
Realised using Commercial off-the-shelf (COTS) software components, the
deployable system is open source, simple, scalable, and cheap to produce. It
has the potential to achieve pulsar search design requirements for the Square
Kilometre Array (SKA), illustrated via testing under simulated SKA loads.
[4]
oai:arXiv.org:1808.05424 [pdf] - 1734317
Single-pulse classifier for the LOFAR Tied-Array All-sky Survey
Michilli, D.;
Hessels, J. W. T.;
Lyon, R. J.;
Tan, C. M.;
Bassa, C.;
Cooper, S.;
Kondratiev, V. I.;
Sanidas, S.;
Stappers, B. W.;
van Leeuwen, J.
Submitted: 2018-08-16
Searches for millisecond-duration, dispersed single pulses have become a
standard tool used during radio pulsar surveys in the last decade. They have
enabled the discovery of two new classes of sources: rotating radio transients
and fast radio bursts. However, we are now in a regime where the sensitivity to
single pulses in radio surveys is often limited more by the strong background
of radio frequency interference (RFI, which can greatly increase the
false-positive rate) than by the sensitivity of the telescope itself. To
mitigate this problem, we introduce the Single-pulse Searcher (SpS). This is a
new machine-learning classifier designed to identify astrophysical signals in a
strong RFI environment, and optimized to process the large data volumes
produced by the new generation of aperture array telescopes. It has been
specifically developed for the LOFAR Tied-Array All-Sky Survey (LOTAAS), an
ongoing survey for pulsars and fast radio transients in the northern
hemisphere. During its development, SpS discovered 7 new pulsars and blindly
identified ~80 known sources. The modular design of the software offers the
possibility to easily adapt it to other studies with different instruments and
characteristics. Indeed, SpS has already been used in other projects, e.g. to
identify pulses from the fast radio burst source FRB 121102. The software
development is complete and SpS is now being used to re-process all LOTAAS data
collected to date.
[5]
oai:arXiv.org:1712.01008 [pdf] - 1736241
Pulsar Searches with the SKA
Levin, L.;
Armour, W.;
Baffa, C.;
Barr, E.;
Cooper, S.;
Eatough, R.;
Ensor, A.;
Giani, E.;
Karastergiou, A.;
Karuppusamy, R.;
Keith, M.;
Kramer, M.;
Lyon, R.;
Mackintosh, M.;
Mickaliger, M.;
van Nieuwpoort, R.;
Pearson, M.;
Prabu, T.;
Roy, J.;
Sinnen, O.;
Spitler, L.;
Spreeuw, H.;
Stappers, B. W.;
van Straten, W.;
Williams, C.;
Wang, H.;
Wiesner, K.
Submitted: 2017-12-04
The Square Kilometre Array will be an amazing instrument for pulsar
astronomy. While the full SKA will be sensitive enough to detect all pulsars in
the Galaxy visible from Earth, already with SKA1, pulsar searches will discover
enough pulsars to increase the currently known population by a factor of four,
no doubt including a range of amazing unknown sources. Real time processing is
needed to deal with the 60 PB of pulsar search data collected per day, using a
signal processing pipeline required to perform more than 10 POps. Here we
present the suggested design of the pulsar search engine for the SKA and
discuss challenges and solutions to the pulsar search venture.
[6]
oai:arXiv.org:1710.03513 [pdf] - 1736195
Fifty Years of Candidate Pulsar Selection - What next?
Submitted: 2017-10-10
For fifty years astronomers have been searching for pulsar signals in
observational data. Throughout this time the process of choosing detections
worthy of investigation, so-called candidate selection, has been effective,
yielding thousands of pulsar discoveries. Yet in recent years technological
advances have permitted the proliferation of pulsar-like candidates, straining
our candidate selection capabilities, and ultimately reducing selection
accuracy. To overcome such problems, we now apply intelligent machine learning
tools. Whilst these have achieved success, candidate volumes continue to
increase, and our methods have to evolve to keep pace with the change. This
talk considers how to meet this challenge as a community.
[7]
oai:arXiv.org:1603.05166 [pdf] - 1396952
Fifty Years of Pulsar Candidate Selection: From simple filters to a new
principled real-time classification approach
Submitted: 2016-03-16
Improving survey specifications are causing an exponential rise in pulsar
candidate numbers and data volumes. We study the candidate filters used to
mitigate these problems during the past fifty years. We find that some existing
methods, such as applying constraints on the total number of candidates
collected per observation, may have detrimental effects on the success of
pulsar searches. Those methods immune to such effects are found to be
ill-equipped to deal with the problems associated with increasing data volumes
and candidate numbers, motivating the development of new approaches. We
therefore present a new method designed for on-line operation. It selects
promising candidates using a purpose-built tree-based machine learning
classifier, the Gaussian Hellinger Very Fast Decision Tree (GH-VFDT), and a new
set of features for describing candidates. The features have been chosen so as
to i) maximise the separation between candidates arising from noise and those
of probable astrophysical origin, and ii) be as survey-independent as possible.
Using these features our new approach can process millions of candidates in
seconds (~1 million every 15 seconds), with high levels of pulsar recall
(90%+). This technique is therefore applicable to the large volumes of data
expected to be produced by the Square Kilometre Array (SKA). Use of this
approach has assisted in the discovery of 20 new pulsars in data obtained
during the LOFAR Tied-Array All-Sky Survey (LOTAAS).
[8]
oai:arXiv.org:1504.05747 [pdf] - 983602
Phase-Occultation Nulling Coronagraphy
Submitted: 2015-04-22
The search for life via characterization of Earth-like planets in the
habitable zone is one of the key scientific objectives in Astronomy. We
describe a new phase-occulting (PO) interferometric nulling coronagraphy (NC)
approach. The PO-NC approach employs beamwalk and freeform optical surfaces
internal to the interferometer cavity to introduce a radially dependent plate
scale difference between each interferometer arm (optical path) that nulls the
central star at high contrast while transmitting the off-axis field. The design
is readily implemented on segmented-mirror telescope architectures, utilizing a
single nulling interferometer to achieve high throughput, a small inner working
angle (IWA), sixth-order or higher starlight suppression, and full off-axis
discovery space, a combination of features that other coronagraph designs
generally must trade. Unlike previous NC approaches, the PO-NC approach does
not require pupil shearing; this increases throughput and renders it less
sensitive to on-axis common-mode telescope errors, permitting relief of the
observatory stability required to achieve contrast levels of $\leq10^{-10}$.
Observatory operations are also simplified by removing the need for multiple
telescope rolls and shears to construct a high contrast image. The design goals
for a PO nuller are similar to other coronagraphs intended for direct detection
of habitable zone (HZ) exoEarth signal: contrasts on the order of $10^{-10}$ at
an IWA of $\leq3\lambda/D$ over $\geq10$% bandpass with a large ($>10$~m)
segmented aperture space-telescope operating in visible and near infrared
bands. This work presents an introduction to the PO nulling coronagraphy
approach based on its Visible Nulling Coronagraph (VNC) heritage and relation
to the radial shearing interferometer.
[9]
oai:arXiv.org:1405.2278 [pdf] - 821796
Hellinger Distance Trees for Imbalanced Streams
Submitted: 2014-05-09
Classifiers trained on data sets possessing an imbalanced class distribution
are known to exhibit poor generalisation performance. This is known as the
imbalanced learning problem. The problem becomes particularly acute when we
consider incremental classifiers operating on imbalanced data streams,
especially when the learning objective is rare class identification. As
accuracy may provide a misleading impression of performance on imbalanced data,
existing stream classifiers based on accuracy can suffer poor minority class
performance on imbalanced streams, with the result being low minority class
recall rates. In this paper we address this deficiency by proposing the use of
the Hellinger distance measure, as a very fast decision tree split criterion.
We demonstrate that by using Hellinger a statistically significant improvement
in recall rates on imbalanced data streams can be achieved, with an acceptable
increase in the false positive rate.
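
To illustrate why a Hellinger-based split criterion suits imbalanced data, the sketch below scores a candidate split by the Hellinger distance between the two class-conditional distributions over the split's branches. It is a minimal discrete-count sketch under that assumption, not the streaming implementation described in the paper; the function name and the counts are hypothetical.

```python
import math

def hellinger_split_score(pos_counts, neg_counts):
    """Hellinger distance between the distributions of positive and negative
    examples over the branches of one candidate split.

    pos_counts[j] and neg_counts[j] are the numbers of positive and negative
    examples falling into branch j (hypothetical inputs)."""
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    if total_pos == 0 or total_neg == 0:
        return 0.0  # one class absent: the split cannot be scored
    return math.sqrt(sum(
        (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
        for p, n in zip(pos_counts, neg_counts)
    ))

# A split that isolates 10 rare positives from 1000 negatives scores the
# maximum value, sqrt(2); a split that divides both classes evenly scores 0.
separating = hellinger_split_score([10, 0], [0, 1000])
uninformative = hellinger_split_score([5, 5], [500, 500])
```

Because each class is normalised by its own total, the score depends only on how the split separates the classes and not on how rare the minority class is.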
[10]
oai:arXiv.org:1307.8012 [pdf] - 1515668
A Study on Classification in Imbalanced and Partially-Labelled Data
Streams
Submitted: 2013-07-30
The domain of radio astronomy is currently facing significant computational
challenges, foremost amongst which are those posed by the development of the
world's largest radio telescope, the Square Kilometre Array (SKA). Preliminary
specifications for this instrument suggest that the final design will
incorporate between 2000 and 3000 individual 15 metre receiving dishes, which
together can be expected to produce a data rate of many TB/s. Given such a high
data rate, it becomes crucial to consider how this information will be
processed and stored to maximise its scientific utility. In this paper, we
consider one possible data processing scenario for the SKA, for the purposes of
an all-sky pulsar survey. In particular we treat the selection of promising
signals from the SKA processing pipeline as a data stream classification
problem. We consider the feasibility of classifying signals that arrive via an
unlabelled and heavily class imbalanced data stream, using currently available
algorithms and frameworks. Our results indicate that existing stream learners
exhibit unacceptably low recall on real astronomical data when used in standard
configuration; however, good false positive performance and comparable accuracy
to static learners suggest they have definite potential as an on-line
solution to this particular big data challenge.
[11]
oai:arXiv.org:1011.5214 [pdf] - 268232
Stellar Imager (SI): developing and testing a predictive dynamo model
for the Sun by imaging other stars
Carpenter, Kenneth G.;
Schrijver, Carolus J.;
Karovska, Margarita;
Kraemer, Steve;
Lyon, Richard;
Mozurkewich, David;
Airapetian, Vladimir;
Adams, John C.;
Allen, Ronald J.;
Brown, Alex;
Bruhweiler, Fred;
Conti, Alberto;
Christensen-Dalsgaard, Joergen;
Cranmer, Steve;
Cuntz, Manfred;
Danchi, William;
Dupree, Andrea;
Elvis, Martin;
Evans, Nancy;
Giampapa, Mark;
Harper, Graham;
Hartman, Kathy;
Labeyrie, Antoine;
Leitner, Jesse;
Lillie, Chuck;
Linsky, Jeffrey L.;
Lo, Amy;
Mighell, Ken;
Miller, David;
Noecker, Charlie;
Parrish, Joe;
Phillips, Jim;
Rimmele, Thomas;
Saar, Steve;
Sasselov, Dimitar;
Stahl, H. Philip;
Stoneking, Eric;
Strassmeier, Klaus;
Walter, Frederick;
Windhorst, Rogier;
Woodgate, Bruce;
Woodruff, Robert
Submitted: 2010-11-23
The Stellar Imager mission concept is a space-based UV/Optical interferometer
designed to resolve surface magnetic activity and subsurface structure and
flows of a population of Sun-like stars, in order to accelerate the development
and validation of a predictive dynamo model for the Sun and enable accurate
long-term forecasting of solar/stellar magnetic activity.
[12]
oai:arXiv.org:0904.0941 [pdf] - 542628
Advanced Technology Large-Aperture Space Telescope (ATLAST): A
Technology Roadmap for the Next Decade
Postman, Marc;
Argabright, Vic;
Arnold, Bill;
Aronstein, David;
Atcheson, Paul;
Blouke, Morley;
Brown, Tom;
Calzetti, Daniela;
Cash, Webster;
Clampin, Mark;
Content, Dave;
Dailey, Dean;
Danner, Rolf;
Doxsey, Rodger;
Ebbets, Dennis;
Eisenhardt, Peter;
Feinberg, Lee;
Fruchter, Andrew;
Giavalisco, Mauro;
Glassman, Tiffany;
Gong, Qian;
Green, James;
Grunsfeld, John;
Gull, Ted;
Hickey, Greg;
Hopkins, Randall;
Hraba, John;
Hyde, Tupper;
Jordan, Ian;
Kasdin, Jeremy;
Kendrick, Steve;
Kilston, Steve;
Koekemoer, Anton;
Korechoff, Bob;
Krist, John;
Mather, John;
Lillie, Chuck;
Lo, Amy;
Lyon, Rick;
McCullough, Peter;
Mosier, Gary;
Mountain, Matt;
Oegerle, Bill;
Pasquale, Bert;
Purves, Lloyd;
Penera, Cecelia;
Polidan, Ron;
Redding, Dave;
Sahu, Kailash;
Saif, Babak;
Sembach, Ken;
Shull, Mike;
Smith, Scott;
Sonneborn, George;
Spergel, David;
Stahl, Phil;
Stapelfeldt, Karl;
Thronson, Harley;
Thronton, Gary;
Townsend, Jackie;
Traub, Wesley;
Unwin, Steve;
Valenti, Jeff;
Vanderbei, Robert;
Werner, Michael;
Wesenberg, Richard;
Wiseman, Jennifer;
Woodgate, Bruce
Submitted: 2009-04-06, last modified: 2009-05-08
The Advanced Technology Large-Aperture Space Telescope (ATLAST) is a set of
mission concepts for the next generation of UVOIR space observatory with a
primary aperture diameter in the 8-m to 16-m range that will allow us to
perform some of the most challenging observations to answer some of our most
compelling questions, including "Is there life elsewhere in the Galaxy?" We
have identified two different telescope architectures, but with similar optical
designs, that span the range in viable technologies. The architectures are a
telescope with a monolithic primary mirror and two variations of a telescope
with a large segmented primary mirror. This approach provides us with several
pathways to realizing the mission, which will be narrowed to one as our
technology development progresses. The concepts invoke heritage from HST and
JWST design, but also take significant departures from these designs to
minimize complexity, mass, or both.
Our report provides details on the mission concepts, shows the extraordinary
scientific progress they would enable, and describes the most important
technology development items. These are the mirrors, the detectors, and the
high-contrast imaging technologies, whether internal to the observatory, or
using an external occulter. Experience with JWST has shown that determined
competitors, motivated by the development contracts and flight opportunities of
the new observatory, are capable of achieving huge advances in technical and
operational performance while keeping construction costs on the same scale as
prior great observatories.
[13]
oai:arXiv.org:0712.1105 [pdf] - 7841
Externally Occulted Terrestrial Planet Finder Coronagraph: Simulations
and Sensitivities
Submitted: 2007-12-07
A multitude of coronagraphic techniques for the space-based direct detection
and characterization of exo-solar terrestrial planets are actively being
pursued by the astronomical community. Typical coronagraphs have internal
shaped focal plane and/or pupil plane occulting masks which block and/or
diffract starlight thereby increasing the planet's contrast with respect to its
parent star. Past studies have shown that any internal technique is limited by
the ability to sense and control amplitude, phase (wavefront) and polarization
to exquisite levels - necessitating stressing optical requirements. An
alternative and promising technique is to place a starshade, i.e. external
occulter, at some distance in front of the telescope. This starshade suppresses
most of the starlight before entering the telescope - relaxing optical
requirements to that of a more conventional telescope. While an old technique,
it has recently been advanced by the recognition that circularly symmetric
graded apodizers can be well approximated by shaped binary occulting masks.
Indeed, optimal shapes have been designed that can achieve smaller inner working
angles than conventional coronagraphs and yet have high effective throughput
allowing smaller aperture telescopes to achieve the same coronagraphic
resolution and similar sensitivity as larger ones.
Herein we report on our ongoing modeling, simulation and optimization of
external occulters. We show sensitivity results with respect to the number and
shape errors of petals, spectral passband, and accuracy of Fresnel propagation,
present results for both filled and segmented aperture telescopes, and discuss
acquisition and sensing of the occulter's location relative to the telescope.
[14]
oai:arXiv.org:astro-ph/0212439 [pdf] - 53829
The Fizeau Interferometer Testbed
Submitted: 2002-12-19
The Fizeau Interferometer Testbed (FIT) is a collaborative effort between
NASA's Goddard Space Flight Center, the Naval Research Laboratory, Sigma Space
Corporation, and the University of Maryland. The testbed will be used to
explore the principles of and the requirements for the full, as well as the
pathfinder, Stellar Imager mission concept. It has a long term goal of
demonstrating closed-loop control of a sparse array of numerous articulated
mirrors to keep optical beams in phase and optimize interferometric synthesis
imaging. In this paper we present the optical and data acquisition system
design of the testbed, and discuss the wavefront sensing and control algorithms
to be used. Currently we have completed the initial design and hardware
procurement for the FIT. The assembly and testing of the Testbed will be
underway at Goddard's Instrument Development Lab in the coming months.
[15]
oai:arXiv.org:astro-ph/0210046 [pdf] - 52084
The Extra-Solar Planet Imager (ESPI)
Nisenson, P.;
Melnick, G. J.;
Geary, J.;
Holman, M.;
Korzennik, S. G.;
Noyes, R. W.;
Papaliolios, C.;
Sasselov, D. D.;
Fischer, D.;
Gezari, D.;
Lyon, R. G.;
Gonsalves, R.;
Hardesty, C.;
Harwit, M.;
Marley, M. S.;
Neufeld, D. A.;
Ridgway, S. T.
Submitted: 2002-10-02
ESPI has been proposed for direct imaging and spectral analysis of giant
planets orbiting solar-type stars. ESPI extends the concept suggested by
Nisenson and Papaliolios (2001) for a square aperture apodized telescope that
has sufficient dynamic range to directly detect exo-planets. With a 1.5 m
square mirror, ESPI can deliver high dynamic range imagery as close as 0.3
arcseconds to bright sources, permitting a sensitive search for exoplanets
around nearby stars and a study of their characteristics in reflected light.
[16]
oai:arXiv.org:astro-ph/0111386 [pdf] - 46152
The X-ray R Aquarii: A Two-sided Jet and Central Source
Submitted: 2001-11-20
We report Chandra ACIS-S3 x-ray imaging and spectroscopy of the R Aquarii
binary system that show a spatially resolved two-sided jet and an unresolved
central source. This is the first published report of such an x-ray jet seen in
an evolved stellar system comprised of ~2-3 solar masses. At E < 1 keV, the
x-ray jet extends both to the northeast and southwest relative to the central
binary system. At 1 < E < 7.1 keV, R Aqr is a point-like source centered on the
star system. While both 3.5-cm radio continuum emission and x-ray emission
appear coincident in projection and have maximum intensities at ~7.5" northeast
of the central binary system, the next strongest x-ray component is located
~30" southwest of the central binary system and has no radio continuum
counterpart. The x-ray jets are likely shock heated in the recent past, and are
not in thermal equilibrium. The strongest southwest x-ray jet component may
have been shocked recently since there is no relic radio emission as expected
from an older shock. At the position of the central binary, we detect x-ray
emission below 1.6 keV consistent with blackbody emission at T ~2 x 10^6 K. At
the central star there is also a prominent 6.4 keV feature, a possible
fluorescence or collisionally excited Fe K-alpha line from an accretion disk or
from the wind of the giant star. For this excitation to occur, there must be an
unseen hard source of x-rays or particles in the immediate vicinity of the hot
star. Such a source would be hidden from view by the surrounding edge-on
accretion disk.