Normalized to: Estevez, P.
[1]
oai:arXiv.org:1709.03541 [pdf] - 1685055
Robust period estimation using mutual information for multi-band light
curves in the synoptic survey era
Submitted: 2017-09-11
The Large Synoptic Survey Telescope (LSST) will produce an unprecedented
number of light curves using six optical bands. Robust and efficient methods
that can aggregate data from multidimensional sparsely-sampled time series are
needed. In this paper we present a new method for light curve period estimation
based on the quadratic mutual information (QMI). The proposed method does not
assume a particular model for the light curve nor its underlying probability
density and it is robust to non-Gaussian noise and outliers. By combining the
QMI from several bands the true period can be estimated even when no
single-band QMI yields the period. Period recovery performance as a function of
average magnitude and sample size is measured using 30,000 synthetic multi-band
light curves of RR Lyrae and Cepheid variables generated by the LSST Operations
and Catalog simulators. The results show that aggregating information from
several bands is highly beneficial in LSST sparsely-sampled time series,
obtaining an absolute increase in period recovery rate of up to 50%. We also show
that the QMI is more robust to noise and light curve length (sample size) than
the multiband generalizations of the Lomb-Scargle and Analysis of Variance
periodograms, recovering the true period in 10-30% more cases than its
competitors. A Python package containing efficient Cython implementations of
the QMI and other methods is provided.
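
As an illustration of the idea in this abstract, the following is a minimal Python sketch, not the authors' package. It assumes the Cauchy-Schwarz form of the QMI between folded phase and magnitude, a Gaussian kernel on magnitudes with a Silverman-style bandwidth, a periodic kernel on time differences, and an unweighted sum over bands; the abstract specifies none of these choices. The function names `cs_qmi` and `multiband_qmi` and the synthetic two-band data are hypothetical.

```python
import numpy as np


def cs_qmi(t, m, period, sigma_phase=0.3):
    """Kernel estimate of a Cauchy-Schwarz QMI between phase and magnitude.

    Assumption: Gaussian kernel on magnitudes, periodic kernel on phases.
    """
    n = len(t)
    # Silverman-style bandwidth for the magnitude kernel (an assumption).
    sigma_m = 0.9 * np.std(m) * n ** (-0.2)
    dm = m[:, None] - m[None, :]
    k_mag = np.exp(-0.5 * (dm / sigma_m) ** 2)
    # Periodic kernel: depends only on the phase difference at this period.
    dt = t[:, None] - t[None, :]
    k_phi = np.exp(-2.0 * np.sin(np.pi * dt / period) ** 2 / sigma_phase ** 2)
    # Information potentials: joint, product of marginals, and cross term.
    v_joint = np.mean(k_mag * k_phi)
    v_marg = np.mean(k_mag) * np.mean(k_phi)
    v_cross = np.mean(k_mag.mean(axis=1) * k_phi.mean(axis=1))
    return np.log(v_joint) + np.log(v_marg) - 2.0 * np.log(v_cross)


def multiband_qmi(bands, periods):
    """Sum the per-band QMI over all bands (unweighted sum is an assumption)."""
    total = np.zeros(len(periods))
    for t, m in bands.values():
        total += np.array([cs_qmi(t, m, p) for p in periods])
    return total


# Synthetic two-band example (hypothetical data, sinusoidal light curves).
rng = np.random.default_rng(0)
true_period = 0.7


def make_band(n, mean_mag, amplitude):
    t = np.sort(rng.uniform(0.0, 30.0, n))
    m = mean_mag + amplitude * np.sin(2.0 * np.pi * t / true_period)
    m = m + rng.normal(0.0, 0.02, n)
    return t, m


bands = {"g": make_band(80, 15.0, 0.5), "r": make_band(80, 14.6, 0.3)}
periods = np.linspace(0.5, 0.9, 2001)
combined = multiband_qmi(bands, periods)
best_period = periods[np.argmax(combined)]
print(f"best trial period: {best_period:.4f}")
```

Here `best_period` is the trial period that maximizes the summed score. The trial grid deliberately excludes multiples and sub-multiples of the simulated period, which a real search would have to handle.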
[2]
oai:arXiv.org:1509.07823 [pdf] - 1283333
Computational Intelligence Challenges and Applications on Large-Scale
Astronomical Time Series Databases
Submitted: 2015-09-25
Time-domain astronomy (TDA) is facing a paradigm shift caused by the
exponential growth of the sample size, data complexity and data generation
rates of new astronomical sky surveys. For example, the Large Synoptic Survey
Telescope (LSST), which will begin operations in northern Chile in 2022, will
generate a nearly 150 Petabyte imaging dataset of the southern hemisphere sky.
The LSST will stream data at rates of 2 Terabytes per hour, effectively
capturing an unprecedented movie of the sky. The LSST is expected not only to
improve our understanding of time-varying astrophysical objects, but also to
reveal a plethora of yet unknown faint and fast-varying phenomena. To cope with
a change of paradigm to data-driven astronomy, the fields of astroinformatics
and astrostatistics have been created recently. The new data-oriented paradigms
for astronomy combine statistics, data mining, knowledge discovery, machine
learning and computational intelligence, in order to provide the automated and
robust methods needed for the rapid detection and classification of known
astrophysical objects as well as the unsupervised characterization of novel
phenomena. In this article we present an overview of machine learning and
computational intelligence applications to TDA. Future big data challenges and
new lines of research in TDA, focusing on the LSST, are identified and
discussed from the viewpoint of computational intelligence/machine learning.
Interdisciplinary collaboration will be required to cope with the challenges
posed by the deluge of astronomical data coming from the LSST.
[3]
oai:arXiv.org:1509.07093 [pdf] - 1282318
A review of learning vector quantization classifiers
Submitted: 2015-09-23
In this work we present a review of the state of the art of Learning Vector
Quantization (LVQ) classifiers. A taxonomy is proposed which integrates the
most relevant LVQ approaches to date. The main concepts associated with modern
LVQ approaches are defined. A comparison is made among eleven LVQ classifiers
using one real-world and two artificial datasets.
[4]
oai:arXiv.org:1412.1840 [pdf] - 1282187
A Novel, Fully Automated Pipeline for Period Estimation in the EROS 2
Data Set
Submitted: 2014-12-04
We present a new method to discriminate periodic from non-periodic
irregularly sampled lightcurves. We introduce a periodic kernel and maximize a
similarity measure derived from information theory to estimate the periods and
a discriminator factor. We tested the method on a dataset containing 100,000
synthetic periodic and non-periodic lightcurves with various periods,
amplitudes and shapes generated using a multivariate generative model. We
correctly identified periodic and non-periodic lightcurves with a completeness
of 90% and a precision of 95%, for lightcurves with a signal-to-noise ratio
(SNR) larger than 0.5. We characterized the efficiency and reliability of the
model using these synthetic lightcurves and applied the method to the EROS-2
dataset. A crucial consideration is the speed at which the method can be
executed. Using hierarchical search and some simplifications of the parameter
search we were able to analyze 32.8 million lightcurves in 18 hours on a
cluster of GPGPUs. Using the sensitivity analysis on the synthetic dataset, we
infer that 0.42% of the sources in the LMC and 0.61% of those in the SMC show
periodic behavior. The training set, the catalogs and source code are all
available at http://timemachine.iic.harvard.edu.
[5]
oai:arXiv.org:1212.2398 [pdf] - 903316
An Information Theoretic Algorithm for Finding Periodicities in Stellar
Light Curves
Submitted: 2012-12-11
We propose a new information theoretic metric for finding periodicities in
stellar light curves. Light curves are astronomical time series of stellar
brightness, and are typically noisy and unevenly sampled. The
proposed metric combines correntropy (generalized correlation) with a periodic
kernel to measure similarity among samples separated by a given period. The new
metric provides a periodogram, called Correntropy Kernelized Periodogram (CKP),
whose peaks are associated with the fundamental frequencies present in the
data. The CKP does not require any resampling, slotting or folding scheme as it
is computed directly from the available samples. CKP is the main part of a
fully-automated pipeline for periodic light curve discrimination to be used in
astronomical survey databases. We show that the CKP method outperforms
slotted correntropy and conventional methods used in astronomy on periodicity
discrimination and period estimation tasks, using a set of light curves drawn
from the MACHO survey. The proposed metric achieved 97.2% true positives
with 0% false positives at a confidence level of 99% for the periodicity
discrimination task, and 88% hits with 11.6% multiples and 0.4% misses
in the period estimation task.
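
The combination of correntropy with a periodic kernel described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the published CKP: the kernel forms, the bandwidths, the centering by the mean kernel value, and the name `periodic_correntropy` are all assumptions made here. The score is computed directly from the irregular samples, with no resampling, slotting or folding.

```python
import numpy as np


def periodic_correntropy(t, y, periods, sigma_t=0.3):
    """Score each trial period by kernel similarity of samples one period apart.

    Assumption: Gaussian kernel on brightness differences, centered by its
    mean, weighted by a periodic kernel on time differences.
    """
    n = len(t)
    # Silverman-style bandwidth for the brightness kernel (an assumption).
    sigma_m = 0.9 * np.std(y) * n ** (-0.2)
    dy = y[:, None] - y[None, :]
    dt = t[:, None] - t[None, :]
    g = np.exp(-0.5 * (dy / sigma_m) ** 2)
    g_centered = g - g.mean()
    scores = np.empty(len(periods))
    for k, p in enumerate(periods):
        # Close to 1 when the time difference is near a multiple of p.
        k_per = np.exp(-2.0 * np.sin(np.pi * dt / p) ** 2 / sigma_t ** 2)
        scores[k] = np.mean(g_centered * k_per)
    return scores


# Hypothetical unevenly sampled sinusoid with a period of 2.0 time units.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 50.0, 150))
y = np.sin(2.0 * np.pi * t / 2.0) + rng.normal(0.0, 0.1, t.size)

periods = np.linspace(0.5, 3.5, 1501)
scores = periodic_correntropy(t, y, periods)
best_period = periods[np.argmax(scores)]
print(f"best trial period: {best_period:.3f}")
```

Integer multiples of the true period also select pairs of samples at matching phase, so a score of this kind can peak there as well, which is consistent with the "multiples" category reported in the abstract. The trial grid above stops short of twice the simulated period for that reason.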