Normalized to: Pichara, K.
[1]
oai:arXiv.org:2005.05404 [pdf] - 2124844
The VVV Infrared Variability Catalog (VIVA-I)
Lopes, C. E. Ferreira;
Cross, N. J. G.;
Catelan, M.;
Minniti, D.;
Hempel, M.;
Lucas, P. W.;
Angeloni, R.;
Jablonsky, F.;
Braga, V. F.;
Leao, I. C.;
Herpich, F. R.;
Alonso-Garcia, J.;
Papageorgiou, A.;
Pichara, K.;
Saito, R. K.;
Bradley, A.;
Beamin, J. C.;
Cortes, C.;
De Medeiros, J. R.;
Russell, Christopher. M. P.
Submitted: 2020-05-11
Thanks to the VISTA Variables in the Via Lactea (VVV) ESO Public Survey it is
now possible to explore a large number of objects in those regions. This paper
addresses the variability analysis of all VVV point sources having more than 10
observations in VVVDR4 using a novel approach. In total, the near-IR light
curves of 288,378,769 sources were analysed using methods developed in the New
Insight Into Time Series Analysis project. As a result, we present a complete
sample having 44,998,752 variable star candidates (VVV-CVSC), which include
accurate individual coordinates, near-IR magnitudes (ZYJHKs), extinctions
A(Ks), variability indices, periods, amplitudes, among other parameters to
assess the science. Unfortunately, a side effect of having a highly complete
sample is a high level of contamination by non-variables (the contamination
ratio of non-variables to variables is slightly over 10:1). To
deal with this, we also provide some flags and parameters that can be used by
the community to decrease the number of variable candidates without heavily
decreasing the completeness of the sample. In particular, we cross-identified
339,601 of our sources with Simbad and AAVSO databases, which provide us with
information for these objects at other wavelengths. This sub-sample constitutes
a unique resource to study the corresponding near-IR variability of known
sources as well as to assess the IR variability related with X-ray and
Gamma-Ray sources. On the other hand, the other 99.5% of sources in our sample
constitute a number of potentially new objects with variability information
for the heavily crowded and reddened regions of the Galactic Plane and Bulge.
The present results also provide an important queryable resource to perform
variability analysis and to characterize ongoing and future surveys like TESS
and LSST.
[2]
oai:arXiv.org:2004.06226 [pdf] - 2077490
Classifying CMB time-ordered data through deep neural networks
Submitted: 2020-04-13
The Cosmic Microwave Background (CMB) has been measured over a wide range of
multipoles. Experiments with arc-minute resolution like the Atacama Cosmology
Telescope (ACT) have contributed to the measurement of primary and secondary
anisotropies, leading to remarkable scientific discoveries. Such findings
require careful data selection in order to remove poorly-behaved detectors and
unwanted contaminants. The current data classification methodology used by ACT
relies on several statistical parameters that are assessed and fine-tuned by an
expert. This method is highly time-consuming and band or season-specific, which
makes it less scalable and efficient for future CMB experiments. In this work,
we propose a supervised machine learning model to classify detectors of CMB
experiments. The model corresponds to a deep convolutional neural network. We
tested our method on real ACT data, using the 2008 season, 148 GHz, as training
set with labels provided by the ACT data selection software. The model learns
to classify time-streams starting directly from the raw data. For the season
and frequency considered during the training, we find that our classifier
reaches a precision of 99.8%. For 220 and 280 GHz data, season 2008, we
obtained 99.4% and 97.5% of precision, respectively. Finally, we performed a
cross-season test over 148 GHz data from 2009 and 2010 for which our model
reaches a precision of 99.8% and 99.5%, respectively. Our model is about 10x
faster than the current pipeline, making it potentially suitable for real-time
implementations.
[3]
oai:arXiv.org:2002.00994 [pdf] - 2046524
Scalable End-to-end Recurrent Neural Network for Variable star
classification
Submitted: 2020-02-03
During the last decade, considerable effort has been made to perform
automatic classification of variable stars using machine learning techniques.
Traditionally, light curves are represented as a vector of descriptors or
features used as input for many algorithms. Some features are computationally
expensive and cannot be updated quickly, and hence cannot be applied to large
datasets such as the LSST. Previous work has been done to develop alternative
unsupervised feature extraction algorithms for light curves, but the cost of
doing so still remains high. In this work, we propose an end-to-end algorithm
that automatically learns the representation of light curves that allows an
accurate automatic classification. We study a series of deep learning
architectures based on Recurrent Neural Networks and test them in automated
classification scenarios. Our method uses minimal data preprocessing, can be
updated with a low computational cost for new observations and light curves,
and can scale up to massive datasets. We transform each light curve into an
input matrix representation whose elements are the differences in time and
magnitude, and the outputs are classification probabilities. We test our method
in three surveys: OGLE-III, Gaia and WISE. We obtain accuracies of about $95\%$
in the main classes and $75\%$ in the majority of subclasses. We compare our
results with the Random Forest classifier and obtain competitive accuracies
while being faster and scalable. The analysis shows that the computational
complexity of our approach grows linearly with the light curve size, while
the traditional approach cost grows as $N\log{(N)}$.
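The input representation described in this abstract (differences in time and
magnitude) can be illustrated with a minimal sketch. The function name
to_delta_matrix and the sample values are hypothetical, and the paper may
apply further preprocessing that the abstract does not mention.

```python
def to_delta_matrix(times, mags):
    """Return one row [dt, dm] per pair of consecutive observations."""
    if len(times) != len(mags):
        raise ValueError("times and mags must have the same length")
    return [
        [times[i + 1] - times[i], mags[i + 1] - mags[i]]
        for i in range(len(times) - 1)
    ]


# Hypothetical, unevenly sampled light curve (time in days, magnitude).
times = [0.0, 1.5, 4.0, 4.5]
mags = [15.0, 15.5, 15.25, 15.0]

matrix = to_delta_matrix(times, mags)
for dt, dm in matrix:
    print(dt, dm)
```

A light curve with N observations yields N - 1 rows, and each row is computed
in constant time, so this step is linear in the light curve size.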
[4]
oai:arXiv.org:1912.02235 [pdf] - 2026504
Streaming Classification of Variable Stars
Submitted: 2019-12-04
In recent years, automatic classification of variable stars has received
substantial attention. Using machine learning techniques for this task has
proven to be quite useful. Typically, machine learning classifiers used for
this task require a fixed training set, and the training process is
performed offline. Upcoming surveys such as the Large Synoptic Survey Telescope
(LSST) will generate new observations daily, where an automatic classification
system able to create alerts online will be mandatory. A system with those
characteristics must be able to update itself incrementally. Unfortunately,
after training, most machine learning classifiers do not support the inclusion
of new observations in light curves; they need to be re-trained from scratch.
Naively re-training from scratch is not an option in streaming settings, mainly
because of the expensive pre-processing routines required to obtain a vector
representation of light curves (features) each time we include new
observations. In this work, we propose a streaming probabilistic classification
model; it uses a set of newly designed features that work incrementally. With
this model, we can have a machine learning classifier that updates itself in
real time with new observations. To test our approach, we simulate a streaming
scenario with light curves from CoRot, OGLE and MACHO catalogs. Results show
that our model achieves high classification performance, staying an order of
magnitude faster than traditional classification approaches.
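As an illustration of what a feature that "works incrementally" can look like,
the sketch below updates a running mean and variance with Welford's online
algorithm. This is a standard technique used here as a stand-in; the abstract
does not say which features the paper designs, and the class name RunningStats
is hypothetical.

```python
class RunningStats:
    """Mean and sample variance updated one observation at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's update: no past observations need to be stored.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


# Hypothetical stream of magnitudes arriving one at a time.
stats = RunningStats()
for mag in [15.0, 15.5, 15.25, 15.0]:
    stats.update(mag)

print(stats.mean, stats.variance)
```

Each update costs constant time and memory, which is the property a streaming
classifier needs from its features.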
[5]
oai:arXiv.org:1911.02444 [pdf] - 2026263
An Information Theory Approach on Deciding Spectroscopic Follow Ups
Submitted: 2019-11-06
Classification and characterization of variable phenomena and transient
phenomena are critical for astrophysics and cosmology. These objects are
commonly studied using photometric time series or spectroscopic data. Given
that many ongoing and future surveys are in the time domain, and given that adding
spectra provides further insights but requires more observational resources, it
would be valuable to know which objects we should prioritize for obtaining a
spectrum in addition to time series. We propose a methodology in a probabilistic
setting that determines a priori which objects are worth taking a spectrum of to
obtain better insights, where we take 'insight' to mean the type of the object
(classification). Objects whose spectra we query are reclassified
using their full spectrum information. We first train two classifiers, one that
uses photometric data and another that uses photometric and spectroscopic data
together. Then for each photometric object we estimate the probability of each
possible spectrum outcome. We combine these models in various probabilistic
frameworks (strategies) which are used to guide the selection of follow up
observations. The best strategy depends on the intended use, whether it is
getting more confidence or accuracy. For a given number of candidate objects
(127, equal to 5% of the dataset) for taking spectra, we improve class
prediction accuracy by 37%, as opposed to 20% for the best non-naive (non-random)
base-line strategy. Our approach provides a general framework for follow-up
strategies and can be extended beyond classification and to include other forms
of follow-ups beyond spectroscopy.
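One generic way to rank follow-up candidates in a probabilistic setting is by
expected information gain: the entropy of the photometric class distribution
minus the expected entropy after each possible spectrum outcome. The sketch
below shows only that textbook quantity; it is not necessarily any of the
strategies evaluated in the paper, and all names and numbers are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, outcome_probs, posteriors):
    """Expected entropy reduction from observing a spectrum.

    prior         -- class probabilities from photometry alone
    outcome_probs -- probability of each possible spectrum outcome
    posteriors    -- class probabilities given each outcome
    The caller is assumed to pass a consistent set, i.e. the prior equals
    the outcome-weighted average of the posteriors.
    """
    expected_posterior = sum(
        p_o * entropy(post) for p_o, post in zip(outcome_probs, posteriors)
    )
    return entropy(prior) - expected_posterior


# Object A: photometry is undecided and a spectrum would settle the class.
gain_a = expected_information_gain(
    [0.5, 0.5], [0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]]
)
# Object B: a spectrum is not expected to change the class probabilities.
gain_b = expected_information_gain([0.9, 0.1], [1.0], [[0.9, 0.1]])

ranking = sorted([("A", gain_a), ("B", gain_b)], key=lambda x: -x[1])
print(ranking)
```

Under this criterion object A would be queried first, since its spectrum is
expected to remove one full bit of class uncertainty.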
[6]
oai:arXiv.org:1903.03254 [pdf] - 1846983
An Algorithm for the Visualization of Relevant Patterns in Astronomical
Light Curves
Submitted: 2019-03-07
In recent years, the classification of variable stars with Machine
Learning has become a mainstream area of research. Recently, visualization of
time series is attracting more attention in data science as a tool to visually
help scientists to recognize significant patterns in complex dynamics. Within
the Machine Learning literature, dictionary-based methods have been widely used
to encode relevant parts of image data. These methods intrinsically assign a
degree of importance to patches in pictures, according to their contribution in
the image reconstruction. Inspired by dictionary-based techniques, we present
an approach that naturally provides the visualization of salient parts in
astronomical light curves, making the analogy between image patches and
relevant pieces in time series. Our approach encodes the most meaningful
patterns such that we can approximately reconstruct light curves by just using
the encoded information. We test our method in light curves from the OGLE-III
and StarLight databases. Our results show that the proposed model delivers an
automatic and intuitive visualization of relevant light curve parts, such as
local peaks and drops in magnitude.
[7]
oai:arXiv.org:1810.09440 [pdf] - 1775748
Deep multi-survey classification of variable stars
Submitted: 2018-10-21
During the last decade, a considerable amount of effort has been made to
classify variable stars using different machine learning techniques. Typically,
light curves are represented as vectors of statistical descriptors or features
that are used to train various algorithms. These features demand substantial
computational power and can take hours to days to compute, making it impossible to
create scalable and efficient ways of automatically classifying variable stars.
Also, light curves from different surveys cannot be integrated and analyzed
together when using features, because of observational differences. For
example, having variations in cadence and filters, feature distributions become
biased and require expensive data-calibration models. The vast amount of data
that will be generated soon makes it necessary to develop scalable machine learning
architectures without expensive integration techniques. Convolutional Neural
Networks have shown impressive results in raw image classification and
representation within the machine learning literature. In this work, we present
a novel Deep Learning model for light curve classification, mainly based on
convolutional units. Our architecture receives as input the differences between
time and magnitude of light curves. It captures the essential classification
patterns regardless of cadence and filter. In addition, we introduce a novel
data augmentation schema for unevenly sampled time series. We test our method
using three different surveys: OGLE-III; Corot; and VVV, which differ in
filters, cadence, and area of the sky. We show that besides the benefit of
scalability, our model obtains state-of-the-art levels of accuracy in light curve
classification benchmarks.
[8]
oai:arXiv.org:1802.02575 [pdf] - 1631950
New variable Stars from the Photographic Archive: Semi-automated
Discoveries, Attempts of Automatic Classification, and the New Field 104 Her
Submitted: 2018-02-07
Using 172 plates taken with the 40-cm astrograph of the Sternberg
Astronomical Institute (Lomonosov Moscow University) in 1976-1994 and digitized
with the resolution of 2400 dpi, we discovered and studied 275 new variable
stars. We present the list of our new variables with all necessary information
concerning their brightness variations. As in our earlier studies, the new
discoveries show a rather large number of high-amplitude Delta Scuti variables,
predicting that many stars of this type remain undetected across the whole sky.
We also performed automated classification of the newly discovered variable
stars based on the Random Forest algorithm. The results of the automated
classification were compared to traditional classification and showed that
automated classification was possible even with noisy photographic data.
However, further improvement of automated techniques is needed, which is
especially important having in mind the very large numbers of new discoveries
expected from all-sky surveys.
[9]
oai:arXiv.org:1801.09737 [pdf] - 1626567
Automatic Survey-Invariant Variable Star Classification
Submitted: 2018-01-29
Machine learning techniques have been successfully used to classify variable
stars on widely-studied astronomical surveys. These datasets have been
available to astronomers long enough, thus allowing them to perform deep
analysis over several variable sources and to generate useful catalogs with
identified variable stars. The products of these studies are labeled data that
enable supervised learning models to be trained successfully. However, when
these models are blindly applied to data from new sky surveys their performance
drops significantly. Furthermore, unlabeled data becomes available at a much
higher rate than its labeled counterpart, since labeling is a manual and
time-consuming effort. Domain adaptation techniques aim to learn from a domain
where labeled data is available, the \textit{source domain}, and through some
adaptation perform well on a different domain, the \textit{target domain}. We
propose a full probabilistic model that represents the joint distribution of
features from two surveys as well as a probabilistic transformation of the
features from one survey to the other. This allows us to transfer labeled
data to a study where it is not available and to effectively run a variable
star classification model in a new survey. Our model represents the features of
each domain as a Gaussian mixture and models the transformation as a
translation, rotation and scaling of each separate component. We perform tests
using three different variability catalogs: EROS, MACHO, and HiTS, presenting
differences among them, such as the number of observations per star, cadence,
observational time and optical bands observed, among others.
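The effect of a translation, rotation and scaling on one Gaussian component can
be written out directly: if x' = s R x + t, the mean becomes s R mu + t and the
covariance becomes s^2 R Sigma R^T. The sketch below applies this in two
dimensions. The function name and parameter values are hypothetical, and how
the paper learns these parameters is not covered here.

```python
import math


def transform_component(mean, cov, angle, scale, shift):
    """Apply x' = scale * R(angle) * x + shift to a 2-D Gaussian component."""
    c, s = math.cos(angle), math.sin(angle)
    rot = [[c, -s], [s, c]]
    new_mean = [
        scale * (rot[i][0] * mean[0] + rot[i][1] * mean[1]) + shift[i]
        for i in range(2)
    ]
    # R * cov
    rc = [
        [sum(rot[i][k] * cov[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]
    # scale^2 * (R * cov) * R^T
    new_cov = [
        [scale ** 2 * sum(rc[i][k] * rot[j][k] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]
    return new_mean, new_cov


# Hypothetical component: rotate by 90 degrees, double the scale, shift by (0, 1).
new_mean, new_cov = transform_component(
    mean=[1.0, 0.0],
    cov=[[4.0, 0.0], [0.0, 1.0]],
    angle=math.pi / 2,
    scale=2.0,
    shift=[0.0, 1.0],
)
print(new_mean, new_cov)
```

The translation moves only the mean, while rotation and scaling change both
the mean and the covariance of the component.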
[10]
oai:arXiv.org:1801.09723 [pdf] - 1626563
Unsupervised Classification of Variable Stars
Submitted: 2018-01-29
During the last ten years, a considerable amount of effort has been made to
develop algorithms for automatic classification of variable stars. That has
been primarily achieved by applying machine learning methods to photometric
datasets where objects are represented as light curves. Classifiers require
training sets to learn the underlying patterns that allow the separation among
classes. Unfortunately, building training sets is an expensive process that
demands a lot of human effort. Every time data comes from new surveys, the
only available training instances are the ones that have a cross-match with
previously labelled objects, consequently generating insufficient training sets
compared with the large amounts of unlabelled sources. In this work, we present
an algorithm that performs unsupervised classification of variable stars,
relying only on the similarity among light curves. We tackle the unsupervised
classification problem by proposing an untraditional approach. Instead of
trying to match classes of stars with clusters found by a clustering algorithm,
we propose a query based method where astronomers can find groups of variable
stars ranked by similarity. We also develop a fast similarity function specific
for light curves, based on a novel data structure that allows scaling the
search over the entire dataset of unlabelled objects. Experiments show that our
unsupervised model achieves high accuracy in the classification of different
types of variable stars and that the proposed algorithm scales up to massive
amounts of light curves.
[11]
oai:arXiv.org:1801.09732 [pdf] - 1626565
Uncertain classification of Variable Stars: handling observational GAPS
and noise
Submitted: 2018-01-29
Automatic classification methods applied to sky surveys have revolutionized
the astronomical target selection process. Most surveys generate a vast amount
of time series, or "lightcurves", that represent the brightness
variability of stellar objects in time. Unfortunately, lightcurves'
observations take several years to be completed, producing truncated time
series that generally remain without the application of automatic classifiers
until they are finished. This happens because state of the art methods rely on
a variety of statistical descriptors or features that present an increasing
degree of dispersion when the number of observations decreases, which reduces
their precision. In this paper we propose a novel method that increases the
performance of automatic classifiers of variable stars by incorporating the
deviations that scarcity of observations produces. Our method uses Gaussian
Process Regression to form a probabilistic model of each lightcurve's
observations. Then, based on this model, bootstrapped samples of the time
series features are generated. Finally a bagging approach is used to improve
the overall performance of the classification. We perform tests on the MACHO
and OGLE catalogs; results show that our method classifies effectively some
variability classes using a small fraction of the original observations. For
example, we found that RR Lyrae stars can be classified with around 80\% of
accuracy just by observing the first 5\% of the whole lightcurves' observations
in MACHO and OGLE catalogs. We believe these results prove that, when studying
lightcurves, it is important to consider the features' error and how the
measurement process impacts it.
[12]
oai:arXiv.org:1602.08977 [pdf] - 1388963
Clustering Based Feature Learning on Variable Stars
Submitted: 2016-02-29
The success of automatic classification of variable stars strongly depends on
the lightcurve representation. Usually, lightcurves are represented as a vector
of many statistical descriptors designed by astronomers called features. These
descriptors commonly demand significant computational power to calculate,
require substantial research effort to develop and do not guarantee good
performance on the final classification task. Today, lightcurve representation
is not entirely automatic; algorithms that extract lightcurve features are
designed by humans and must be manually tuned up for every survey. The vast
amounts of data that will be generated in future surveys like LSST mean
astronomers must develop analysis pipelines that are both scalable and
automated. Recently, substantial efforts have been made in the machine learning
community to develop methods that dispense with expert-designed and manually
tuned features in favor of features that are automatically learned from data. In this
work we present what is, to our knowledge, the first unsupervised feature
learning algorithm designed for variable stars. Our method first extracts a
large number of lightcurve subsequences from a given set of photometric data,
which are then clustered to find common local patterns in the time series.
Representatives of these patterns, called exemplars, are then used to transform
lightcurves of a labeled set into a new representation that can then be used to
train an automatic classifier. The proposed algorithm learns the features from
both labeled and unlabeled lightcurves, overcoming the bias generated when the
learning process is done only with labeled data. We test our method on MACHO
and OGLE datasets; the results show that the classification performance we
achieve is as good as, and in some cases better than, the performance achieved using
traditional features, while the computational cost is significantly lower.
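The pipeline in this abstract (extract subsequences, find exemplars, re-encode
light curves) can be sketched in simplified form. The clustering step is
omitted and the exemplars are assumed to be given. The encoding used here, the
minimum Euclidean distance from any subsequence to each exemplar, is an
assumption chosen for illustration and may differ from the paper's. The sketch
also ignores observation times, which real unevenly sampled data would need.

```python
import math


def subsequences(values, width, step=1):
    """Sliding-window subsequences of a fixed number of points."""
    return [values[i:i + width] for i in range(0, len(values) - width + 1, step)]


def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def encode(values, exemplars, width):
    """One feature per exemplar: distance to the closest subsequence."""
    windows = subsequences(values, width)
    if not windows:
        raise ValueError("light curve is shorter than the window width")
    return [min(distance(w, ex) for w in windows) for ex in exemplars]


# Hypothetical magnitudes and exemplars (the latter would come from clustering).
mags = [1.0, 2.0, 3.0, 2.0, 1.0]
exemplars = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]

features = encode(mags, exemplars, width=3)
print(features)
```

The resulting feature vector has one entry per exemplar, so light curves of
different lengths map to vectors of the same size.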
[13]
oai:arXiv.org:1601.03013 [pdf] - 1365579
Meta Classification for Variable Stars
Submitted: 2016-01-12
The need for the development of automatic tools to explore astronomical
databases has been recognized since the inception of CCDs and modern computers.
Astronomers already have developed solutions to tackle several science
problems, such as automatic classification of stellar objects, outlier
detection, and globular cluster identification, among others. New science
problems emerge and it is critical to be able to re-use the models learned
before, without rebuilding everything from the beginning when the science
problem changes. In this paper, we propose a new meta-model that automatically
integrates existing classification models of variable stars. The proposed
meta-model incorporates existing models that are trained in a different
context, answering different questions and using different representations of
data. Conventional mixture of experts algorithms in machine learning literature
cannot be used since each expert (model) uses different inputs. We also
consider computational complexity of the model by using the most expensive
models only when it is necessary. We test our model with EROS-2 and MACHO
datasets, and we show that we solve most of the classification challenges only
by training a meta-model to learn how to integrate the previous experts.
[14]
oai:arXiv.org:1506.00010 [pdf] - 1269336
FATS: Feature Analysis for Time Series
Submitted: 2015-05-29, last modified: 2015-08-31
In this paper, we present the FATS (Feature Analysis for Time Series)
library. FATS is a Python library which facilitates and standardizes feature
extraction for time series data. In particular, we focus on one application:
feature extraction for astronomical light curve data, although the library is
generalizable for other uses. We detail the methods and features implemented
for light curve analysis, and present examples for its usage.
[15]
oai:arXiv.org:1405.5298 [pdf] - 1311894
Photometric Classification of quasars from RCS-2 using Random Forest
Carrasco, D.;
Barrientos, L. F.;
Pichara, K.;
Anguita, T.;
Murphy, D. N. A.;
Gilbank, D. G.;
Gladders, M. D.;
Yee, H. K. C.;
Hsieh, B. C.;
López, S.
Submitted: 2014-05-21, last modified: 2015-08-24
Aims. Construction of a new quasar candidate catalog from the Red-Sequence
Cluster Survey 2 (RCS-2), identified solely from photometric information using
an automated algorithm suitable for large surveys. The algorithm performance is
tested using a well-defined SDSS spectroscopic sample of quasars and stars.
Methods. The Random Forest algorithm constructs the catalog from RCS-2 point
sources using SDSS spectroscopically-confirmed stars and quasars. The algorithm
identifies putative quasars from broadband magnitudes (g, r, i, z) and colours.
Exploiting NUV GALEX measurements for a subset of the objects, we refine the
classifier by adding new information. An additional subset of the data with
WISE W1 and W2 bands is also studied. Results. Upon analyzing 542,897 RCS-2
point sources, the algorithm identified 21,501 quasar candidates, with a
training-set-derived precision (the fraction of true positives within the group
assigned quasar status) of 89.5% and recall (the fraction of true positives
relative to all sources that actually are quasars) of 88.4%. These performance
metrics improve for the GALEX subset; 6,530 quasar candidates are identified
from 16,898 sources, with a precision and recall respectively of 97.0% and
97.5%. Algorithm performance is further improved when WISE data are included,
with precision and recall increasing to 99.3% and 99.1% respectively for 21,834
quasar candidates from 242,902 sources. We compile our final catalog (38,257)
by merging these samples and removing duplicates. An observational follow up of
17 bright (r < 19) candidates with long-slit spectroscopy at DuPont telescope
(LCO) yields 14 confirmed quasars. Conclusions. The results signal encouraging
progress in the classification of point sources with Random Forest algorithms
to search for quasars within current and future large-area photometric surveys.
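The precision and recall definitions quoted in this abstract translate into a
short computation. The labels below are made up for illustration and are not
data from the paper.

```python
def precision_recall(true_labels, predicted_labels, positive="quasar"):
    """Precision and recall for one class, from paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Precision: true positives among everything assigned quasar status.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: true positives among everything that actually is a quasar.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical spectroscopic labels and classifier output.
true_labels = ["quasar", "quasar", "star", "star", "quasar"]
predicted = ["quasar", "star", "quasar", "star", "quasar"]

precision, recall = precision_recall(true_labels, predicted)
print(precision, recall)
```

Here two of the three predicted quasars are real, and two of the three real
quasars are recovered, so both metrics equal 2/3.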
[16]
oai:arXiv.org:1404.4888 [pdf] - 1085209
Supervised detection of anomalous light-curves in massive astronomical
catalogs
Submitted: 2014-04-18, last modified: 2015-05-27
The development of synoptic sky surveys has led to a massive amount of data
for which resources needed for analysis are beyond human capabilities. To
process this information and to extract all possible knowledge, machine
learning techniques become necessary. Here we present a new method to
automatically discover unknown variable objects in large astronomical catalogs.
With the aim of taking full advantage of all the information we have about
known objects, our method is based on a supervised algorithm. In particular, we
train a random forest classifier using known variability classes of objects and
obtain votes for each of the objects in the training set. We then model this
voting distribution with a Bayesian network and obtain the joint voting
distribution among the training objects. Consequently, an unknown object is
considered as an outlier insofar as it has a low joint probability. Our method is
suitable for exploring massive datasets given that the training process is
performed offline. We tested our algorithm on 20 million light-curves from the
MACHO catalog and generated a list of anomalous candidates. We divided the
candidates into two main classes of outliers: artifacts and intrinsic outliers.
Artifacts were principally due to air mass variation, seasonal variation, bad
calibration or instrumental errors and were consequently removed from our
outlier list and added to the training set. After retraining, we selected about
4000 objects, which we passed to a post analysis stage by performing a
cross-match with all publicly available catalogs. Within these candidates we
identified certain known but rare objects such as eclipsing Cepheids, blue
variables, cataclysmic variables and X-ray sources. For some outliers there
was no additional information. Among them we identified three unknown
variability types and a few individual outliers that will be followed up for a
deeper analysis.
[17]
oai:arXiv.org:1405.4517 [pdf] - 862800
The VVV Templates Project. Towards an Automated Classification of VVV
Light-Curves. I. Building a database of stellar variability in the
near-infrared
Angeloni, R.;
Ramos, R. Contreras;
Catelan, M.;
Dékány, I.;
Gran, F.;
Alonso-García, J.;
Hempel, M.;
Navarrete, C.;
Andrews, H.;
Aparicio, A.;
Beamín, J. C.;
Berger, C.;
Borissova, J.;
Peña, C. Contreras;
Cunial, A.;
de Grijs, R.;
Espinoza, N.;
Eyheramendy, S.;
Lopes, C. E. Ferreira;
Fiaschi, M.;
Hajdu, G.;
Han, J.;
Hełminiak, K. G.;
Hempel, A.;
Hidalgo, S. L.;
Ita, Y.;
Jeon, Y. -B.;
Jordán, A.;
Kwon, J.;
Lee, J. T.;
Martín, E. L.;
Masetti, N.;
Matsunaga, N.;
Milone, A. P.;
Minniti, D.;
Morelli, L.;
Murgas, F.;
Nagayama, T.;
Navarro, C.;
Ochner, P.;
Pérez, P.;
Pichara, K.;
Rojas-Arriagada, A.;
Roquette, J.;
Saito, R. K.;
Siviero, A.;
Sohn, J.;
Sung, H. -I.;
Tamura, M.;
Tata, R.;
Tomasella, L.;
Townsend, B.;
Whitelock, P.
Submitted: 2014-05-18, last modified: 2014-06-03
Context. The Vista Variables in the Vía Láctea (VVV) ESO Public Survey is
a variability survey of the Milky Way bulge and an adjacent section of the disk
carried out from 2010 on ESO Visible and Infrared Survey Telescope for
Astronomy (VISTA). VVV will eventually deliver a deep near-IR atlas with
photometry and positions in five passbands (ZYJHK_S) and a catalogue of 1-10
million variable point sources - mostly unknown - which require
classifications. Aims. The main goal of the VVV Templates Project, that we
introduce in this work, is to develop and test the machine-learning algorithms
for the automated classification of the VVV light-curves. As VVV is the first
massive, multi-epoch survey of stellar variability in the near-infrared, the
template light-curves that are required for training the classification
algorithms are not available. In the first paper of the series we describe the
construction of this comprehensive database of infrared stellar variability.
Methods. First we performed a systematic search in the literature and public
data archives, second, we coordinated a worldwide observational campaign, and
third we exploited the VVV variability database itself on (optically)
well-known stars to gather high-quality infrared light-curves of several
hundreds of variable stars. Results. We have now collected a significant (and
still increasing) number of infrared template light-curves. This database will
be used as a training-set for the machine-learning algorithms that will
automatically classify the light-curves produced by VVV. The results of such an
automated classification will be covered in forthcoming papers of the series.
[18]
oai:arXiv.org:1310.1996 [pdf] - 741931
Stellar Variability in the VVV survey
Catelan, M.;
Minniti, D.;
Lucas, P. W.;
Dékány, I.;
Saito, R. K.;
Angeloni, R.;
Alonso-García, J.;
Hempel, M.;
Helminiak, K.;
Jordán, A.;
Ramos, R. Contreras;
Navarrete, C.;
Beamín, J. C.;
Rojas, A. F.;
Gran, F.;
Lopes, C. E. Ferreira;
Peña, C. Contreras;
Kerins, E.;
Huckvale, L.;
Rejkuba, M.;
Cohen, R.;
Mauro, F.;
Borissova, J.;
Amigo, P.;
Eyheramendy, S.;
Pichara, K.;
Espinoza, N.;
Navarro, C.;
Hajdu, G.;
Espinoza, D. N. Calderón;
Muro, G. A.;
Andrews, H.;
Motta, V.;
Kurtev, R.;
Emerson, J. P.;
Bidin, C. Moni;
Chené, A. -N.
Submitted: 2013-10-07, last modified: 2013-11-04
The Vista Variables in the Vía Láctea (VVV) ESO Public Survey is an
ongoing time-series, near-infrared (IR) survey of the Galactic bulge and an
adjacent portion of the inner disk, covering 562 square degrees of the sky,
using ESO's VISTA telescope. The survey has provided superb multi-color
photometry in 5 broadband filters ($Z$, $Y$, $J$, $H$, and $K_s$), leading to
the best map of the inner Milky Way ever obtained, particularly in the near-IR.
The main variability part of the survey, which is focused on $K_s$-band
observations, is currently underway, with bulge fields having been observed
between 31 and 70 times, and disk fields between 17 and 36 times. When the
survey is complete, bulge (disk) fields will have been observed up to a total
of 100 (60) times, providing unprecedented depth and time coverage. Here we
provide a first overview of stellar variability in the VVV data, including
examples of the light curves that have been collected thus far, scientific
applications, and our efforts towards the automated classification of VVV light
curves.
[19]
oai:arXiv.org:1310.7868 [pdf] - 739149
Automatic Classification of Variable Stars in Catalogs with missing data
Submitted: 2013-10-29
We present an automatic classification method for astronomical catalogs with
missing data. We use Bayesian networks, a probabilistic graphical model, that
allows us to perform inference to predict missing values given observed data
and dependency relationships between variables. To learn a Bayesian network
from incomplete data, we use an iterative algorithm that utilises sampling
methods and expectation maximization to estimate the distributions and
probabilistic dependencies of variables from data with missing values. To test
our model we use three catalogs with missing data (SAGE, 2MASS and UBVI) and
one complete catalog (MACHO). We examine how classification accuracy changes
when information from missing data catalogs is included, how our method
compares to traditional missing data approaches and at what computational cost.
Integrating these catalogs with missing data we find that classification of
variable objects improves by a few percent and by 15% for quasar detection while
keeping the computational cost the same.
[20]
oai:arXiv.org:1304.0401 [pdf] - 646062
An improved quasar detection method in EROS-2 and MACHO LMC datasets
Submitted: 2013-04-01
We present a new classification method for quasar identification in the
EROS-2 and MACHO datasets based on a boosted version of Random Forest
classifier. We use a set of variability features including parameters of a
continuous auto regressive model. We prove that continuous auto regressive
parameters are very important discriminators in the classification process. We
create two training sets (one for EROS-2 and one for MACHO datasets) using
known quasars found in the LMC. Our model's accuracy in both EROS-2 and MACHO
training sets is about 90% precision and 86% recall, improving on the accuracy of
state-of-the-art models in quasar detection. We apply the model to the complete
EROS-2 and MACHO LMC datasets, comprising 28 million objects, finding 1160 and
2551 candidates respectively. To further validate our list of candidates, we
crossmatched it with 663 previously known strong candidates, getting 74%
of matches for MACHO and 40% for EROS-2. The difference in matching level
arises mainly because EROS-2 is a slightly shallower survey, which translates to
significantly lower signal-to-noise ratio lightcurves.
[21]
oai:arXiv.org:1105.1119 [pdf] - 369640
The Vista Variables in the Via Lactea (VVV) ESO Public Survey: Current
Status and First Results
Catelan, M.;
Minniti, D.;
Lucas, P. W.;
Alonso-Garcia, J.;
Angeloni, R.;
Beamin, J. C.;
Bonatto, C.;
Borissova, J.;
Contreras, C.;
Cross, N.;
Dekany, I.;
Emerson, J. P.;
Eyheramendy, S.;
Geisler, D.;
Gonzalez-Solares, E.;
Helminiak, K. G.;
Hempel, M.;
Irwin, M. J.;
Ivanov, V. D.;
Jordan, A.;
Kerins, E.;
Kurtev, R.;
Mauro, F.;
Bidin, C. Moni;
Navarrete, C.;
Perez, P.;
Pichara, K.;
Read, M.;
Rejkuba, M.;
Saito, R. K.;
Sale, S. E.;
Toledo, I.
Submitted: 2011-05-05, last modified: 2011-06-07
Vista Variables in the Via Lactea (VVV) is an ESO Public Survey that is
performing a variability survey of the Galactic bulge and part of the inner
disk using ESO's Visible and Infrared Survey Telescope for Astronomy (VISTA).
The survey covers 520 deg^2 of sky area in the ZYJHK_S filters, for a total
observing time of 1929 hours, including ~ 10^9 point sources and an estimated ~
10^6 variable stars. Here we describe the current status of the VVV Survey, in
addition to a variety of new results based on VVV data, including light curves
for variable stars, newly discovered globular clusters, open clusters, and
associations. A set of reddening-free indices based on the ZYJHK_S system is
also introduced. Finally, we provide an overview of the VVV Templates Project,
whose main goal is to derive well-defined light curve templates in the near-IR,
for the automated classification of VVV light curves.