Normalized to: Suárez-Pérez, J.
[1]
oai:arXiv.org:2006.13163 [pdf] - 2124947
MANTRA: A Machine Learning reference lightcurve dataset for astronomical
transient event recognition
Submitted: 2020-06-23, last modified: 2020-06-30
We introduce MANTRA, an annotated dataset of 4869 transient and 71207
non-transient object lightcurves built from the Catalina Real Time Transient
Survey. We provide public access to this dataset as a plain text file to
facilitate standardized quantitative comparison of astronomical transient event
recognition algorithms. Some of the classes included in the dataset are:
supernovae, cataclysmic variables, active galactic nuclei, high-proper-motion
stars, blazars and flares. As an example of the tasks that can be performed on
the dataset, we experiment with multiple data pre-processing methods, feature
selection techniques and popular machine learning algorithms (Support Vector
Machines, Random Forests and Neural Networks). We assess quantitative
performance in two classification tasks: binary (transient/non-transient) and
eight-class classification. The best performing algorithm in both tasks is the
Random Forest Classifier. It achieves an F1-score of 96.25% in the binary
classification and 52.79% in the eight-class classification. For the
eight-class classification, non-transients (96.83%) is the class with the
highest F1-score, while the lowest corresponds to high-proper-motion stars
(16.79%); for supernovae it achieves a value of 54.57%, close to the average
across classes. The next release of MANTRA includes images and benchmarks with
deep learning models.
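
As an illustration of the metric quoted above, the following minimal sketch computes per-class F1-scores and their unweighted (macro) average for a multi-class labelling. It assumes that the "average across classes" is a macro average, which the abstract does not state explicitly. The function names and the toy labels are hypothetical and are not taken from the MANTRA release.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class label to its F1-score.

    F1 = 2*TP / (2*TP + FP + FN), computed one-vs-rest for every label
    that appears in either the true or the predicted labels.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy labels for illustration only; these are not MANTRA data.
y_true = ["SN", "SN", "CV", "non-transient", "non-transient", "non-transient"]
y_pred = ["SN", "CV", "CV", "non-transient", "non-transient", "SN"]

for label, score in per_class_f1(y_true, y_pred).items():
    print(f"{label}: {score:.4f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.4f}")
```

Because a macro average weights every class equally, a rare class with a low score (such as high-proper-motion stars here) pulls the average down as much as a populous class would, which is one way a high non-transient F1-score and a much lower eight-class average can coexist.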