Normalized to: Shakurova, K.
[1]
oai:arXiv.org:1612.07549 [pdf] - 1533780
Identification of Artifacts and Interesting Celestial Objects in LAMOST
Spectral Survey
Submitted: 2016-12-22
The LAMOST DR1 survey contains about two million spectra labelled by its
pipeline as stellar objects of common spectral classes. Many of these spectra
are, however, corrupted in some way by instrumental or processing
artifacts, which may mimic spectral properties of interesting celestial
objects, namely emission lines of Be stars and quasars. We have tested several
clustering methods as well as outlier analysis on a sample of one hundred
thousand spectra using Spark scripts running on a Hadoop cluster consisting of
twenty-four sixteen-core nodes. This experiment was motivated by an attempt to
find rare objects with interesting spectra as outliers most dissimilar from all
common spectra. The result of this time-consuming procedure is a list of
several hundred candidates where different artifacts are prominent, but also
tens of very interesting emission-line spectra requiring further detailed
examination. Many of them may be quasars or even blazars as well as yet unknown
Be stars. It is worth mentioning that most of the work benefitted considerably
from Virtual Observatory technologies.
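
The outlier idea described in this abstract (rank spectra by how dissimilar they are from the common ones) can be illustrated with a minimal single-machine sketch. It is not the authors' Spark pipeline: the pixel-wise median template, the Euclidean distance, the function names (`normalize`, `outlier_scores`, `top_outliers`) and the synthetic data are all assumptions made for illustration.

```python
import math
import random
from statistics import median


def normalize(flux):
    """Scale a spectrum to unit median so overall brightness does not dominate."""
    m = median(flux)
    return [f / m for f in flux]


def outlier_scores(spectra):
    """Euclidean distance of each normalized spectrum from the pixel-wise
    median spectrum of the sample (an assumed, simple dissimilarity score)."""
    normed = [normalize(s) for s in spectra]
    template = [median(column) for column in zip(*normed)]
    return [math.dist(s, template) for s in normed]


def top_outliers(spectra, k):
    """Indices of the k spectra most dissimilar from the median template."""
    scores = outlier_scores(spectra)
    order = sorted(range(len(spectra)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Synthetic demo: 20 flat, noisy spectra; spectrum 7 gets a fake emission line.
rng = random.Random(0)
n_pix = 50
spectra = [[1.0 + rng.gauss(0, 0.01) for _ in range(n_pix)] for _ in range(20)]
for pixel, bump in zip(range(24, 27), (0.5, 1.0, 0.5)):
    spectra[7][pixel] += bump

candidates = top_outliers(spectra, 3)
```

In this toy setting the spectrum with the injected line ranks first. A score of this kind flags corrupted spectra just as readily as genuine emission-line objects, which is consistent with the abstract's remark that the candidate list mixes artifacts with interesting spectra.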
[2]
oai:arXiv.org:1612.07536 [pdf] - 1580989
Identification of Interesting Objects in Large Spectral Surveys Using
Highly Parallelized Machine Learning
Submitted: 2016-12-22
The current archives of the LAMOST multi-object spectrograph contain millions of
fully reduced spectra, from which the automatic pipelines have produced
catalogues of many parameters of individual objects, including their
approximate spectral classification. This is, however, mostly based on the
global shape of the whole spectrum and on integral properties of spectra in
given bandpasses, namely the presence and equivalent width of prominent spectral
lines, while for the identification of some interesting object types (e.g. Be stars
or quasars) the detailed shape of only a few lines is crucial. Here machine
learning brings a new methodology capable of improving the reliability of
classification of such objects even in boundary cases.
We present results of Spark-based semi-supervised machine learning of LAMOST
spectra attempting to automatically identify the single- and double-peak
emission of the H alpha line typical for Be and B[e] stars. The labelled sample was
obtained from the archive of the 2m Perek telescope at Ondřejov Observatory. A
simple physical model of spectrograph resolution was used in domain adaptation
to the LAMOST training domain. The resulting list of candidates contains dozens of
Be stars (some are likely yet unknown), but also a number of interesting objects
resembling spectra of quasars and even blazars, as well as many instrumental
artefacts. The verification of the nature of the interesting candidates benefited
considerably from cross-matching and visualisation in the Virtual Observatory
environment.
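
One common way to model a difference in spectrograph resolution, shown here only as a hedged sketch and not as the authors' actual model, is to convolve the higher-resolution spectrum with a Gaussian whose width is the quadrature difference of the two instrumental profiles. The resolving powers (13000 and 1800), the wavelength grid, the function names and the synthetic H alpha line below are illustrative assumptions, not values taken from the abstract.

```python
import math


def gaussian_kernel(sigma_pix):
    """Normalized Gaussian kernel truncated at about 4 sigma on each side."""
    half = max(1, math.ceil(4 * sigma_pix))
    weights = [math.exp(-0.5 * (i / sigma_pix) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def degrade_resolution(flux, wave_step, wave_center, r_source, r_target):
    """Smooth a uniformly sampled spectrum from resolving power r_source
    down to r_target, using FWHM = wavelength / R near wave_center."""
    fwhm_source = wave_center / r_source
    fwhm_target = wave_center / r_target
    if fwhm_target <= fwhm_source:
        raise ValueError("target resolution must be lower than source resolution")
    # Gaussian widths add in quadrature, so the kernel supplies the difference.
    fwhm_kernel = math.sqrt(fwhm_target ** 2 - fwhm_source ** 2)
    sigma_pix = fwhm_kernel / (2 * math.sqrt(2 * math.log(2))) / wave_step
    kernel = gaussian_kernel(sigma_pix)
    half = len(kernel) // 2
    n = len(flux)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = min(max(i + j - half, 0), n - 1)  # repeat edge pixels
            acc += w * flux[k]
        smoothed.append(acc)
    return smoothed


# Synthetic demo: flat continuum with a narrow emission line near H alpha.
wave_step = 0.25  # Angstrom per pixel (assumed)
wave = [6500.0 + wave_step * i for i in range(521)]
flux = [1.0 + 2.0 * math.exp(-0.5 * ((w - 6563.0) / 0.5) ** 2) for w in wave]

smoothed = degrade_resolution(flux, wave_step, 6563.0, r_source=13000, r_target=1800)
```

The smoothed line is broader and has a lower peak, while the integrated flux is preserved because the kernel is normalized. Spectra degraded in this way could then serve as labelled training examples in the lower-resolution domain.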