Normalized to: Vergara, J.
[1]
oai:arXiv.org:2002.10464 [pdf] - 2131872
Dimensionality Reduction of SDSS Spectra with Variational Autoencoders
Submitted: 2020-02-24, last modified: 2020-07-09
High-resolution galaxy spectra contain much information about galactic
physics, but the high dimensionality of these spectra makes it difficult to
fully utilize the information they contain. We apply variational autoencoders
(VAEs), a non-linear dimensionality reduction technique, to a sample of spectra
from the Sloan Digital Sky Survey. In contrast to Principal Component Analysis
(PCA), a widely used technique, VAEs can capture non-linear relationships
between latent parameters and the data. We find that a VAE can reconstruct the
SDSS spectra well with only six latent parameters, outperforming PCA with the
same number of components. Different galaxy classes are naturally separated in
this latent space, without class labels having been given to the VAE. The VAE
latent space is interpretable because the VAE can be used to make synthetic
spectra at any point in latent space. For example, making synthetic spectra
along tracks in latent space yields sequences of realistic spectra that
interpolate between two different types of galaxies. Using the latent space to
find outliers may yield interesting spectra: in our small sample, we
immediately find unusual data artifacts and stars misclassified as galaxies. In
this exploratory work, we show that VAEs create compact, interpretable latent
spaces that capture non-linear features of the data. While a VAE takes
substantial time to train (~1 day for 48000 spectra), once trained, VAEs can
enable the fast exploration of large astronomical data sets.
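
As a rough illustration of the six-component PCA baseline, the latent-space tracks, and the outlier search described in the abstract, the sketch below uses plain NumPy on synthetic data. It is not the authors' code and does not train a VAE: PCA stands in as a linear encoder/decoder, the random array stands in for SDSS spectra, and the function names (fit_pca, encode, decode, latent_track, outlier_scores) are hypothetical. The outlier score is an assumed heuristic (standardized distance from the latent mean), not a method taken from the paper.

```python
import numpy as np

def fit_pca(spectra, n_components=6):
    """Return the mean spectrum and the top principal axes (one per row)."""
    mean = spectra.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(spectra, mean, axes):
    """Project spectra onto the principal axes (linear 'latent' coordinates)."""
    return (spectra - mean) @ axes.T

def decode(latent, mean, axes):
    """Map latent coordinates back to spectrum space."""
    return latent @ axes + mean

def latent_track(z_start, z_end, n_steps=5):
    """Evenly spaced points on the straight line from z_start to z_end."""
    t = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - t) * z_start + t * z_end

def outlier_scores(latent):
    """Assumed heuristic: distance from the latent mean in units of per-axis std."""
    z = (latent - latent.mean(axis=0)) / latent.std(axis=0)
    return np.linalg.norm(z, axis=1)

# Synthetic stand-in data (NOT SDSS spectra): 200 "spectra" of 1000 pixels,
# built from 10 hidden factors plus a small amount of noise.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 1000))
spectra += 0.01 * rng.normal(size=spectra.shape)

# Six-component baseline: encode, reconstruct, and measure the error.
mean, axes = fit_pca(spectra, n_components=6)
latent = encode(spectra, mean, axes)
recon = decode(latent, mean, axes)
mse = np.mean((spectra - recon) ** 2)

# Synthetic spectra along a track between two objects in latent space.
track = latent_track(latent[0], latent[1], n_steps=5)
synthetic = decode(track, mean, axes)

# Rank objects by how far they sit from the bulk of the latent distribution.
scores = outlier_scores(latent)
most_unusual = np.argsort(scores)[::-1][:10]
```

With a trained VAE, encode and decode would be replaced by the network's encoder and decoder; the track and outlier steps operate only on latent coordinates and would carry over unchanged under that assumption.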