Normalized to: Kittara, P.
[1]
oai:arXiv.org:1909.00718 [pdf] - 2080872
Optimizing exoplanet atmosphere retrieval using unsupervised
machine-learning classification
Submitted: 2019-09-02, last modified: 2020-04-01
One of the principal bottlenecks to atmosphere characterisation in the era of
all-sky surveys is the availability of fast, autonomous and robust atmospheric
retrieval methods. We present a new approach using unsupervised machine
learning to generate informed priors for retrieval of exoplanetary atmosphere
parameters from transmission spectra. We use principal component analysis (PCA)
to efficiently compress the information content of a library of transmission
spectra forward models generated using the PLATON package. We then apply a
$k$-means clustering algorithm in PCA space to segregate the library into
discrete classes. We show that our classifier is almost always able to
instantaneously place a previously unseen spectrum into the correct class, for
low-to-moderate spectral resolutions, $R$, in the range $R = 30$--$300$ and noise
levels up to 10 per cent of the peak-to-trough spectrum amplitude. The
distribution of physical parameters for all members of the class therefore
provides an informed prior for standard retrieval methods such as nested
sampling. We benchmark our informed-prior approach against a standard
uniform-prior nested sampler, finding that our approach is up to a factor of two
faster, with negligible reduction in accuracy. We demonstrate the application
of this method to existing and near-future observatories, and show that it is
suitable for real-world application. Our general approach is not specific to
transmission spectroscopy and should be more widely applicable to cases that
involve repetitive fitting of trusted high-dimensional models to large data
catalogues, including beyond exoplanetary science.
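
The pipeline described in the abstract (PCA compression of a model library, $k$-means clustering in PCA space, classification of an unseen spectrum, and an informed prior built from the matched class) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy forward model `toy_spectrum` stands in for PLATON, the two parameters (baseline depth and feature amplitude), the wavelength grid, the number of components and the number of classes are hypothetical, and the prior is reduced to simple min/max bounds rather than the full parameter distribution of the class. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a forward-model library (hypothetical, NOT PLATON output).
wavelengths = np.linspace(1.0, 2.0, 60)                     # microns, assumed grid
feature = np.exp(-0.5 * ((wavelengths - 1.4) / 0.1) ** 2)   # one absorption feature

def toy_spectrum(params):
    """Transit depth for rows of (baseline, amplitude); purely illustrative."""
    params = np.atleast_2d(params)
    return params[:, :1] + params[:, 1:2] * feature

n_models, n_components, n_classes = 500, 2, 4
low, high = np.array([0.010, 0.0001]), np.array([0.012, 0.0010])
library_params = rng.uniform(low, high, size=(n_models, 2))
library = toy_spectrum(library_params)

def fit_pca(spectra, n_components):
    """PCA via SVD of the mean-centred library; returns (mean, components)."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(spectra, mean, components):
    """Project one or more spectra onto the retained principal components."""
    return (np.atleast_2d(spectra) - mean) @ components.T

def nearest(points, centroids):
    """Index of the closest centroid (Euclidean distance) for each point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm, initialised from k distinct library points."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment, so the labels always match the returned centroids.
    return centroids, nearest(points, centroids)

mean, components = fit_pca(library, n_components)
scores = project(library, mean, components)
centroids, labels = kmeans(scores, n_classes)

def classify(spectra):
    """Place spectra into the class with the nearest centroid in PCA space."""
    return nearest(project(spectra, mean, components), centroids)

def informed_prior(class_id):
    """Min/max parameter bounds over all library members of one class."""
    members = library_params[labels == class_id]
    return members.min(axis=0), members.max(axis=0)

# An unseen spectrum with noise at 10 per cent of its peak-to-trough amplitude.
true_params = np.array([0.011, 0.0006])
clean = toy_spectrum(true_params)[0]
noise_sigma = 0.10 * (clean.max() - clean.min())
observed = clean + rng.normal(0.0, noise_sigma, size=clean.shape)

class_id = int(classify(observed)[0])
prior_low, prior_high = informed_prior(class_id)
print("class:", class_id)
print("informed prior bounds:", prior_low, prior_high)
print("uniform prior bounds: ", low, high)
```

The bounds returned by `informed_prior` are narrower than or equal to the full uniform range, which is the sense in which a class-based prior could shrink the volume a nested sampler has to explore. A real application would pass the class's parameter distribution to the sampler and would need to verify that noisy spectra near class boundaries are not assigned a prior that excludes the true parameters.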