Full-text search for arXiv

Yi, K. S.

Normalized to: Yi, K.

6 article(s) in total. 35 co-authors, from 1 to 4 common article(s). Median position in authors list is 4.5.

[1] oai:arXiv.org:2007.03109 [pdf] - 2129514

Cycle-StarNet: Bridging the gap between theory and data by leveraging large datasets

O'Briain, Teaghan; Ting, Yuan-Sen; Fabbro, Sébastien; Yi, Kwang M.; Venn, Kim; Bialek, Spencer

Comments: 20 pages, 11 figures, 1 table, submitted to ApJ. A companion 4-page preview is accepted to the ICML 2020 Machine Learning Interpretability for Scientific Discovery workshop. The code used in this study is made publicly available on GitHub: https://github.com/teaghan/Cycle_SN

Submitted: 2020-07-06

Spectroscopy provides an immense amount of information on stellar objects, and this field continues to grow with recent developments in multi-object data acquisition and rapid data analysis techniques. Current automated methods for analyzing spectra are either (a) data-driven models, which require large amounts of data with prior knowledge of stellar parameters and elemental abundances, or (b) based on theoretical synthetic models that are susceptible to the gap between theory and practice. In this study, we present a hybrid generative domain adaptation method to turn simulated stellar spectra into realistic spectra, learning from the large spectroscopic surveys. We use a neural network to emulate computationally expensive stellar spectra simulations, and then train a separate unsupervised domain-adaptation network that learns to relate the generated synthetic spectra to observational spectra. Consequently, the network essentially produces data-driven models without the need for a labeled training set. As a proof of concept, two case studies are presented. The first is the auto-calibration of synthetic models without using any standard stars. To accomplish this, synthetic models are morphed into spectra that resemble observations, thereby reducing the gap between theory and observations. The second case study is the identification of the elemental source of missing spectral lines in the synthetic modelling. These sources are predicted by interpreting the differences between the domain-adapted and original spectral models. To test our ability to identify missing lines, we use a mock dataset and show that, even with noisy observations, absorption lines can be recovered when they are absent in one of the domains. While we focus on spectral analyses in this study, this method can be applied to other fields that use large data sets and are currently limited by modelling accuracy.

[2] oai:arXiv.org:2007.03112 [pdf] - 2129515

Interpreting Stellar Spectra with Unsupervised Domain Adaptation

O'Briain, Teaghan; Ting, Yuan-Sen; Fabbro, Sébastien; Yi, Kwang M.; Venn, Kim; Bialek, Spencer

Comments: 4 pages, 4 figures, accepted to the ICML 2020 Machine Learning Interpretability for Scientific Discovery workshop. A full 20-page version is submitted to ApJ. The code used in this study is made publicly available on GitHub: https://github.com/teaghan/Cycle_SN

Submitted: 2020-07-06

We discuss how to achieve mapping from large sets of imperfect simulations and observational data with unsupervised domain adaptation. Under the hypothesis that simulated and observed data distributions share a common underlying representation, we show how it is possible to transfer between simulated and observed domains. Driven by an application to interpret stellar spectroscopic sky surveys, we construct the domain transfer pipeline from two adversarial autoencoders, one on each domain, with a disentangling latent space and a cycle-consistency constraint. We then construct a differentiable pipeline from physical stellar parameters to realistic observed spectra, aided by a supplementary generative surrogate physics emulator network. We further demonstrate the potential of the method through the quality of the reconstructed spectra and the discovery of new spectral features associated with elemental abundances.
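The cycle-consistency constraint mentioned in this abstract can be illustrated with a toy sketch. This is not the authors' pipeline (their code is at the GitHub URL listed above); the two mapping functions, the gain and offset values, and all names below are hypothetical stand-ins for learned networks, and the L1 form of the penalty is an assumption.

```python
import numpy as np

def cycle_consistency_loss(x, forward, backward):
    """Mean absolute error between x and its round trip backward(forward(x))."""
    return float(np.mean(np.abs(backward(forward(x)) - x)))

# Hypothetical toy mappings standing in for learned networks:
# "synthetic -> observed" applies a gain and an offset,
# "observed -> synthetic" undoes them exactly.
def synth_to_obs(s):
    return 0.9 * s + 0.05

def obs_to_synth(o):
    return (o - 0.05) / 0.9

spectrum = np.linspace(0.5, 1.0, 100)  # made-up normalized flux values

# A perfect inverse gives a loss that is numerically zero.
good = cycle_consistency_loss(spectrum, synth_to_obs, obs_to_synth)

# A poor inverse (identity) leaves a residual that training would penalize.
bad = cycle_consistency_loss(spectrum, synth_to_obs, lambda o: o)
```

In a trained model the two mappings would be neural networks, and a penalty of this kind would be one term among several in the objective.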

[3] oai:arXiv.org:2001.11651 [pdf] - 2040800

CosmoVAE: Variational Autoencoder for CMB Image Inpainting

Yi, Kai; Guo, Yi; Fan, Yanan; Hamann, Jan; Wang, Yu Guang

Comments: 7 pages, 6 figures

Submitted: 2020-01-30

Cosmic microwave background radiation (CMB) is critical to the understanding of the early universe and precise estimation of cosmological constants. Due to the contamination of thermal dust noise in the galaxy, the CMB map, which is an image on the two-dimensional sphere, has missing observations, mainly concentrated on the equatorial region. The noise of the CMB map has a significant impact on the estimation precision for cosmological parameters. Inpainting the CMB map can effectively reduce the uncertainty of parametric estimation. In this paper, we propose a deep learning-based variational autoencoder, CosmoVAE, to restore the missing observations of the CMB map. The input and output of CosmoVAE are square images. To generate training, validation, and test data sets, we segment the full-sky CMB map into many small images by Cartesian projection. CosmoVAE assigns physical quantities to the parameters of the VAE network by using the angular power spectrum of the Gaussian random field as latent variables. CosmoVAE adopts a new loss function to improve the learning performance of the model, which consists of $\ell_1$ reconstruction loss, Kullback-Leibler divergence between the posterior distribution of encoder network and the prior distribution of latent variables, perceptual loss, and total-variation regularizer. The proposed model achieves state-of-the-art performance for Planck \texttt{Commander} 2018 CMB map inpainting.
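The composite loss listed in this abstract can be sketched as a weighted sum of its terms. The following is a minimal illustration, not the CosmoVAE implementation: the function names and the weights `w_kl` and `w_tv` are hypothetical, the prior is assumed to be a standard normal (the abstract ties the latent variables to the angular power spectrum instead), and the perceptual term is omitted because it needs a pretrained feature network.

```python
import numpy as np

def l1_reconstruction(x, x_hat):
    """Mean absolute difference between the target and the reconstruction."""
    return float(np.mean(np.abs(x - x_hat)))

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    The standard-normal prior is an assumption made for this sketch.
    """
    return float(0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))

def total_variation(img):
    """Anisotropic total variation: sum of absolute neighbor differences."""
    vertical = np.abs(np.diff(img, axis=0)).sum()
    horizontal = np.abs(np.diff(img, axis=1)).sum()
    return float(vertical + horizontal)

def composite_loss(x, x_hat, mu, log_var, w_kl=1.0, w_tv=1e-3):
    """Weighted sum of L1, KL, and total-variation terms (weights are made up)."""
    return (l1_reconstruction(x, x_hat)
            + w_kl * kl_to_standard_normal(mu, log_var)
            + w_tv * total_variation(x_hat))

# Toy usage with a 4x4 "patch" and a 2-dimensional latent code.
patch = np.zeros((4, 4))
reconstruction = np.ones((4, 4))
loss = composite_loss(patch, reconstruction, mu=np.zeros(2), log_var=np.zeros(2))
```

In this toy case the KL and total-variation terms vanish (the posterior equals the assumed prior and the reconstruction is constant), so the loss reduces to the L1 term alone.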

[4] oai:arXiv.org:1911.02602 [pdf] - 1995525

Deep learning analyses of synthetic spectral libraries with an application to the Gaia-ESO database

Bialek, Spencer; Fabbro, Sébastien; Venn, Kim A.; Kumar, Nripesh; O'Briain, Teaghan; Yi, Kwang Moo

Comments: 16 pages, 15 figures, submitted to MNRAS

Submitted: 2019-11-06

In the era of stellar spectroscopic surveys, synthetic spectral libraries will form the basis for the derivation of the stellar parameters and chemical abundances. In this paper, four popular synthetic grids (INTRIGOSS, FERRE, AMBRE, and PHOENIX) are used in our deep learning prediction framework (StarNet), and compared in an application to optical spectra from the Gaia-ESO survey. The stellar parameters for temperature, surface gravity, metallicity, radial velocity, rotational velocity, and [$\alpha$/Fe] are determined simultaneously for FGK type dwarfs and giants. StarNet was modified to mitigate the differences in the sampling between the synthetic grids and the observed spectra, by augmenting the grids with realistic observational signatures, in an attempt to incorporate both modelling and statistical uncertainties as part of the training. When applied to spectra from the Gaia-ESO spectroscopic survey and the Gaia-ESO benchmark stars, the INTRIGOSS-trained StarNet showed the best results with the least scatter. Training with the FERRE synthetic grid produces similarly accurate predictions (followed closely by the AMBRE grid), but over a wider range in stellar parameters and spectroscopic wavelengths. In the future, improvements in the underlying physics that generates these synthetic grids will be necessary for consistent high precision stellar parameters and chemical abundances from machine learning and other sophisticated data analysis tools.

[5] oai:arXiv.org:1910.00774 [pdf] - 2097211

LRP2020: Machine Learning Advantages in Canadian Astrophysics

Venn, K. A.; Fabbro, S.; Liu, A.; Hezaveh, Y.; Perreault-Levasseur, L.; Eadie, G.; Ellison, S.; Woo, J.; Kavelaars, JJ.; Yi, K. M.; Hlozek, R.; Bovy, J.; Teimoorinia, H.; Ravanbakhsh, S.; Spencer, L.

Comments: White paper E015 submitted to the Canadian Long Range Plan LRP2020

Submitted: 2019-10-02, last modified: 2019-10-15

The application of machine learning (ML) methods to the analysis of astrophysical datasets is on the rise, particularly as the computing power and complex algorithms become more powerful and accessible. As the field of ML enjoys a continuous stream of breakthroughs, its applications demonstrate the great potential of ML, ranging from achieving tens of millions of times increase in analysis speed (e.g., modeling of gravitational lenses or analysing spectroscopic surveys) to solutions of previously unsolved problems (e.g., foreground subtraction or efficient telescope operations). The number of astronomical publications that include ML has been steadily increasing since 2010. With the advent of extremely large datasets from a new generation of surveys in the 2020s, ML methods will become an indispensable tool in astrophysics. Canada is an unambiguous world leader in the development of the field of machine learning, attracting large investments and skilled researchers to its prestigious AI Research Institutions. This provides a unique opportunity for Canada to also be a world leader in the application of machine learning in the field of astrophysics, and foster the training of a new generation of highly skilled researchers.

[6] oai:arXiv.org:astro-ph/0507169 [pdf] - 74338

UV properties of early-type galaxies in the Virgo cluster

Boselli, A.; Cortese, L.; Deharveng, J. M.; Gavazzi, G.; Yi, K. S.; Gil de Paz, A.; Seibert, M.; Boissier, S.; Donas, J.; Lee, Y. -W.; Madore, B. F.; Martin, D. C.; Rich, R. M.; Sohn, Y. -J.

Comments: 5 pages, 2 figures, 1 table. Accepted for publication in Astrophysical Journal Letters

Submitted: 2005-07-07, last modified: 2005-07-25

We study the UV properties of a volume limited sample of early-type galaxies in the Virgo cluster combining new GALEX far- (1530 Å) and near-ultraviolet (2310 Å) data with spectro-photometric data available at other wavelengths. The sample includes 264 ellipticals, lenticulars and dwarfs spanning a large range in luminosity (M(B)<-15). While the NUV to optical or near-IR color magnitude relations (CMR) are similar to those observed at optical wavelengths, with a monotonic reddening of the color index with increasing luminosity, the (FUV-V) and (FUV-H) CMRs show a discontinuity between massive and dwarf objects. An even more pronounced dichotomy is observed in the (FUV-NUV) CMR. For ellipticals the (FUV-NUV) color becomes bluer with increasing luminosity and with increasing reddening of the optical or near-IR color indices. For the dwarfs the opposite trend is observed. This observational evidence is consistent with the idea that the UV emission is dominated by hot, evolved stars in giant systems, while in dwarf ellipticals residual star formation activity is more common.