Normalized to: Lucie-Smith, L.
[1]
oai:arXiv.org:1911.02479 [pdf] - 1994455
Algorithms and Statistical Models for Scientific Discovery in the Petabyte Era
Nord, Brian;
Connolly, Andrew J.;
Kinney, Jamie;
Kubica, Jeremy;
Narayan, Gautaum;
Peek, Joshua E. G.;
Schafer, Chad;
Tollerud, Erik J.;
Avestruz, Camille;
Babu, G. Jogesh;
Birrer, Simon;
Burke, Douglas;
Caldeira, João;
Caldwell, Douglas A.;
Carlberg, Joleen K.;
Chen, Yen-Chi;
Dong, Chuanfei;
Feigelson, Eric D.;
Golkhou, V. Zach;
Kashyap, Vinay;
Li, T. S.;
Loredo, Thomas;
Lucie-Smith, Luisa;
Mandel, Kaisey S.;
Martínez-Galarza, J. R.;
Miller, Adam A.;
Natarajan, Priyamvada;
Ntampaka, Michelle;
Ptak, Andy;
Rapetti, David;
Shamir, Lior;
Siemiginowska, Aneta;
Sipőcz, Brigitta M.;
Smith, Arfon M.;
Tran, Nhan;
Vilalta, Ricardo;
Walkowicz, Lucianne M.;
ZuHone, John
Submitted: 2019-11-04
The field of astronomy has arrived at a turning point in terms of size and
complexity of both datasets and scientific collaboration. Commensurately,
algorithms and statistical models have begun to adapt (e.g., via the onset
of artificial intelligence), which itself presents new challenges and
opportunities for growth. This white paper aims to offer guidance and ideas for
how we can evolve our technical and collaborative frameworks to promote
efficient algorithmic development and take advantage of opportunities for
scientific discovery in the petabyte era. We discuss challenges for discovery
in large and complex datasets; challenges and requirements for the next stage
of development of statistical methodologies and algorithmic tool sets; how we
might change our paradigms of collaboration and education; and the ethical
implications of scientists' contributions to widely applicable algorithms and
computational modeling. We start with six distinct recommendations that are
supported by the commentary following them. This white paper is related to a
larger corpus of effort that has taken place within and around the Petabytes to
Science Workshops (https://petabytestoscience.github.io/).
[2]
oai:arXiv.org:1906.06339 [pdf] - 1964161
An interpretable machine learning framework for dark matter halo formation
Submitted: 2019-06-14, last modified: 2019-09-19
We present a generalization of our recently proposed machine learning
framework, aiming to provide new physical insights into dark matter halo
formation. We investigate the impact of the initial density and tidal shear
fields on the formation of haloes over the mass range $11.4 \leq
\log(M/M_{\odot}) \leq 13.4$. The algorithm is trained on an N-body simulation
to infer the final mass of the halo to which each dark matter particle will
later belong. We then quantify the difference in the predictive accuracy
between machine learning models using a metric based on the Kullback-Leibler
divergence. We first train the algorithm with information about the density
contrast in the particles' local environment. The addition of tidal shear
information does not yield an improved halo collapse model over one based on
density information alone; the difference in their predictive performance is
consistent with the statistical uncertainty of the density-only model.
This implies that our machine learning setup does not identify any significant
role for the tidal shear in determining halo masses. This result is confirmed
as we verify the ability of the initial conditions-to-halo mass mapping learnt
from one simulation to generalize to independent simulations. Our work
illustrates the broader potential of developing interpretable machine learning
frameworks to gain physical understanding of non-linear large-scale structure
formation.
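The abstract above compares models with a metric based on the
Kullback-Leibler divergence but does not spell out its exact form. As a rough
illustration only, the sketch below computes the standard discrete
Kullback-Leibler divergence between two binned distributions. The function
names, the binning, and the example counts are assumptions made for this
sketch and are not taken from the paper.

```python
import math


def normalise(counts):
    """Turn raw bin counts into a probability distribution that sums to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q), in nats.

    p and q are normalised histograms over the same bins. Bins where p is
    zero contribute nothing; a bin where p > 0 but q == 0 gives infinity.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the limit of x * log(x) as x -> 0 is 0
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


# Hypothetical example: binned true halo masses vs. binned model predictions.
true_hist = normalise([10, 30, 40, 20])
pred_hist = normalise([15, 25, 35, 25])
divergence = kl_divergence(true_hist, pred_hist)
```

A smaller divergence means the predicted distribution is closer to the
reference one. The divergence is not symmetric, so the order of the two
arguments matters.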
[3]
oai:arXiv.org:1802.04271 [pdf] - 1707683
Machine learning cosmological structure formation
Submitted: 2018-02-12, last modified: 2018-06-29
We train a machine learning algorithm to learn cosmological structure
formation from N-body simulations. The algorithm infers the relationship
between the initial conditions and the final dark matter haloes, without the
need to introduce approximate halo collapse models. We gain insights into the
physics driving halo formation by evaluating the predictive performance of the
algorithm when provided with different types of information about the local
environment around dark matter particles. The algorithm learns to predict
whether or not dark matter particles will end up in haloes of a given mass
range, based on spherical overdensities. We show that the resulting predictions
match those of spherical collapse approximations such as extended
Press-Schechter theory. Additional information on the shape of the local
gravitational potential does not improve halo collapse predictions; the
linear density field contains sufficient information for the algorithm to also
reproduce ellipsoidal collapse predictions based on the Sheth-Tormen model. We
investigate the algorithm's performance in terms of halo mass and radial
position and perform blind analyses on independent initial conditions
realisations to demonstrate the generality of our results.
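For orientation, the spherical collapse picture that the abstract above
compares against can be sketched as follows. This is a simplified
illustration of the excursion-set idea behind extended Press-Schechter theory,
not the pipeline used in the paper. It assumes that each particle comes with
linear overdensities smoothed on several mass scales and already extrapolated
to the epoch of interest. The function name and the example numbers are
hypothetical.

```python
# Linear-theory spherical collapse threshold (Einstein-de Sitter value).
DELTA_C = 1.686


def predicted_halo_mass(smoothing_masses, overdensities, threshold=DELTA_C):
    """Return the largest smoothing mass whose overdensity reaches the threshold.

    smoothing_masses and overdensities are parallel sequences describing one
    particle. Returns None if the threshold is not reached on any scale,
    i.e. the particle is not assigned to a halo within the sampled range.
    """
    if len(smoothing_masses) != len(overdensities):
        raise ValueError("inputs must have the same length")
    # Walk from the largest smoothing scale down to the smallest.
    for mass, delta in sorted(zip(smoothing_masses, overdensities), reverse=True):
        if delta >= threshold:
            return mass
    return None


# Hypothetical particle: overdensity grows as the smoothing mass shrinks.
masses = [1e11, 1e12, 1e13, 1e14]   # solar masses
deltas = [2.5, 1.9, 1.2, 0.4]       # smoothed linear overdensities
mass_estimate = predicted_halo_mass(masses, deltas)
```

With the numbers above the threshold is first reached at a smoothing mass of
1e12 solar masses, so that is the mass assigned to the particle in this toy
picture.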