Normalized to: Prabhat.
[1]
oai:arXiv.org:1803.00113 [pdf] - 1865287
Approximate Inference for Constructing Astronomical Catalogs from Images
Submitted: 2018-02-28, last modified: 2019-04-09
We present a new, fully generative model for constructing astronomical
catalogs from optical telescope image sets. Each pixel intensity is treated as
a random variable with parameters that depend on the latent properties of stars
and galaxies. These latent properties are themselves modeled as random. We
compare two procedures for posterior inference. One procedure is based on
Markov chain Monte Carlo (MCMC) while the other is based on variational
inference (VI). The MCMC procedure excels at quantifying uncertainty, while the
VI procedure is 1000 times faster. On a supercomputer, the VI procedure
efficiently uses 665,000 CPU cores to construct an astronomical catalog from 50
terabytes of images in 14.6 minutes, demonstrating the scaling characteristics
necessary to construct catalogs for upcoming astronomical surveys.
[2]
oai:arXiv.org:1808.04728 [pdf] - 1782849
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Mathuriya, Amrita;
Bard, Deborah;
Mendygral, Peter;
Meadows, Lawrence;
Arnemann, James;
Shao, Lei;
He, Siyu;
Karna, Tuomas;
Moise, Daina;
Pennycook, Simon J.;
Maschoff, Kristyn;
Sewall, Jason;
Kumar, Nalini;
Ho, Shirley;
Ringenburg, Mike;
Prabhat;
Lee, Victor
Submitted: 2018-08-14, last modified: 2018-11-09
Deep learning is a promising tool to determine the physical model that
describes our universe. To handle the considerable computational cost of this
problem, we present CosmoFlow: a highly scalable deep learning application
built on top of the TensorFlow framework. CosmoFlow uses efficient
implementations of 3D convolution and pooling primitives, together with
improvements in threading for many element-wise operations, to improve training
performance on Intel(C) Xeon Phi(TM) processors. We also utilize the Cray PE
Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate
fully synchronous data-parallel training on 8192 nodes of Cori with 77%
parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our
knowledge, this is the first large-scale science application of the TensorFlow
framework at supercomputer scale with fully-synchronous training. These
enhancements enable us to process large 3D dark matter distributions and predict
the cosmological parameters $\Omega_M$, $\sigma_8$ and $n_s$ with unprecedented
accuracy.
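The scaling figures quoted above can be related with a short calculation. This is a minimal sketch that assumes parallel efficiency means the sustained rate on N nodes divided by N times the single-node rate; the abstract does not state its baseline, so the function names and the derived single-node figure are illustrative only.

```python
def parallel_efficiency(rate_one_node, rate_n_nodes, n_nodes):
    """Fraction of ideal linear scaling achieved on n_nodes.

    Assumed definition: rate_n_nodes / (n_nodes * rate_one_node).
    """
    return rate_n_nodes / (n_nodes * rate_one_node)


def implied_single_node_rate(rate_n_nodes, n_nodes, efficiency):
    """Invert the definition above to recover the single-node rate."""
    return rate_n_nodes / (n_nodes * efficiency)


# Figures taken from the abstract.
sustained_flops = 3.5e15   # 3.5 Pflop/s sustained
n_nodes = 8192
efficiency = 0.77

# Equivalent number of perfectly scaling nodes.
effective_nodes = efficiency * n_nodes

# Single-node rate implied under the assumed definition (not a reported value).
single_node_flops = implied_single_node_rate(sustained_flops, n_nodes, efficiency)

print(f"effective nodes: {effective_nodes:.0f}")
print(f"implied single-node rate: {single_node_flops / 1e12:.3f} Tflop/s")
```

The implied single-node rate is a consequence of the assumed definition, not a number reported in the abstract.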
[3]
oai:arXiv.org:1809.06166 [pdf] - 1751139
Graph Neural Networks for IceCube Signal Classification
Submitted: 2018-09-17
Tasks involving the analysis of geometric (graph- and manifold-structured)
data have recently gained prominence in the machine learning community, giving
birth to a rapidly developing field of geometric deep learning. In this work,
we leverage graph neural networks to improve signal detection in the IceCube
neutrino observatory. The IceCube detector array is modeled as a graph, where
vertices are sensors and edges are a learned function of the sensors' spatial
coordinates. As only a subset of IceCube's sensors is active during a given
observation, we note the adaptive nature of our GNN, wherein computation is
restricted to the input signal support. We demonstrate the effectiveness of our
GNN architecture on a task classifying IceCube events, where it outperforms
both a traditional physics-based method and classical 3D convolutional
neural networks.
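The graph construction described above can be sketched as follows. This is a rough illustration rather than the authors' architecture: the Gaussian kernel with a single width `sigma`, the single convolution layer, the mean pooling, and all weight names are assumptions made here for illustration. It shows edge weights computed from sensor coordinates and computation restricted to the active sensors.

```python
import numpy as np


def gaussian_adjacency(coords, sigma):
    """Row-normalized edge weights from pairwise sensor distances.

    Assumption: a Gaussian kernel whose width sigma would be a learned parameter.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    logits = -sq_dist / sigma ** 2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def graph_conv(features, adjacency, w_self, w_neigh):
    """One graph convolution layer followed by a ReLU."""
    out = features @ w_self + adjacency @ features @ w_neigh
    return np.maximum(out, 0.0)


def classify_event(coords, features, active, sigma, w_self, w_neigh, w_out):
    """Score one event using only the sensors that recorded a signal."""
    coords, features = coords[active], features[active]
    adjacency = gaussian_adjacency(coords, sigma)
    hidden = graph_conv(features, adjacency, w_self, w_neigh)
    pooled = hidden.mean(axis=0)          # pool over active sensors
    logit = pooled @ w_out
    return 1.0 / (1.0 + np.exp(-logit))   # probability of the signal class


# Synthetic stand-in data; a real detector geometry is not modeled here.
rng = np.random.default_rng(0)
n_sensors, n_feat, n_hidden = 100, 4, 8
coords = rng.normal(size=(n_sensors, 3))
features = rng.normal(size=(n_sensors, n_feat))
active = rng.random(n_sensors) < 0.2
active[0] = True                          # guarantee at least one active sensor

w_self = rng.normal(size=(n_feat, n_hidden))
w_neigh = rng.normal(size=(n_feat, n_hidden))
w_out = rng.normal(size=n_hidden)

score = classify_event(coords, features, active, 1.0, w_self, w_neigh, w_out)
print(f"active sensors: {active.sum()}, score: {score:.3f}")
```

Because the adjacency matrix is built only over active sensors, its size and the cost of the layer follow the number of sensors that fired rather than the full array.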
[4]
oai:arXiv.org:1801.10277 [pdf] - 1627837
Cataloging the Visible Universe through Bayesian Inference at Petascale
Regier, Jeffrey;
Pamnany, Kiran;
Fischer, Keno;
Noack, Andreas;
Lam, Maximilian;
Revels, Jarrett;
Howard, Steve;
Giordano, Ryan;
Schlegel, David;
McAuliffe, Jon;
Thomas, Rollin;
Prabhat
Submitted: 2018-01-30
Astronomical catalogs derived from wide-field imaging surveys are an
important tool for understanding the Universe. We construct an astronomical
catalog from 55 TB of imaging data using Celeste, a Bayesian variational
inference code written entirely in the high-productivity programming language
Julia. Using over 1.3 million threads on 650,000 Intel Xeon Phi cores of the
Cori Phase II supercomputer, Celeste achieves a peak rate of 1.54 DP PFLOP/s.
Celeste is able to jointly optimize parameters for 188M stars and galaxies,
loading and processing 178 TB across 8192 nodes in 14.6 minutes. To achieve
this, Celeste exploits parallelism at multiple levels (cluster, node, and
thread) and accelerates I/O through Cori's Burst Buffer. Julia's native
performance enables Celeste to employ high-level constructs without resorting
to hand-written or generated low-level code (C/C++/Fortran), and yet achieve
petascale performance.
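The data volume and run time quoted above imply an average data rate, worked out below. This is a back-of-the-envelope sketch assuming decimal terabytes and an even spread of data across nodes; it gives a run-averaged figure, not a measured I/O bandwidth.

```python
TB = 1e12  # assumption: decimal terabytes


def average_rate(total_bytes, seconds):
    """Average bytes per second over the whole run."""
    return total_bytes / seconds


# Figures taken from the abstract.
total_bytes = 178 * TB
seconds = 14.6 * 60
n_nodes = 8192

aggregate = average_rate(total_bytes, seconds)
per_node = aggregate / n_nodes  # assumes data is spread evenly across nodes

print(f"aggregate: {aggregate / 1e9:.0f} GB/s")
print(f"per node:  {per_node / 1e6:.1f} MB/s")
```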
[5]
oai:arXiv.org:1709.00086 [pdf] - 1587773
Galactos: Computing the Anisotropic 3-Point Correlation Function for 2
Billion Galaxies
Friesen, Brian;
Patwary, Md. Mostofa Ali;
Austin, Brian;
Satish, Nadathur;
Slepian, Zachary;
Sundaram, Narayanan;
Bard, Deborah;
Eisenstein, Daniel J.;
Deslippe, Jack;
Dubey, Pradeep;
Prabhat
Submitted: 2017-08-31
The nature of dark energy and the complete theory of gravity are two central
questions currently facing cosmology. A vital tool for addressing them is the
3-point correlation function (3PCF), which probes deviations from a spatially
random distribution of galaxies. However, the 3PCF's formidable computational
expense has prevented its application to astronomical surveys comprising
millions to billions of galaxies. We present Galactos, a high-performance
implementation of a novel, O(N^2) algorithm that uses a load-balanced k-d tree
and spherical harmonic expansions to compute the anisotropic 3PCF. Our
implementation is optimized for the Intel Xeon Phi architecture, exploiting
SIMD parallelism, instruction and thread concurrency, and significant L1 and L2
cache reuse, reaching 39% of peak performance on a single node. Galactos scales
to the full Cori system, achieving 9.8 PF (peak) and 5.06 PF (sustained) across
9636 nodes, making the 3PCF easily computable for all galaxies in the
observable universe.
[6]
oai:arXiv.org:1611.03404 [pdf] - 1511937
Learning an Astronomical Catalog of the Visible Universe through
Scalable Bayesian Inference
Submitted: 2016-11-10
Celeste is a procedure for inferring astronomical catalogs that attains
state-of-the-art scientific results. To date, Celeste has been scaled to at
most hundreds of megabytes of astronomical images, because Bayesian posterior
inference is notoriously computationally demanding. In this paper, we report on a
scalable, parallel version of Celeste, suitable for learning catalogs from
modern large-scale astronomical datasets. Our algorithmic innovations include a
fast numerical optimization routine for Bayesian posterior inference and a
statistically efficient scheme for decomposing astronomical optimization
problems into subproblems.
Our scalable implementation is written entirely in Julia, a new high-level
dynamic programming language designed for scientific and numerical computing.
We use Julia's high-level constructs for shared and distributed memory
parallelism, and demonstrate effective load balancing and efficient scaling on
up to 8192 Xeon cores on the NERSC Cori supercomputer.
[7]
oai:arXiv.org:1506.01351 [pdf] - 1120672
Celeste: Variational inference for a generative model of astronomical
images
Submitted: 2015-06-03
We present a new, fully generative model of optical telescope image sets,
along with a variational procedure for inference. Each pixel intensity is
treated as a Poisson random variable, with a rate parameter dependent on latent
properties of stars and galaxies. Key latent properties are themselves random,
with scientific prior distributions constructed from large ancillary data sets.
We check our approach on synthetic images. We also run it on images from a
major sky survey, where it exceeds the performance of the current
state-of-the-art method for locating celestial bodies and measuring their
colors.
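The pixel model described above can be illustrated with a toy example. This is a minimal sketch, not the Celeste model itself: the single star, the circular Gaussian point-spread function, the flat background, and every parameter value are assumptions chosen for illustration. It covers only the Poisson likelihood; the priors and the variational procedure are omitted.

```python
import numpy as np


def star_rate_image(shape, position, flux, psf_sigma, background):
    """Expected photon count per pixel for one star on a flat sky background.

    Assumption: the point-spread function is a circular Gaussian.
    """
    rows, cols = np.indices(shape)
    sq_dist = (rows - position[0]) ** 2 + (cols - position[1]) ** 2
    psf = np.exp(-sq_dist / (2 * psf_sigma ** 2)) / (2 * np.pi * psf_sigma ** 2)
    return background + flux * psf


def poisson_log_likelihood(pixels, rate):
    """Poisson log-likelihood, dropping the log(k!) term that is constant in the rate."""
    return np.sum(pixels * np.log(rate) - rate)


rng = np.random.default_rng(0)
shape, position = (32, 32), (15.5, 16.0)

# Generate an image: each pixel is a Poisson draw with its own rate.
true_rate = star_rate_image(shape, position, flux=5000.0, psf_sigma=1.5, background=20.0)
pixels = rng.poisson(true_rate)

# Compare the likelihood at the true flux with a badly wrong flux.
wrong_rate = star_rate_image(shape, position, flux=500.0, psf_sigma=1.5, background=20.0)
ll_true = poisson_log_likelihood(pixels, true_rate)
ll_wrong = poisson_log_likelihood(pixels, wrong_rate)

print(f"log-likelihood at true flux:  {ll_true:.1f}")
print(f"log-likelihood at wrong flux: {ll_wrong:.1f}")
```

Inference then amounts to finding the latent properties (here position and flux) under which the observed pixels are most plausible, combined with prior distributions over those properties.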