Full-text search for arXiv

Prabhat

Normalized to: Prabhat.

7 article(s) in total. 49 co-authors, from 1 to 4 common article(s). Median position in authors list is 8.0.

[1] oai:arXiv.org:1803.00113 [pdf] - 1865287

Approximate Inference for Constructing Astronomical Catalogs from Images

Regier, Jeffrey; Miller, Andrew C.; Schlegel, David; Adams, Ryan P.; McAuliffe, Jon D.; Prabhat

Comments: accepted to the Annals of Applied Statistics

Submitted: 2018-02-28, last modified: 2019-04-09

We present a new, fully generative model for constructing astronomical catalogs from optical telescope image sets. Each pixel intensity is treated as a random variable with parameters that depend on the latent properties of stars and galaxies. These latent properties are themselves modeled as random. We compare two procedures for posterior inference. One procedure is based on Markov chain Monte Carlo (MCMC) while the other is based on variational inference (VI). The MCMC procedure excels at quantifying uncertainty, while the VI procedure is 1000 times faster. On a supercomputer, the VI procedure efficiently uses 665,000 CPU cores to construct an astronomical catalog from 50 terabytes of images in 14.6 minutes, demonstrating the scaling characteristics necessary to construct catalogs for upcoming astronomical surveys.

[2] oai:arXiv.org:1808.04728 [pdf] - 1782849

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Mathuriya, Amrita; Bard, Deborah; Mendygral, Peter; Meadows, Lawrence; Arnemann, James; Shao, Lei; He, Siyu; Karna, Tuomas; Moise, Daina; Pennycook, Simon J.; Maschoff, Kristyn; Sewall, Jason; Kumar, Nalini; Ho, Shirley; Ringenburg, Mike; Prabhat; Lee, Victor

Comments: 11 pages, 6 pages, presented at SuperComputing 2018

Submitted: 2018-08-14, last modified: 2018-11-09

Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel(C) Xeon Phi(TM) processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters $\Omega_M$, $\sigma_8$ and $n_s$ with unprecedented accuracy.
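For orientation, the scaling figures quoted in the abstract above can be turned into an implied speedup and a per-node rate. This is a minimal arithmetic sketch, not the authors' measurement code. It assumes the common definition of parallel efficiency (measured throughput on N nodes divided by N times the single-node throughput), which the abstract does not spell out, and the function names are hypothetical.

```python
def parallel_efficiency(throughput_n, throughput_1, nodes):
    """Assumed definition: measured throughput over ideal linear-scaling throughput."""
    return throughput_n / (nodes * throughput_1)


def implied_speedup(efficiency, nodes):
    """Speedup over one node implied by a given parallel efficiency."""
    return efficiency * nodes


# Figures taken from the abstract: 8192 nodes, 77% efficiency, 3.5 Pflop/s sustained.
nodes = 8192
speedup = implied_speedup(0.77, nodes)

# 3.5 Pflop/s expressed in Tflop/s, averaged over all nodes.
per_node_tflops = 3.5e3 / nodes

print(f"implied speedup over one node: {speedup:.0f}x")
print(f"average sustained rate per node: {per_node_tflops:.3f} Tflop/s")
```

Under this definition, 77% efficiency on 8192 nodes corresponds to roughly the throughput of 6300 ideally scaling nodes.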

[3] oai:arXiv.org:1809.06166 [pdf] - 1751139

Graph Neural Networks for IceCube Signal Classification

Choma, Nicholas; Monti, Federico; Gerhardt, Lisa; Palczewski, Tomasz; Ronaghi, Zahra; Prabhat; Bhimji, Wahid; Bronstein, Michael M.; Klein, Spencer R.; Bruna, Joan

Submitted: 2018-09-17

Tasks involving the analysis of geometric (graph- and manifold-structured) data have recently gained prominence in the machine learning community, giving birth to a rapidly developing field of geometric deep learning. In this work, we leverage graph neural networks to improve signal detection in the IceCube neutrino observatory. The IceCube detector array is modeled as a graph, where vertices are sensors and edges are a learned function of the sensors' spatial coordinates. As only a subset of IceCube's sensors is active during a given observation, we note the adaptive nature of our GNN, wherein computation is restricted to the input signal support. We demonstrate the effectiveness of our GNN architecture on a task classifying IceCube events, where it outperforms both a traditional physics-based method and classical 3D convolutional neural networks.
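One way to picture the graph construction described in the abstract above is a weighted adjacency matrix computed from pairwise sensor distances and built only over the sensors that are active in an event. The sketch below is illustrative only: the Gaussian kernel, the fixed `sigma` (which would be a trainable parameter in a learned model), the random sensor positions, and the function name are all assumptions, not details given in the abstract.

```python
import numpy as np


def gaussian_adjacency(coords, sigma):
    """Row-normalized edge weights from pairwise squared distances (assumed kernel)."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dist_sq = np.sum(diffs**2, axis=-1)
    weights = np.exp(-dist_sq / sigma**2)
    # Normalize each row so the weights leaving a vertex sum to one.
    return weights / weights.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Hypothetical detector: 100 sensors at random 3D positions (meters).
all_coords = rng.uniform(-500.0, 500.0, size=(100, 3))

# Only a subset of sensors is active in a given event; the graph is built on
# that subset, so the cost scales with the signal support, not the full array.
active = rng.random(100) < 0.2
active[0] = True  # guarantee a non-empty event for this demo

adjacency = gaussian_adjacency(all_coords[active], sigma=200.0)
print(adjacency.shape)
```

Because the adjacency is recomputed per event from the active sensors' coordinates, graphs of different sizes can pass through the same network weights.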

[4] oai:arXiv.org:1801.10277 [pdf] - 1627837

Cataloging the Visible Universe through Bayesian Inference at Petascale

Regier, Jeffrey; Pamnany, Kiran; Fischer, Keno; Noack, Andreas; Lam, Maximilian; Revels, Jarrett; Howard, Steve; Giordano, Ryan; Schlegel, David; McAuliffe, Jon; Thomas, Rollin; Prabhat

Comments: accepted to IPDPS 2018

Submitted: 2018-01-30

Astronomical catalogs derived from wide-field imaging surveys are an important tool for understanding the Universe. We construct an astronomical catalog from 55 TB of imaging data using Celeste, a Bayesian variational inference code written entirely in the high-productivity programming language Julia. Using over 1.3 million threads on 650,000 Intel Xeon Phi cores of the Cori Phase II supercomputer, Celeste achieves a peak rate of 1.54 DP PFLOP/s. Celeste is able to jointly optimize parameters for 188M stars and galaxies, loading and processing 178 TB across 8192 nodes in 14.6 minutes. To achieve this, Celeste exploits parallelism at multiple levels (cluster, node, and thread) and accelerates I/O through Cori's Burst Buffer. Julia's native performance enables Celeste to employ high-level constructs without resorting to hand-written or generated low-level code (C/C++/Fortran), and yet achieve petascale performance.

[5] oai:arXiv.org:1709.00086 [pdf] - 1587773

Galactos: Computing the Anisotropic 3-Point Correlation Function for 2 Billion Galaxies

Friesen, Brian; Patwary, Md. Mostofa Ali; Austin, Brian; Satish, Nadathur; Slepian, Zachary; Sundaram, Narayanan; Bard, Deborah; Eisenstein, Daniel J.; Deslippe, Jack; Dubey, Pradeep; Prabhat

Comments: 11 pages, 7 figures, accepted to SuperComputing 2017

Submitted: 2017-08-31

The nature of dark energy and the complete theory of gravity are two central questions currently facing cosmology. A vital tool for addressing them is the 3-point correlation function (3PCF), which probes deviations from a spatially random distribution of galaxies. However, the 3PCF's formidable computational expense has prevented its application to astronomical surveys comprising millions to billions of galaxies. We present Galactos, a high-performance implementation of a novel, O(N^2) algorithm that uses a load-balanced k-d tree and spherical harmonic expansions to compute the anisotropic 3PCF. Our implementation is optimized for the Intel Xeon Phi architecture, exploiting SIMD parallelism, instruction and thread concurrency, and significant L1 and L2 cache reuse, reaching 39% of peak performance on a single node. Galactos scales to the full Cori system, achieving 9.8PF (peak) and 5.06PF (sustained) across 9636 nodes, making the 3PCF easily computable for all galaxies in the observable universe.

[6] oai:arXiv.org:1611.03404 [pdf] - 1511937

Learning an Astronomical Catalog of the Visible Universe through Scalable Bayesian Inference

Regier, Jeffrey; Pamnany, Kiran; Giordano, Ryan; Thomas, Rollin; Schlegel, David; McAuliffe, Jon; Prabhat

Comments: submitting to IPDPS'17

Submitted: 2016-11-10

Celeste is a procedure for inferring astronomical catalogs that attains state-of-the-art scientific results. To date, Celeste has been scaled to at most hundreds of megabytes of astronomical images: Bayesian posterior inference is notoriously demanding computationally. In this paper, we report on a scalable, parallel version of Celeste, suitable for learning catalogs from modern large-scale astronomical datasets. Our algorithmic innovations include a fast numerical optimization routine for Bayesian posterior inference and a statistically efficient scheme for decomposing astronomical optimization problems into subproblems. Our scalable implementation is written entirely in Julia, a new high-level dynamic programming language designed for scientific and numerical computing. We use Julia's high-level constructs for shared and distributed memory parallelism, and demonstrate effective load balancing and efficient scaling on up to 8192 Xeon cores on the NERSC Cori supercomputer.

[7] oai:arXiv.org:1506.01351 [pdf] - 1120672

Celeste: Variational inference for a generative model of astronomical images

Regier, Jeffrey; Miller, Andrew; McAuliffe, Jon; Adams, Ryan; Hoffman, Matt; Lang, Dustin; Schlegel, David; Prabhat

Comments: in the Proceedings of the 32nd International Conference on Machine Learning (2015)

Submitted: 2015-06-03

We present a new, fully generative model of optical telescope image sets, along with a variational procedure for inference. Each pixel intensity is treated as a Poisson random variable, with a rate parameter dependent on latent properties of stars and galaxies. Key latent properties are themselves random, with scientific prior distributions constructed from large ancillary data sets. We check our approach on synthetic images. We also run it on images from a major sky survey, where it exceeds the performance of the current state-of-the-art method for locating celestial bodies and measuring their colors.
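The pixel-level model in the abstract above can be sketched for the simplest case of a single point source. This is a toy illustration under stated assumptions, not the Celeste model itself: the circular Gaussian point-spread function, the flat sky background, the fixed parameter values, and all function names are hypothetical, and the real model also covers galaxies, multiple bands, and priors on the latent properties.

```python
import numpy as np


def pixel_rates(height, width, center, flux, psf_sigma, sky):
    """Expected photon count per pixel: flat sky plus one star blurred by a Gaussian PSF."""
    ys, xs = np.mgrid[0:height, 0:width]
    r_sq = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    psf = np.exp(-0.5 * r_sq / psf_sigma**2) / (2.0 * np.pi * psf_sigma**2)
    return sky + flux * psf


def poisson_log_likelihood(image, rates):
    """Poisson log-likelihood up to the log(k!) term, which does not depend on the rates."""
    return float(np.sum(image * np.log(rates) - rates))


rng = np.random.default_rng(0)

# Latent properties of the source (position, brightness) set the Poisson rates.
rates = pixel_rates(32, 32, center=(16.0, 16.0), flux=5000.0, psf_sigma=1.5, sky=10.0)

# Each observed pixel intensity is an independent Poisson draw at its rate.
image = rng.poisson(rates)

# The same image scored under a model with no star at all.
sky_only = np.full_like(rates, 10.0)
print(poisson_log_likelihood(image, rates) > poisson_log_likelihood(image, sky_only))
```

Inference runs this in reverse: given the image, find the latent position and flux (or a distribution over them) under which the observed counts are most plausible.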