Normalized to: Karna, T.
[1]
oai:arXiv.org:1808.04728 [pdf] - 1782849
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Mathuriya, Amrita;
Bard, Deborah;
Mendygral, Peter;
Meadows, Lawrence;
Arnemann, James;
Shao, Lei;
He, Siyu;
Karna, Tuomas;
Moise, Daina;
Pennycook, Simon J.;
Maschoff, Kristyn;
Sewall, Jason;
Kumar, Nalini;
Ho, Shirley;
Ringenburg, Mike;
Prabhat;
Lee, Victor
Submitted: 2018-08-14, last modified: 2018-11-09
Deep learning is a promising tool to determine the physical model that
describes our universe. To handle the considerable computational cost of this
problem, we present CosmoFlow: a highly scalable deep learning application
built on top of the TensorFlow framework. CosmoFlow uses efficient
implementations of 3D convolution and pooling primitives, together with
improvements in threading for many element-wise operations, to improve training
performance on Intel(C) Xeon Phi(TM) processors. We also utilize the Cray PE
Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate
fully synchronous data-parallel training on 8192 nodes of Cori with 77%
parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our
knowledge, this is the first large-scale science application of the TensorFlow
framework at supercomputer scale with fully synchronous training. These
enhancements enable us to process large 3D dark matter distributions and predict
the cosmological parameters $\Omega_M$, $\sigma_8$ and $n_s$ with unprecedented
accuracy.
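
The scaling figures quoted in the abstract can be unpacked with a little arithmetic. The sketch below is illustrative only: it assumes parallel efficiency means achieved speedup divided by node count (the abstract does not state the definition used), and the function names are hypothetical. It takes only the three numbers given above (8192 nodes, 77% efficiency, 3.5 Pflop/s sustained).

```python
# Illustrative arithmetic on the figures quoted in the abstract.
# Assumption: parallel efficiency = achieved speedup / number of nodes.
# The function names are hypothetical and not taken from CosmoFlow.

def effective_speedup(nodes: int, efficiency: float) -> float:
    """Speedup over one node implied by a given parallel efficiency."""
    return nodes * efficiency


def per_node_rate(total_flops: float, nodes: int) -> float:
    """Average sustained flop/s per node."""
    return total_flops / nodes


NODES = 8192          # nodes of Cori used in the largest run
EFFICIENCY = 0.77     # reported parallel efficiency
SUSTAINED = 3.5e15    # reported sustained performance, in flop/s

speedup = effective_speedup(NODES, EFFICIENCY)
rate = per_node_rate(SUSTAINED, NODES)

print(f"effective speedup: about {speedup:.0f}x a single node")
print(f"sustained rate per node: about {rate / 1e9:.0f} Gflop/s")
```

Under that assumed definition, the run behaves like roughly 6300 perfectly scaled nodes, and each node sustains on the order of 430 Gflop/s on average.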