Normalized to: Jäykkä, J.
[1]
oai:arXiv.org:1503.08809 [pdf] - 1347438
Separable projection integrals for higher-order correlators of the
cosmic microwave sky: Acceleration by factors exceeding 100
Submitted: 2015-03-30, last modified: 2016-01-26
We present a case study describing efforts to optimise and modernise "Modal",
the simulation and analysis pipeline used by the Planck satellite experiment
for constraining general non-Gaussian models of the early universe via the
bispectrum (or three-point correlator) of the cosmic microwave background
radiation. We focus on one particular element of the code: the projection of
bispectra from the end of inflation to the spherical shell at decoupling, which
defines the CMB we observe today. This code involves a three-dimensional inner
product between two functions, one of which requires an integral, on a
non-rectangular domain containing a sparse grid. We show that by employing
separable methods this calculation can be reduced to a one-dimensional
summation plus two integrations, reducing the overall dimensionality from four
to three. The introduction of separable functions also solves the issue of the
non-rectangular sparse grid. Because this separable method can become unstable
in certain cases, the slower non-separable integral must then be calculated
instead. We discuss the optimisation of both approaches. We
show significant speed-ups of ~100x, arising from a combination of algorithmic
improvements and architecture-aware optimisations targeted at improving thread
and vectorisation behaviour. The resulting MPI/OpenMP hybrid code is capable of
executing on clusters containing processors and/or coprocessors, with
strong-scaling efficiency of 98.6% on up to 16 nodes. We find that a single
coprocessor outperforms two processor sockets by a factor of 1.3x and that
running the same code across a combination of both microarchitectures improves
performance-per-node by a factor of 3.38x. By making bispectrum calculations
competitive with those for the power spectrum (or two-point correlator), we can
now consider joint analyses for the cosmological science exploitation of
new data.
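
The dimensionality reduction obtained from separability can be illustrated with a minimal sketch. It is not the Modal code and does not reproduce the projection integral itself. It assumes a full rectangular grid, a single separable term per function, and hypothetical names (inner_product_direct, inner_product_separable, demo) chosen here for illustration. When both functions factorise as f(x,y,z) = fa(x) fb(y) fc(z), the three-dimensional weighted inner product collapses into a product of three one-dimensional sums, so the cost drops from O(n^3) to O(n).

```python
import numpy as np


def inner_product_direct(f, g, w):
    """Brute-force 3D weighted inner product: sum_ijk w_i w_j w_k f_ijk g_ijk."""
    return np.einsum("i,j,k,ijk,ijk->", w, w, w, f, g)


def inner_product_separable(fa, fb, fc, ga, gb, gc, w):
    """Same quantity when f = fa*fb*fc and g = ga*gb*gc (outer products).

    Each factor is a 1D weighted sum, so the cost is O(n) rather than O(n^3).
    """
    return np.sum(w * fa * ga) * np.sum(w * fb * gb) * np.sum(w * fc * gc)


def demo(n=16, seed=0):
    """Compare both evaluations on random separable test functions (assumed data)."""
    rng = np.random.default_rng(seed)
    w = np.full(n, 1.0 / n)  # uniform quadrature weights, an assumption
    fa, fb, fc, ga, gb, gc = rng.standard_normal((6, n))
    # Build the full 3D arrays only so the brute-force path has something to sum.
    f = np.einsum("i,j,k->ijk", fa, fb, fc)
    g = np.einsum("i,j,k->ijk", ga, gb, gc)
    direct = inner_product_direct(f, g, w)
    separable = inner_product_separable(fa, fb, fc, ga, gb, gc, w)
    return direct, separable


if __name__ == "__main__":
    direct, separable = demo()
    print(direct, separable, np.isclose(direct, separable))
```

The two evaluations agree to floating-point rounding. A function expanded as a sum of several separable terms would be handled by applying the same factorisation term by term. The non-rectangular domain and the stability issues mentioned above are outside the scope of this sketch.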