Normalized to: Karakasis, V.
[1]
oai:arXiv.org:1612.06090 [pdf] - 1580943
Performance Optimisation of Smoothed Particle Hydrodynamics Algorithms
for Multi/Many-Core Architectures
Submitted: 2016-12-19, last modified: 2017-05-10
We describe a strategy for code modernisation of Gadget, a widely used
community code for computational astrophysics. The focus of this work is on
node-level performance optimisation, targeting current multi/many-core Intel®
architectures. We identify and isolate a sample code kernel, which is
representative of a typical Smoothed Particle Hydrodynamics (SPH) algorithm.
The code modifications include threading parallelism optimisation, change of
the data layout into Structure of Arrays (SoA), auto-vectorisation and
algorithmic improvements in the particle sorting. We obtain shorter execution
time and improved threading scalability both on Intel Xeon® ($2.6 \times$ on
Ivy Bridge) and Xeon Phi™ ($13.7 \times$ on Knights Corner) systems. Initial
tests of the optimised code result in $19.1 \times$ faster execution on second
generation Xeon Phi (Knights Landing), thus demonstrating the portability of
the devised optimisation solutions to upcoming architectures.
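
The change of data layout mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from Gadget: the class names, the particle fields (`x`, `y`, `z`, `mass`) and the toy reduction are assumptions chosen for illustration. It only shows the difference between an Array of Structures (one record per particle) and a Structure of Arrays (one contiguous array per field).

```python
from array import array
from dataclasses import dataclass


@dataclass
class ParticleAoS:
    """One record per particle (Array of Structures). Field names are hypothetical."""
    x: float
    y: float
    z: float
    mass: float


class ParticlesSoA:
    """One contiguous array per field (Structure of Arrays)."""

    def __init__(self, particles):
        # Each field of every particle is copied into its own typed array.
        self.x = array("d", (p.x for p in particles))
        self.y = array("d", (p.y for p in particles))
        self.z = array("d", (p.z for p in particles))
        self.mass = array("d", (p.mass for p in particles))

    def __len__(self):
        return len(self.mass)


def total_mass_aos(particles):
    # Visits every record and picks one field out of each.
    return sum(p.mass for p in particles)


def total_mass_soa(soa):
    # Streams through a single contiguous array of doubles.
    return sum(soa.mass)


particles = [ParticleAoS(0.0, 0.0, 0.0, 1.0), ParticleAoS(1.0, 2.0, 3.0, 2.5)]
soa = ParticlesSoA(particles)
print(total_mass_aos(particles), total_mass_soa(soa))
```

In a compiled language, the contiguous per-field arrays of the SoA form are what allow a compiler to emit unit-stride vector loads; the Python version above shows only the layout, not the performance effect.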
[2]
oai:arXiv.org:1609.01507 [pdf] - 1475588
Extreme Scale-out SuperMUC Phase 2 - lessons learned
Hammer, Nicolay;
Jamitzky, Ferdinand;
Satzger, Helmut;
Allalen, Momme;
Block, Alexander;
Karmakar, Anupam;
Brehm, Matthias;
Bader, Reinhold;
Iapichino, Luigi;
Ragagnin, Antonio;
Karakasis, Vasilios;
Kranzlmüller, Dieter;
Bode, Arndt;
Huber, Herbert;
Kühn, Martin;
Machado, Rui;
Grünewald, Daniel;
Edelmann, Philipp V. F.;
Röpke, Friedrich K.;
Wittmann, Markus;
Zeiser, Thomas;
Wellein, Gerhard;
Mathias, Gerald;
Schwörer, Magnus;
Lorenzen, Konstantin;
Federrath, Christoph;
Klessen, Ralf;
Bamberg, Karl-Ulrich;
Ruhl, Hartmut;
Schornbaum, Florian;
Bauer, Martin;
Nikhil, Anand;
Qi, Jiaxing;
Klimach, Harald;
Stüben, Hinnerk;
Deshmukh, Abhishek;
Falkenstein, Tobias;
Dolag, Klaus;
Petkova, Margarita
Submitted: 2016-09-06
In spring 2015, the Leibniz Supercomputing Centre (Leibniz-Rechenzentrum,
LRZ) installed its new Peta-Scale System SuperMUC Phase 2. Selected users
were invited to a 28-day extreme scale-out block operation, during which they
were allowed to use the full system for their applications. The following
projects participated in the extreme scale-out workshop: BQCD (Quantum
Physics), SeisSol (Geophysics, Seismics), GPI-2/GASPI (Toolkit for HPC),
Seven-League Hydro (Astrophysics), ILBDC (Lattice Boltzmann CFD), Iphigenie
(Molecular Dynamics), FLASH (Astrophysics), GADGET (Cosmological Dynamics), PSC
(Plasma Physics), waLBerla (Lattice Boltzmann CFD), Musubi (Lattice Boltzmann
CFD), Vertex3D (Stellar Astrophysics), CIAO (Combustion CFD), and LS1-Mardyn
(Material Science). The projects were allowed to use the machine exclusively
during the 28-day period, which corresponds to a total of 63.4 million
core-hours, of which 43.8 million core-hours were used by the applications,
resulting in a utilization of 69%. The top three users consumed 15.2, 6.4, and
4.7 million core-hours, respectively.
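
The utilization figure can be checked directly from the numbers quoted in the abstract. The short sketch below does only that; the variable and function names are illustrative, and the combined share of the top three users is a derived quantity that the abstract does not itself state.

```python
# Core-hour figures as quoted in the abstract, in millions of core-hours.
AVAILABLE_MCOREH = 63.4
USED_MCOREH = 43.8
TOP3_MCOREH = (15.2, 6.4, 4.7)


def fraction(part, whole):
    """Return part / whole as a plain ratio."""
    return part / whole


# Share of the available core-hours actually consumed by applications.
utilization = fraction(USED_MCOREH, AVAILABLE_MCOREH)

# Derived, not quoted in the abstract: share of the used core-hours
# that went to the three largest consumers.
top3_share = fraction(sum(TOP3_MCOREH), USED_MCOREH)

print(f"utilization: {utilization:.0%}")
print(f"top-3 share of used core-hours: {top3_share:.0%}")
```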