Normalized to: Bode, A.
[1]
oai:arXiv.org:1609.01507 [pdf] - 1475588
Extreme Scale-out SuperMUC Phase 2 - lessons learned
Hammer, Nicolay;
Jamitzky, Ferdinand;
Satzger, Helmut;
Allalen, Momme;
Block, Alexander;
Karmakar, Anupam;
Brehm, Matthias;
Bader, Reinhold;
Iapichino, Luigi;
Ragagnin, Antonio;
Karakasis, Vasilios;
Kranzlmüller, Dieter;
Bode, Arndt;
Huber, Herbert;
Kühn, Martin;
Machado, Rui;
Grünewald, Daniel;
Edelmann, Philipp V. F.;
Röpke, Friedrich K.;
Wittmann, Markus;
Zeiser, Thomas;
Wellein, Gerhard;
Mathias, Gerald;
Schwörer, Magnus;
Lorenzen, Konstantin;
Federrath, Christoph;
Klessen, Ralf;
Bamberg, Karl-Ulrich;
Ruhl, Hartmut;
Schornbaum, Florian;
Bauer, Martin;
Nikhil, Anand;
Qi, Jiaxing;
Klimach, Harald;
Stüben, Hinnerk;
Deshmukh, Abhishek;
Falkenstein, Tobias;
Dolag, Klaus;
Petkova, Margarita
Submitted: 2016-09-06
In spring 2015, the Leibniz Supercomputing Centre (Leibniz-Rechenzentrum,
LRZ) installed its new petascale system, SuperMUC Phase 2. Selected users
were invited to a 28-day extreme scale-out block operation during which they
were allowed to use the full system for their applications. The following
projects participated in the extreme scale-out workshop: BQCD (Quantum
Physics), SeisSol (Geophysics, Seismics), GPI-2/GASPI (Toolkit for HPC),
Seven-League Hydro (Astrophysics), ILBDC (Lattice Boltzmann CFD), Iphigenie
(Molecular Dynamics), FLASH (Astrophysics), GADGET (Cosmological Dynamics), PSC
(Plasma Physics), waLBerla (Lattice Boltzmann CFD), Musubi (Lattice Boltzmann
CFD), Vertex3D (Stellar Astrophysics), CIAO (Combustion CFD), and LS1-Mardyn
(Material Science). The projects were allowed to use the machine exclusively
during the 28-day period, which corresponds to a total of 63.4 million
core-hours, of which 43.8 million core-hours were used by the applications,
resulting in a utilization of 69%. The top three users consumed 15.2, 6.4, and
4.7 million core-hours, respectively.
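The utilization figure can be checked from the numbers quoted in the abstract. The short Python sketch below redoes that arithmetic and also derives the share of the used core-hours taken by the top three users; the variable names are illustrative, and the top-three share is a derived value that the abstract does not state.

```python
# Figures quoted in the abstract, in millions of core-hours.
available = 63.4               # total capacity over the block operation
used = 43.8                    # consumed by the applications
top_users = [15.2, 6.4, 4.7]   # three largest consumers

# Fraction of the available core-hours that was actually used.
utilization = used / available

# Fraction of the used core-hours that went to the top three users
# (derived here, not stated in the abstract).
top3_share = sum(top_users) / used

print(f"utilization: {utilization:.1%}")
print(f"top-3 share of used core-hours: {top3_share:.1%}")
```

The first ratio rounds to the 69% given in the abstract; the second comes out at roughly 60%.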
[2]
oai:arXiv.org:hep-lat/9507021 [pdf] - 113407
Hyper-Systolic Parallel Computing
Submitted: 1995-07-25
A new class of parallel algorithms is introduced that can achieve a
complexity of O(n^(3/2)) with respect to the interprocessor communication, in
the exact computation of systems with pairwise mutual interactions of all
elements. Hitherto, conventional methods exhibit a communication complexity of
O(n^2). The number of computational operations is not altered for the new
algorithm, which can be formulated as a kind of h-range problem, known from the
mathematical field of Additive Number Theory. We will demonstrate the reduction
in communication expense by comparing the standard-systolic algorithm and the
new algorithm on the Connection Machine CM-5 and the Cray T3D. The parallel
method can be useful in various scientific and engineering fields such as exact
n-body dynamics with long-range forces, polymer chains, protein folding, or
signal processing.
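The communication saving described in the abstract can be illustrated with a small counting experiment. The Python sketch below is not the algorithm from the paper: it assumes n elements on a ring of n processors, counts one ring shift of any stride as a single communication step, ignores the back-shifting of partial results, and uses a simple "regular" set of shifts (0, 1, ..., K-1 plus multiples of K, with K about sqrt(n)) chosen here for illustration. The function names are hypothetical. The point is only that about 2*sqrt(n) shifted copies suffice for every pair distance to appear as a difference of two stored shifts, whereas a standard systolic sweep needs n-1 shifts.

```python
import math


def regular_base(n):
    """Illustrative shift set: 0..K-1 plus multiples of K, reduced mod n."""
    k = math.isqrt(n - 1) + 1      # K = ceil(sqrt(n)) for n >= 1
    m = -(-(n - 1) // k)           # ceil((n - 1) / K)
    shifts = set(range(k)) | {(j * k) % n for j in range(1, m + 1)}
    return sorted(shifts)


def covers_all_distances(n, shifts):
    """True if every ring distance 1..n-1 is a difference of two shifts."""
    diffs = {(a - b) % n for a in shifts for b in shifts}
    return all(d in diffs for d in range(1, n))


def shift_steps(n):
    """Return (standard systolic steps, steps with the illustrative base)."""
    # Each stored copy beyond the unshifted one costs one shift step.
    return n - 1, len(regular_base(n)) - 1


if __name__ == "__main__":
    for n in (16, 256, 1024):
        base = regular_base(n)
        assert covers_all_distances(n, base)
        systolic, hyper = shift_steps(n)
        # Every step moves n elements in total across the ring.
        print(n, systolic * n, hyper * n)
```

With these assumptions, n = 1024 needs 62 shift steps instead of 1023. Since every step moves n elements, the total traffic scales like n * sqrt(n) rather than n^2, which matches the orders quoted in the abstract.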