Normalized to: Hammer, N.
[1]
oai:arXiv.org:1810.09898 [pdf] - 1779727
Exploiting the Space Filling Curve Ordering of Particles in the
Neighbour Search of Gadget3
Submitted: 2018-10-23
Gadget3 is nowadays one of the most frequently used high performing parallel
codes for cosmological hydrodynamical simulations. Recent analyses have shown
that the Neighbour Search process of Gadget3 is one of the most
time-consuming parts. Thus, a considerable speedup can be expected from
improvements of the underlying algorithms. In this work we propose a novel
approach for speeding up the Neighbour Search which takes advantage of the
space-filling-curve particle ordering. Instead of performing Neighbour Search
for all particles individually, nearby active particles can be grouped and one
single Neighbour Search can be performed to obtain a common superset of
neighbours. Thus, with this approach we reduce the number of searches. On the
other hand, tree walks are performed within a larger searching radius. There is
an optimal size of grouping that maximizes the speedup, which we found by
numerical experiments. We tested the algorithm within the boxes of the
Magneticum project. As a result we obtained a speedup of $1.65$ in the Density
and of $1.30$ in the Hydrodynamics computation, respectively, and a total
speedup of $1.34$.
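The grouping idea in the abstract above can be illustrated with a minimal sketch. This is not the Gadget3 implementation: the function name `grouped_neighbour_search`, the fixed `group_size`, the use of the group centroid as search centre, and the brute-force distance test standing in for the tree walk are all assumptions made for illustration. Particles are assumed to be already sorted along the space-filling curve, so that consecutive indices are spatially close.

```python
import numpy as np

def grouped_neighbour_search(pos, h, group_size):
    """Return one array of neighbour indices per particle.

    pos: (N, 3) positions, assumed sorted along a space-filling curve.
    h:   (N,) search radius of each particle.
    group_size: number of consecutive particles sharing one search
                (hypothetical tuning parameter).
    """
    n = len(pos)
    neighbours = [None] * n
    for start in range(0, n, group_size):
        idx = np.arange(start, min(start + group_size, n))
        centre = pos[idx].mean(axis=0)
        # Enlarged radius: large enough to contain the search sphere
        # of every member of the group (triangle inequality).
        radius = np.max(np.linalg.norm(pos[idx] - centre, axis=1) + h[idx])
        # One search per group. Brute force replaces the tree walk here.
        dist_to_centre = np.linalg.norm(pos - centre, axis=1)
        superset = np.flatnonzero(dist_to_centre <= radius)
        # Each member filters the common superset with its own radius.
        for i in idx:
            d = np.linalg.norm(pos[superset] - pos[i], axis=1)
            neighbours[i] = superset[d <= h[i]]
    return neighbours
```

The sketch shows the trade-off the abstract describes: a larger `group_size` means fewer searches, but each search uses a larger radius and returns a larger superset that every member must filter.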
[2]
oai:arXiv.org:1612.06380 [pdf] - 1769924
A web portal for hydrodynamical, cosmological simulations
Submitted: 2016-12-19, last modified: 2018-10-19
This article describes a data center hosting a web portal for accessing and
sharing the output of large, cosmological, hydro-dynamical simulations with a
broad scientific community. It also allows users to receive related scientific
data products by directly processing the raw simulation data on a remote
computing cluster. The data center has a multi-layer structure: a web portal, a
job control layer, a computing cluster and a HPC storage system. The outer
layer enables users to choose an object from the simulations. Objects can be
selected by visually inspecting 2D maps of the simulation data, by performing
highly compounded and elaborated queries or graphically by plotting arbitrary
combinations of properties. The user can run analysis tools on a chosen object.
These services allow users to run analysis tools on the raw simulation data.
The job control layer is responsible for handling and performing the analysis
jobs, which are executed on a computing cluster. The innermost layer is formed
by a HPC storage system which hosts the large, raw simulation data. The
following services are available for the users: (I) ClusterInspect
visualizes properties of member galaxies of a selected galaxy cluster; (II)
SimCut returns the raw data of a sub-volume around a selected object from
a simulation, containing all the original, hydro-dynamical quantities; (III)
Smac creates idealised 2D maps of various, physical quantities and
observables of a selected object; (IV) Phox generates virtual X-ray
observations with specifications of various current and upcoming instruments.
[3]
oai:arXiv.org:1612.06090 [pdf] - 1580943
Performance Optimisation of Smoothed Particle Hydrodynamics Algorithms
for Multi/Many-Core Architectures
Submitted: 2016-12-19, last modified: 2017-05-10
We describe a strategy for code modernisation of Gadget, a widely used
community code for computational astrophysics. The focus of this work is on
node-level performance optimisation, targeting current multi/many-core Intel®
architectures. We identify and isolate a sample code kernel, which is
representative of a typical Smoothed Particle Hydrodynamics (SPH) algorithm.
The code modifications include threading parallelism optimisation, change of
the data layout into Structure of Arrays (SoA), auto-vectorisation and
algorithmic improvements in the particle sorting. We obtain shorter execution
time and improved threading scalability both on Intel Xeon® ($2.6 \times$ on
Ivy Bridge) and Xeon Phi™ ($13.7 \times$ on Knights Corner) systems. First few
tests of the optimised code result in $19.1 \times$ faster execution on second
generation Xeon Phi (Knights Landing), thus demonstrating the portability of
the devised optimisation solutions to upcoming architectures.
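The change of data layout mentioned in the abstract above, from Array of Structures (AoS) to Structure of Arrays (SoA), can be illustrated with a small NumPy sketch. The field names, the helper functions `make_aos`, `aos_to_soa` and `centre_of_mass`, and the choice of kernel are assumptions for illustration only and are not taken from the Gadget code.

```python
import numpy as np

# Hypothetical particle record: position components and mass.
PARTICLE = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8"), ("mass", "f8")])

def make_aos(n, seed=0):
    """Array of Structures: one 32-byte record per particle."""
    rng = np.random.default_rng(seed)
    aos = np.zeros(n, dtype=PARTICLE)
    for name in ("x", "y", "z"):
        aos[name] = rng.random(n)
    aos["mass"] = 1.0 + rng.random(n)
    return aos

def aos_to_soa(aos):
    """Structure of Arrays: one contiguous array per field."""
    return {name: np.ascontiguousarray(aos[name]) for name in aos.dtype.names}

def centre_of_mass(p):
    """Works on either layout, since both are indexed by field name."""
    m = p["mass"]
    return np.array([np.sum(m * p[k]) for k in ("x", "y", "z")]) / np.sum(m)

aos = make_aos(1000)
soa = aos_to_soa(aos)
# In AoS, consecutive x values are 32 bytes apart; in SoA, 8 bytes.
print(aos["x"].strides, soa["x"].strides)
```

With the SoA layout each field is stored with unit stride, which is the access pattern that vectorised loops over a single quantity generally favour.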
[4]
oai:arXiv.org:1609.01507 [pdf] - 1475588
Extreme Scale-out SuperMUC Phase 2 - lessons learned
Hammer, Nicolay;
Jamitzky, Ferdinand;
Satzger, Helmut;
Allalen, Momme;
Block, Alexander;
Karmakar, Anupam;
Brehm, Matthias;
Bader, Reinhold;
Iapichino, Luigi;
Ragagnin, Antonio;
Karakasis, Vasilios;
Kranzlmüller, Dieter;
Bode, Arndt;
Huber, Herbert;
Kühn, Martin;
Machado, Rui;
Grünewald, Daniel;
Edelmann, Philipp V. F.;
Röpke, Friedrich K.;
Wittmann, Markus;
Zeiser, Thomas;
Wellein, Gerhard;
Mathias, Gerald;
Schwörer, Magnus;
Lorenzen, Konstantin;
Federrath, Christoph;
Klessen, Ralf;
Bamberg, Karl-Ulrich;
Ruhl, Hartmut;
Schornbaum, Florian;
Bauer, Martin;
Nikhil, Anand;
Qi, Jiaxing;
Klimach, Harald;
Stüben, Hinnerk;
Deshmukh, Abhishek;
Falkenstein, Tobias;
Dolag, Klaus;
Petkova, Margarita
Submitted: 2016-09-06
In spring 2015, the Leibniz Supercomputing Centre (Leibniz-Rechenzentrum,
LRZ) installed their new Peta-Scale System SuperMUC Phase2. Selected users
were invited for a 28-day extreme scale-out block operation during which they
were allowed to use the full system for their applications. The following
projects participated in the extreme scale-out workshop: BQCD (Quantum
Physics), SeisSol (Geophysics, Seismics), GPI-2/GASPI (Toolkit for HPC),
Seven-League Hydro (Astrophysics), ILBDC (Lattice Boltzmann CFD), Iphigenie
(Molecular Dynamics), FLASH (Astrophysics), GADGET (Cosmological Dynamics), PSC
(Plasma Physics), waLBerla (Lattice Boltzmann CFD), Musubi (Lattice Boltzmann
CFD), Vertex3D (Stellar Astrophysics), CIAO (Combustion CFD), and LS1-Mardyn
(Material Science). The projects were allowed to use the machine exclusively
during the 28-day period, which corresponds to a total of 63.4 million
core-hours, of which 43.8 million core-hours were used by the applications,
resulting in a utilization of 69%. The top 3 users were using 15.2, 6.4, and
4.7 million core-hours, respectively.
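The utilization figure quoted in the abstract above follows directly from the two core-hour totals. The short check below uses only numbers stated in the abstract; the function name `utilisation` is a hypothetical helper.

```python
def utilisation(used, available):
    """Fraction of the available core-hours actually consumed."""
    return used / available

available = 63.4            # million core-hours in the 28-day period
used = 43.8                 # million core-hours used by the applications
top3 = [15.2, 6.4, 4.7]     # million core-hours of the three largest users

u = utilisation(used, available)
top3_total = sum(top3)
top3_share = top3_total / used

print(f"utilisation: {100 * u:.0f}%")
print(f"top 3 users: {top3_total:.1f} million core-hours, "
      f"{100 * top3_share:.0f}% of the used time")
```

The ratio 43.8 / 63.4 rounds to the 69% stated in the abstract.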
[5]
oai:arXiv.org:1607.00630 [pdf] - 1432405
The world's largest turbulence simulations
Submitted: 2016-07-03
Understanding turbulence is critical for a wide range of terrestrial and
astrophysical applications. Here we present first results of the world's
highest-resolution simulation of turbulence ever done. The current simulation
has a grid resolution of 10048^3 points and was performed on 65536 compute
cores on SuperMUC at the Leibniz Supercomputing Centre (LRZ). We present a
scaling test of our modified version of the FLASH code, which updates the
hydrodynamical equations in less than 3 microseconds per cell per time step. A
first look at the column density structure of the 10048^3 simulation is
presented and a detailed analysis is provided in a forthcoming paper.
[6]
oai:arXiv.org:0908.3474 [pdf] - 1017191
Three-Dimensional Simulations of Mixing Instabilities in Supernova
Explosions
Submitted: 2009-08-24, last modified: 2010-03-31
We present the first three-dimensional (3D) simulations of the large-scale
mixing that takes place in the shock-heated stellar layers ejected in the
explosion of a 15.5 solar-mass blue supergiant star. The outgoing supernova
shock is followed from its launch by neutrino heating until it breaks out from
the stellar surface more than two hours after the core collapse. Violent
convective overturn in the post-shock layer causes the explosion to start with
significant asphericity, which triggers the growth of Rayleigh-Taylor (RT)
instabilities at the composition interfaces of the exploding star. Deep inward
mixing of hydrogen (H) is found as well as fast-moving, metal-rich clumps
penetrating with high velocities far into the H-envelope of the star as
observed, e.g., in the case of SN 1987A. Also individual clumps containing a
sizeable fraction of the ejected iron-group elements (up to several 0.001 solar
masses) are obtained in some models. The metal core of the progenitor is
partially turned over with Ni-dominated fingers overtaking oxygen-rich bullets
and both Ni and O moving well ahead of the material from the carbon layer.
Comparing with corresponding 2D (axially symmetric) calculations, we determine
the growth of the RT fingers to be faster, the deceleration of the dense
metal-carrying clumps in the He and H layers to be reduced, the asymptotic
clump velocities in the H-shell to be higher (up to ~4500 km/s for the
considered progenitor and an explosion energy of 10^{51} ergs, instead of <2000
km/s in 2D), and the outward radial mixing of heavy elements and inward mixing
of hydrogen to be more efficient in 3D than in 2D. We present a simple argument
that explains these results as a consequence of the different action of drag
forces on moving objects in the two geometries. (abridged)
[7]
oai:arXiv.org:1003.1633 [pdf] - 173541
An axis-free overset grid in spherical polar coordinates for simulating
3D self-gravitating flows
Submitted: 2010-03-08
A type of overlapping grid in spherical coordinates called the Yin-Yang grid
is successfully implemented into a 3D version of the explicit Eulerian
grid-based code PROMETHEUS including self-gravity. The modified code
successfully passed several standard hydrodynamic tests producing results which
are in very good agreement with analytic solutions. Moreover, the solutions
obtained with the Yin-Yang grid exhibit no peculiar behaviour at the boundary
between the two grid patches. The code has also been successfully used to model
astrophysically relevant situations, namely equilibrium polytropes, a
Taylor-Sedov explosion, and Rayleigh-Taylor instabilities. According to our
results, the usage of the Yin-Yang grid greatly enhances the suitability and
efficiency of 3D explicit Eulerian codes based on spherical polar coordinates
for astrophysical flows.
[8]
oai:arXiv.org:astro-ph/0601546 [pdf] - 79404
VLT spectroscopy and non-LTE modeling of the C/O-dominated accretion
disks in two ultracompact X-ray binaries
Submitted: 2006-01-24
We present new medium-resolution high-S/N optical spectra of the ultracompact
low-mass X-ray binaries 4U0614+091 and 4U1626-67, taken with the ESO Very Large
Telescope. They are pure emission line spectra and the lines are identified as
due to C II-IV and O II-III. Line identification is corroborated by first
results from modeling the disk spectra with detailed non-LTE radiation transfer
calculations. Hydrogen and helium lines are lacking in the observed spectra.
Our models confirm the deficiency of H and He in the disks. The lack of neon
lines suggests an Ne abundance of less than about 10 percent (by mass);
however, this result is uncertain due to possible shortcomings in the model
atom. These findings suggest that the donor stars are eroded cores of C/O white
dwarfs with no excessive neon overabundance. This would contradict earlier
claims of Ne enrichment concluded from X-ray observations of circumbinary
material, which was explained by crystallization and fractionation of the white
dwarf core.
[9]
oai:arXiv.org:astro-ph/0410690 [pdf] - 68539
On Possible Oxygen/Neon White Dwarfs: H1504+65 and the White Dwarf
Donors in Ultracompact X-ray Binaries
Submitted: 2004-10-28
We discuss the possibility of detecting O/Ne white dwarfs by evidence for Ne
overabundances. The hottest known WD, H1504+65, could be the only single WD for
which we might be able to prove its O/Ne nature directly. Apart from this,
strong Ne abundances are known or suspected only from binary systems, namely
from a few novae, and from a handful of LMXBs. We try to verify the hypothesis
that the latter might host strongly ablated O/Ne WD donors, by abundance
analyses of the accretion disks in these systems. In any case, inferring
O/Ne WDs from strong Ne overabundances alone is problematic, because Ne enrichment
also occurs by settling of this species into the core of C/O WDs.
[10]
oai:arXiv.org:astro-ph/0311215 [pdf] - 60795
Spectroscopic Analyses of the Blue Hook Stars in NGC 2808: A More
Stringent Test of the Late Hot Flasher Scenario
Submitted: 2003-11-10
Recent UV observations of the globular cluster NGC2808 (Brown et al. 2001)
show a significant population of hot stars fainter than the zero-age horizontal
branch (``blue hook'' stars), which cannot be explained by canonical stellar
evolution. Their results suggest that stars which experience unusually large
mass loss on the red giant branch and which subsequently undergo the helium
core flash while descending the white dwarf cooling curve could populate this
region. Theory predicts that these ``late hot flashers'' should show higher
temperatures than the hottest canonical horizontal branch (HB) stars and should
have He- and C-rich atmospheres. As a test of this late hot flasher scenario,
we have obtained and analysed medium resolution spectra of a sample of blue
hook stars in NGC2808 to derive their atmospheric parameters. Using the same
procedures, we have also re-analyzed our earlier spectra of the blue hook stars
in omega Cen (Moehler et al. 2002) for comparison with the present results for
NGC2808. The blue hook stars in these two clusters are both hotter (Teff >
35,000 K) and more helium-rich than canonical extreme HB stars in agreement
with the late hot flasher scenario. Moreover, we find indications for C
enhancement in the three most He-enriched stars in NGC2808. However, the blue
hook stars still show some H in their atmospheres, perhaps indicating that some
residual H survives a late hot flash and then later diffuses to the surface
during the HB phase. We note that the presence of blue hook stars apparently
depends mostly on the total mass of the globular cluster and not so much on its
HB morphology.