Normalized to: Kaehler, R.
[1]
oai:arXiv.org:1612.09491 [pdf] - 1581031
Massively Parallel Computation of Accurate Densities for N-body Dark
Matter Simulations using the Phase-Space-Element Method
Submitted: 2016-12-28, last modified: 2017-08-24
This paper presents an accurate density computation approach for large dark
matter simulations, based on a recently introduced phase-space tessellation
technique and designed for massively parallel, heterogeneous cluster
architectures. We discuss a memory efficient construction of an oct-tree
structure to sample the mass densities with locally adaptive resolution,
according to the features of the underlying tetrahedral tessellation. We
propose an efficient GPU implementation for the computationally intensive
operation of intersecting the tetrahedra with the cubical cells of the deposit
grid that achieves a speedup of almost an order of magnitude compared to an
optimized CPU version. We discuss two dynamic load balancing schemes: the
first exchanges particle data between cluster nodes and deposits all tetrahedra
for each block of the grid structure on single nodes, whereas the second
approach uses global reduction operations to obtain the total masses. We
demonstrate the scalability of our algorithms for up to 256 GPUs and TB-sized
simulation snapshots, resulting in tessellations with over 400 billion
tetrahedra.
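
As a rough illustration of the quantity deposited in [1], the sketch below
splits one cube of the Lagrangian particle lattice into six tetrahedra and
assigns each a density of mass divided by volume. The six-tetrahedra split
along the cube diagonal, the helper names (tetra_volume, cube_to_tetrahedra,
tetra_density) and the use of NumPy on the CPU are assumptions made for
illustration. The abstract does not specify the decomposition, and the
GPU intersection of tetrahedra with grid cells is not reproduced here.

```python
from itertools import permutations
import numpy as np

def tetra_volume(a, b, c, d):
    """Unsigned volume of the tetrahedron with vertices a, b, c, d."""
    return abs(np.linalg.det(np.array([b - a, c - a, d - a]))) / 6.0

def cube_to_tetrahedra(corners):
    """Split one lattice cube into six tetrahedra sharing the main diagonal.

    corners[i, j, k] is the current position of the particle that sat at
    lattice offset (i, j, k) in the initial grid (hypothetical layout).
    """
    tets = []
    for order in permutations(range(3)):
        idx = [0, 0, 0]
        verts = [corners[tuple(idx)]]
        for axis in order:          # walk from corner (0,0,0) to (1,1,1)
            idx[axis] = 1
            verts.append(corners[tuple(idx)])
        tets.append(np.array(verts))
    return tets

def tetra_density(tet, mass):
    """Uniform density of one tetrahedron; diverges if the volume is zero."""
    return mass / tetra_volume(*tet)

# Undeformed unit cube: the six volumes add up to the cube volume.
corners = np.stack(
    np.meshgrid([0.0, 1.0], [0.0, 1.0], [0.0, 1.0], indexing="ij"), axis=-1
)
tets = cube_to_tetrahedra(corners)
total_volume = sum(tetra_volume(*t) for t in tets)
```

In an evolved snapshot the same vertex connectivity would be kept while the
positions move, so tetrahedra can shrink, stretch and overlap.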
[2]
oai:arXiv.org:1210.6652 [pdf] - 1152420
A new approach to simulating collisionless dark matter fluids
Submitted: 2012-10-24, last modified: 2013-06-14
Recently, we have shown how current cosmological N-body codes already follow
the fine grained phase-space information of the dark matter fluid. Using a
tetrahedral tessellation of the three-dimensional manifold that describes
perfectly cold fluids in six-dimensional phase space, the phase-space
distribution function can be followed throughout the simulation. This allows
one to project the distribution function into configuration space to obtain
highly accurate densities, velocities, and velocity dispersions. Here, we
exploit this technique to take first steps toward devising an improved
particle-mesh technique. At its heart, the new method thus relies on a
piecewise linear approximation of the phase space distribution function rather
than the usual particle discretisation. We use pseudo-particles that
approximate the masses of the tetrahedral cells up to quadrupolar order as the
locations for cloud-in-cell (CIC) deposit instead of the particle locations
themselves as in standard CIC deposit. We demonstrate that this modification
already gives much improved stability and more accurate dynamics of the
collisionless dark matter fluid at high force and low mass resolution. We
demonstrate the validity and advantages of this method with various test
problems as well as hot/warm dark matter simulations, which are known to
exhibit artificial fragmentation. This completely unphysical behaviour is much
reduced in the new approach. The current limitations of our approach are
discussed in detail and future improvements are outlined.
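
The deposit idea in [2] can be sketched in one dimension, where the
tetrahedral cells reduce to segments between Lagrangian neighbours. The
example below is a simplified analogue under stated assumptions: it places
a single pseudo-particle at each segment midpoint (monopole order only, not
the quadrupolar order mentioned in the abstract), uses a periodic grid, and
the function names cic_deposit_1d and segment_pseudo_particles are
hypothetical.

```python
import numpy as np

def cic_deposit_1d(positions, masses, n_cells, box_size):
    """Periodic 1D cloud-in-cell deposit; returns mass density per cell."""
    dx = box_size / n_cells
    rho = np.zeros(n_cells)
    # Position in cell units, measured from the centre of cell 0.
    s = np.asarray(positions, dtype=float) / dx - 0.5
    left = np.floor(s).astype(int)
    w_right = s - left                      # linear weight of right cell
    m = np.asarray(masses, dtype=float)
    np.add.at(rho, left % n_cells, m * (1.0 - w_right))
    np.add.at(rho, (left + 1) % n_cells, m * w_right)
    return rho / dx

def segment_pseudo_particles(x, particle_mass):
    """Midpoints of segments between Lagrangian neighbours (1D stand-in).

    x must be ordered by Lagrangian (initial) coordinate, not by position.
    """
    x = np.asarray(x, dtype=float)
    mid = 0.5 * (x[:-1] + x[1:])
    return mid, np.full(mid.shape, particle_mass)

# Standard CIC uses particle positions; the variant uses segment midpoints.
x = np.array([0.5, 1.0, 2.5, 3.5])
mid, m = segment_pseudo_particles(x, 1.0)
rho_tracer = cic_deposit_1d(x, np.ones_like(x), n_cells=4, box_size=4.0)
rho_sheet = cic_deposit_1d(mid, m, n_cells=4, box_size=4.0)
```

Both deposits conserve the mass handed to them; they differ only in where
that mass is placed on the grid.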
[3]
oai:arXiv.org:1212.3333 [pdf] - 1158508
Single-Pass GPU-Raycasting for Structured Adaptive Mesh Refinement Data
Submitted: 2012-12-13
Structured Adaptive Mesh Refinement (SAMR) is a popular numerical technique
to study processes with high spatial and temporal dynamic range. It reduces
computational requirements by adapting the lattice on which the underlying
differential equations are solved to most efficiently represent the solution.
Particularly in astrophysics and cosmology, such simulations can now capture
spatial scales ten orders of magnitude apart and more. The irregular locations
and extents of the refined regions in the SAMR scheme, and the fact that
different resolution levels partially overlap, pose a challenge for GPU-based
direct volume rendering methods. kD-trees have proven advantageous for
subdividing the data domain into non-overlapping blocks of equally sized cells,
optimal for the texture units of current graphics hardware, but previous
GPU-supported raycasting approaches for SAMR data using this data structure
required a separate rendering pass for each node, preventing the application of
many advanced lighting schemes that require simultaneous access to more than
one block of cells. In this paper we present a single-pass GPU-raycasting
algorithm for SAMR data that is based on a kD-tree. The tree is efficiently
encoded by a set of 3D-textures, which allows complete rays to be sampled
adaptively and entirely on the GPU, without any CPU interaction. We discuss two
different data storage strategies to access the grid data on the GPU and apply
them to several datasets to demonstrate the benefits of the proposed method.
[4]
oai:arXiv.org:1208.3206 [pdf] - 1516221
A Novel Approach to Visualizing Dark Matter Simulations
Submitted: 2012-08-15
In recent decades, cosmological N-body dark matter simulations have enabled
ab initio studies of the formation of structure in the Universe. Gravity
amplified small density fluctuations generated shortly after the Big Bang,
leading to the formation of galaxies in the cosmic web. These calculations have
led to a growing demand for methods to analyze time-dependent particle based
simulations. Rendering methods for such N-body simulation data usually employ
some kind of splatting approach via point based rendering primitives and
approximate the spatial distributions of physical quantities using kernel
interpolation techniques, common in Smoothed Particle Hydrodynamics (SPH)
codes. This paper proposes three GPU-assisted rendering
approaches, based on a new, more accurate method to compute the physical
densities of dark matter simulation data. It uses full phase-space information
to generate a tetrahedral tessellation of the computational domain, with mesh
vertices defined by the simulation's dark matter particle positions. Over time
the mesh is deformed by gravitational forces, causing the tetrahedral cells to
warp and overlap. The new methods are well suited to visualize the cosmic web.
In particular, they preserve caustics, regions of high density that emerge when
several streams of dark matter particles share the same location in space,
indicating the formation of structures like sheets, filaments and halos. We
demonstrate the superior image quality of the new approaches in a comparison
with three standard rendering techniques for N-body simulation data.
[5]
oai:arXiv.org:1111.3944 [pdf] - 1091710
Tracing the Dark Matter Sheet in Phase Space
Submitted: 2011-11-16, last modified: 2012-06-25
The primordial velocity dispersion of dark matter is small compared to the
velocities attained during structure formation. The initial density
distribution is close to uniform and it occupies an initial sheet in phase
space that is single valued in velocity space. Because of gravitational forces
this three dimensional manifold evolves in phase space without ever tearing,
conserving phase-space volume and preserving the connectivity of nearby points.
N-body simulations already follow the motion of this sheet in phase space. This
fact can be used to extract full fine-grained phase-space-structure information
from existing cosmological N-body simulations. Particles are considered as the
vertices of an unstructured three dimensional mesh, moving in six dimensional
phase-space. On this mesh, mass density and momentum are uniquely defined. We
show how to obtain the space density of the fluid, detect caustics, and count
the number of streams as well as their individual contributions to any point in
configuration-space. We calculate the bulk velocity, local velocity
dispersions, and densities from the sheet - all without averaging over control
volumes. This gives a wealth of new information about dark matter fluid flow
which was previously thought to be inaccessible to N-body simulations. We
outline how this mapping may be used to create new accurate collisionless fluid
simulation codes that may be able to overcome the sparse sampling and
unphysical two-body effects that plague current N-body techniques.
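
Stream counting and sheet-based densities as described in [5] are easiest
to see in a one-dimensional analogue, where the mesh elements are segments
between Lagrangian neighbours rather than tetrahedra. The sketch below is
such an analogue, not the paper's three-dimensional procedure; the function
name streams_and_density and the half-open coverage test are assumptions.

```python
import numpy as np

def streams_and_density(x_particles, segment_mass, x_eval):
    """Count streams and sum sheet densities at the points x_eval (1D).

    x_particles must be ordered by Lagrangian (initial) coordinate. Each
    segment between neighbours carries segment_mass spread uniformly.
    """
    x = np.asarray(x_particles, dtype=float)
    lo = np.minimum(x[:-1], x[1:])
    hi = np.maximum(x[:-1], x[1:])
    length = hi - lo
    # Zero-length segments (caustics) cover no point; give them no weight.
    safe_length = np.where(length > 0.0, length, np.inf)
    xe = np.atleast_1d(np.asarray(x_eval, dtype=float))[:, None]
    covers = (lo <= xe) & (xe < hi)         # shape: (points, segments)
    n_streams = covers.sum(axis=1)
    density = (covers * (segment_mass / safe_length)).sum(axis=1)
    return n_streams, density

# A sheet that has folded once: positions are no longer monotonic in the
# Lagrangian index, so the middle region is covered by three segments.
x = np.array([0.0, 2.0, 1.0, 3.0])
n_streams, density = streams_and_density(x, 1.0, [0.5, 1.5, 2.5])
# n_streams -> [1, 3, 1], density -> [0.5, 2.0, 0.5]
```

No averaging over a control volume is involved: each evaluation point only
asks which mesh elements cover it.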
[6]
oai:arXiv.org:0910.5547 [pdf] - 902234
Adaptive Mesh Fluid Simulations on GPU
Submitted: 2009-10-28
We describe an implementation of compressible inviscid fluid solvers with
block-structured adaptive mesh refinement on Graphics Processing Units using
NVIDIA's CUDA. We show that a class of high resolution shock capturing schemes
can be mapped naturally on this architecture. Using the method of lines
approach with the second order total variation diminishing Runge-Kutta time
integration scheme, piecewise linear reconstruction, and a Harten-Lax-van Leer
Riemann solver, we achieve an overall speedup of approximately a factor of 10
on one graphics card compared to a single core on the host
computer. We attain this speedup in uniform grid runs as well as in problems
with deep AMR hierarchies. Our framework can readily be applied to more general
systems of conservation laws and extended to higher order shock capturing
schemes. We show this directly by implementing a magneto-hydrodynamic
solver and comparing its performance to the pure hydrodynamic case. Finally, we
also combine our CUDA parallel scheme with MPI to make the code run on GPU
clusters. Close to ideal speedup is observed on up to four GPUs.
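
The combination of ingredients named in [6] (method of lines, second-order
TVD Runge-Kutta, piecewise linear reconstruction, HLL Riemann solver) can be
sketched on a much simpler problem. The example below is an assumption-laden
stand-in: it solves the scalar 1D Burgers equation on a uniform periodic
grid in NumPy instead of the compressible Euler equations with AMR in CUDA,
uses a minmod limiter (the abstract does not name the limiter), and takes
min/max of the interface states as HLL wave speed estimates.

```python
import numpy as np

def minmod(a, b):
    """Limited slope: zero at extrema, else the smaller-magnitude value."""
    return np.where(a * b > 0.0,
                    np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def hll_flux(uL, uR):
    """HLL flux for Burgers' equation, f(u) = u^2 / 2."""
    fL, fR = 0.5 * uL**2, 0.5 * uR**2
    sL, sR = np.minimum(uL, uR), np.maximum(uL, uR)
    denom = np.where(sR > sL, sR - sL, 1.0)   # avoid 0/0; branch unused then
    middle = (sR * fL - sL * fR + sL * sR * (uR - uL)) / denom
    return np.where(sL >= 0.0, fL, np.where(sR <= 0.0, fR, middle))

def rhs(u, dx):
    """Semi-discrete right-hand side with periodic boundaries."""
    slope = minmod(u - np.roll(u, 1), np.roll(u, -1) - u)
    uL = u + 0.5 * slope                      # left state at i+1/2
    uR = np.roll(u - 0.5 * slope, -1)         # right state at i+1/2
    flux = hll_flux(uL, uR)
    return -(flux - np.roll(flux, 1)) / dx

def step_tvd_rk2(u, dx, dt):
    """One step of the second-order TVD (SSP) Runge-Kutta scheme."""
    u1 = u + dt * rhs(u, dx)
    return 0.5 * u + 0.5 * (u1 + dt * rhs(u1, dx))

# Smooth initial data that steepens into a shock.
n = 64
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx
u0 = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
dt = 0.4 * dx / np.abs(u0).max()
u = u0.copy()
for _ in range(50):
    u = step_tvd_rk2(u, dx, dt)
```

Every cell update depends only on a small fixed stencil of neighbours, which
is the property that lets such schemes map onto one GPU thread per cell.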