Normalized to: Chalk, A.
[1]
oai:arXiv.org:1606.02738 [pdf] - 1420529
SWIFT: Using task-based parallelism, fully asynchronous communication,
and graph partition-based domain decomposition for strong scaling on more
than 100,000 cores
Submitted: 2016-06-08
We present a new open-source cosmological code, called SWIFT, designed to
solve the equations of hydrodynamics using a particle-based approach (Smoothed
Particle Hydrodynamics) on hybrid shared/distributed-memory architectures.
SWIFT was designed from the bottom up to provide excellent strong scaling on
both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0
systems), without relying on architecture-specific features or specialized
accelerator hardware. This performance is due to three main computational
approaches: (1) Task-based parallelism for shared-memory parallelism, which
provides fine-grained load balancing and thus strong scaling on large numbers
of cores. (2) Graph-based domain decomposition, which uses the task graph to
decompose the simulation domain such that the work, rather than just the data
as in most partitioning schemes, is distributed equally across all nodes.
(3) Fully dynamic and asynchronous communication, in which
communication is modelled as just another task in the task-based scheme,
sending data whenever it is ready and deferring tasks that rely on data from
other nodes until it arrives. In order to use these approaches, the code had to
be re-written from scratch, and the algorithms therein adapted to the
task-based paradigm. As a result, we can show upwards of 60% parallel
efficiency for moderate-sized problems when increasing the number of cores
512-fold, on both x86-based and Power8-based architectures.
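
The idea of treating communication as just another task can be illustrated with a minimal Python sketch. It is not taken from the SWIFT source: the `Task` class, `run_tasks`, `build_example` and all task names are hypothetical, and the scheduler is a single-threaded ready queue, whereas a real code would have many worker threads pulling from such a queue. The point it shows is that a task depending on remote data is simply deferred until the task that receives that data has finished.

```python
from collections import deque


class Task:
    """A unit of work plus the tasks it must wait for (hypothetical helper)."""

    def __init__(self, name, action, depends_on=()):
        self.name = name
        self.action = action
        self.depends_on = list(depends_on)
        self.dependents = []
        # Number of prerequisite tasks that have not finished yet.
        self.unresolved = len(self.depends_on)
        for dep in self.depends_on:
            dep.dependents.append(self)


def run_tasks(tasks):
    """Run each task once all of its prerequisites are done; return the run order."""
    ready = deque(t for t in tasks if t.unresolved == 0)
    order = []
    while ready:
        task = ready.popleft()
        task.action()
        order.append(task.name)
        # Finishing a task may unlock the tasks that were waiting on it.
        for child in task.dependents:
            child.unresolved -= 1
            if child.unresolved == 0:
                ready.append(child)
    if len(order) != len(tasks):
        raise ValueError("dependency cycle or task missing from the list")
    return order


def build_example():
    """Build a tiny graph in which a receive is modelled as an ordinary task."""
    log = []
    inbox = {}
    density_local = Task("density_local", lambda: log.append("local density"))
    # Stand-in for an asynchronous receive: it just fills the inbox.
    recv = Task("recv_neighbour_cell", lambda: inbox.update(cell=[1.0, 2.0]))
    # Needs remote data, so it is deferred until the receive task has run.
    density_remote = Task(
        "density_remote",
        lambda: log.append(sum(inbox["cell"])),
        depends_on=[recv],
    )
    force = Task(
        "force",
        lambda: log.append("force"),
        depends_on=[density_local, density_remote],
    )
    return [force, density_remote, recv, density_local], log
```

Calling `run_tasks` on the list returned by `build_example()` runs the local density task without waiting for the remote one, and the force task only after both density tasks are complete.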
[2]
oai:arXiv.org:1508.00115 [pdf] - 1254422
SWIFT: task-based hydrodynamics and gravity for cosmological simulations
Submitted: 2015-08-01
Simulations of galaxy formation follow the gravitational and hydrodynamical
interactions between gas, stars and dark matter through cosmic time. The huge
dynamic range of such calculations severely limits strong scaling behaviour of
the community codes in use, with load-imbalance, cache inefficiencies and poor
vectorisation limiting performance. The new SWIFT code exploits task-based
parallelism designed for many-core compute nodes interacting via MPI using
asynchronous communication to improve speed and scaling. A graph-based domain
decomposition schedules interdependent tasks over available resources. Strong
scaling tests on realistic particle distributions yield excellent parallel
efficiency, and efficient cache usage provides a large speed-up compared to
current codes even on a single core. SWIFT is designed to be easy to use by
shielding the astronomer from computational details such as the construction of
the tasks or MPI communication. The techniques and algorithms used in SWIFT may
benefit other computational physics areas as well, for example that of
compressible hydrodynamics. For details of this open-source project, see
www.swiftsim.com
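
For readers unfamiliar with the term, parallel efficiency in a strong-scaling test (fixed problem size, increasing core count) is usually computed as the measured speed-up divided by the ideal speed-up. The short Python sketch below shows that arithmetic. The function name and the timings in the usage note are hypothetical and are not measurements from the paper.

```python
def strong_scaling_efficiency(t_ref, cores_ref, t_n, cores_n):
    """Parallel efficiency of a run relative to a reference run of the same problem.

    t_ref, t_n: wall-clock times of the reference run and the larger run.
    cores_ref, cores_n: core counts used by the two runs.
    """
    if min(t_ref, t_n) <= 0 or min(cores_ref, cores_n) <= 0:
        raise ValueError("times and core counts must be positive")
    speedup = t_ref / t_n          # how much faster the larger run is
    ideal = cores_n / cores_ref    # speed-up expected from perfect scaling
    return speedup / ideal
```

As an illustration with made-up numbers, a problem that takes 100 s on 1 core and 2 s on 64 cores has a speed-up of 50 against an ideal of 64, so `strong_scaling_efficiency(100.0, 1, 2.0, 64)` gives 0.78125.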
[3]
oai:arXiv.org:1309.3783 [pdf] - 719509
SWIFT: Fast algorithms for multi-resolution SPH on multi-core
architectures
Submitted: 2013-09-15
This paper describes a novel approach to neighbour-finding in Smoothed
Particle Hydrodynamics (SPH) simulations with large dynamic range in smoothing
length. This approach is based on hierarchical cell decompositions, sorted
interactions, and a task-based formulation. It is shown to be faster than
traditional tree-based codes, and to scale better than domain
decomposition-based approaches on shared-memory parallel architectures such as
multi-cores.
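
As a rough illustration of cell-based neighbour finding, the Python sketch below bins particles into a uniform grid of cells with side `h` and only tests pairs in the same or adjacent cells. This is a deliberately simplified stand-in: it assumes a single search radius `h` and a flat grid, and it does not implement the hierarchical cells or sorted interactions described in the abstract. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def neighbour_pairs_cells(positions, h):
    """Return all index pairs (i, j), i < j, closer than h, using a uniform cell grid."""
    if not positions:
        return set()
    # Bin each particle into the cell of side h that contains it.
    cells = defaultdict(list)
    for i, p in enumerate(positions):
        cells[tuple(math.floor(c / h) for c in p)].append(i)
    dim = len(positions[0])
    offsets = list(product((-1, 0, 1), repeat=dim))
    pairs = set()
    for cell, members in cells.items():
        # Particles closer than h must lie in the same or an adjacent cell.
        for off in offsets:
            other = tuple(c + o for c, o in zip(cell, off))
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and math.dist(positions[i], positions[j]) < h:
                        pairs.add((i, j))
    return pairs


def neighbour_pairs_brute(positions, h):
    """Reference O(N^2) search, used only to check the cell-based result."""
    n = len(positions)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(positions[i], positions[j]) < h
    }
```

Both functions return the same set of pairs. The cell-based version avoids testing particles that are far apart, which is the basic saving that more elaborate schemes build on.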