Normalized to: Nomura, K.
[1]
oai:arXiv.org:1912.10210 [pdf] - 2098767
Pulsar timing residual induced by ultralight vector dark matter
Submitted: 2019-12-21, last modified: 2020-05-20
We study ultralight vector dark matter with a mass around
$10^{-23}\,\mathrm{eV}$. The vector field, oscillating coherently on galactic
scales, induces oscillations of the spacetime metric with a frequency of order
nHz, which are detectable by pulsar timing arrays. We find that the pulsar
timing signal due to vector dark matter has a nontrivial angular dependence,
unlike that of scalar dark matter, and that its maximal amplitude is three times
larger than that of scalar dark matter.
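As a rough check of the nHz figure, the sketch below converts a particle mass into an oscillation frequency. It assumes the metric perturbation oscillates at twice the Compton frequency, $f = 2mc^2/h$, which is what one expects when the stress-energy is quadratic in the oscillating field; this relation and the function name are assumptions for illustration and are not quoted from the abstract.

```python
# Planck constant in eV*s (CODATA 2018).
PLANCK_EV_S = 4.135667696e-15


def metric_oscillation_frequency_hz(mass_ev: float) -> float:
    """Assumed relation f = 2 m c^2 / h, with the mass given as m c^2 in eV."""
    return 2.0 * mass_ev / PLANCK_EV_S


# Mass scale quoted in the abstract.
frequency_hz = metric_oscillation_frequency_hz(1e-23)
print(f"{frequency_hz * 1e9:.2f} nHz")
```

Under this assumption, a mass of $10^{-23}\,\mathrm{eV}$ corresponds to a frequency of a few nHz, consistent with the band quoted above.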
[2]
oai:arXiv.org:1907.02290 [pdf] - 2046222
Accelerated FDPS --- Algorithms to Use Accelerators with FDPS
Submitted: 2019-07-04
In this paper, we describe the algorithms we implemented in FDPS to make
efficient use of accelerator hardware such as GPGPUs. We have developed FDPS to
make it possible for many researchers to develop their own high-performance
parallel particle-based simulation programs without spending a large amount of
time on parallelization and performance tuning. The basic idea of FDPS is to
provide a high-performance implementation of parallel algorithms for
particle-based simulations in a "generic" form, so that researchers can define
their own particle data structure and interparticle interaction functions and
supply them to FDPS. FDPS compiled with user-supplied data types and interaction
functions provides all necessary functions for parallelization, and using those
functions researchers can write their programs as though they were writing
a simple non-parallel program. It has been possible to use accelerators with
FDPS by writing an interaction function that uses the accelerator. However,
the efficiency was limited by the latency and bandwidth of communication
between the CPU and the accelerator and also by the mismatch between the
degree of parallelism available in the interaction function and that of the
hardware. We have modified the interface of the user-provided
interaction function so that accelerators are used more efficiently. We also
implemented new techniques that reduce the amount of work on the CPU side
and the amount of communication between the CPU and accelerators. We have measured the
performance of N-body simulations on a system with an NVIDIA Volta GPGPU using
FDPS, and the achieved performance is around 27\% of the theoretical peak
limit. We have constructed a detailed performance model, and found that the
current implementation can achieve good performance on systems with much
smaller memory and communication bandwidth.
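The "generic" design described in this abstract can be illustrated with a minimal sketch, written here in Python for brevity. It is not the FDPS API: `calc_force_all`, `Body`, and `gravity_1d` are hypothetical names, and the driver simply passes every particle as a source, whereas a real framework would build interaction lists and distribute the work across processes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")  # user-defined particle type


def calc_force_all(particles: Sequence[P],
                   interaction: Callable[[P, Sequence[P]], None]) -> None:
    """Hypothetical framework driver: apply the user kernel to each particle.

    Every interaction list here is the full particle set (direct summation).
    """
    for target in particles:
        interaction(target, particles)


@dataclass
class Body:
    """User-defined particle data structure."""
    mass: float
    x: float
    acc: float = 0.0


def gravity_1d(target: Body, sources: Sequence[Body]) -> None:
    """User-supplied kernel: softened 1D gravity with G = 1."""
    eps2 = 1e-6  # softening length squared
    target.acc = 0.0
    for s in sources:
        dx = s.x - target.x
        # The self term contributes zero because dx is zero.
        target.acc += s.mass * dx / (dx * dx + eps2) ** 1.5


bodies = [Body(mass=1.0, x=0.0), Body(mass=1.0, x=1.0)]
calc_force_all(bodies, gravity_1d)
```

The user code contains no parallelization logic; in this sketch, only the driver would need to change to add it.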
[3]
oai:arXiv.org:1804.08935 [pdf] - 1705276
Fortran interface layer of the framework for developing particle simulator FDPS
Submitted: 2018-04-24, last modified: 2018-04-25
Numerical simulations based on particle methods have been widely used in
various fields including astrophysics. To date, simulation software has been
developed by individual researchers or research groups in each field, at a
huge cost in time and effort, even though the numerical algorithms used are very
similar. To improve the situation, we have developed a framework, called FDPS,
which enables researchers to easily develop massively parallel particle
simulation codes for arbitrary particle methods. Until version 3.0, FDPS
provided an API only for the C++ programming language. This limitation comes from the
fact that FDPS is developed using the template feature of C++, which is
essential to support arbitrary particle data types. However, there are many
researchers who use Fortran to develop their codes. Thus, the previous versions
of FDPS require such people to invest much time in learning C++, which is
inefficient. To cope with this problem, we have developed a Fortran interface
layer in FDPS, which provides an API for Fortran. To support arbitrary
particle data types in Fortran, we design the Fortran interface layer as
follows. Based on a given Fortran derived data type representing a particle, a
Python script that we provide automatically generates a library that manipulates
the C++ core part of FDPS. From the Fortran side, this library appears as a
Fortran module providing the FDPS API, and it uses C programs internally to make
Fortran interoperate with C++. In this way, we have overcome several technical
issues in emulating `template' in Fortran. By using the Fortran interface,
users can develop all parts of their codes in Fortran. We show that the
overhead of the Fortran interface part is sufficiently small and that a code written
in Fortran performs practically identically to one written in C++.
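The code-generation step described in this abstract can be sketched roughly as follows. This is not the script shipped with FDPS: `generate_c_struct` and the type mapping are hypothetical, and the sketch assumes a `bind(c)` derived type with one scalar component per line, declared with ISO_C_BINDING kinds.

```python
import re

# Assumed mapping from Fortran ISO_C_BINDING declarations to C types.
FORTRAN_TO_C = {
    "real(c_double)": "double",
    "real(c_float)": "float",
    "integer(c_int)": "int",
    "integer(c_long_long)": "long long",
}


def generate_c_struct(fortran_source: str) -> str:
    """Emit a C struct mirroring the first bind(c) derived type found.

    Only scalar components are handled; arrays and attributes are not.
    """
    match = re.search(r"type\s*,\s*bind\(c\)\s*::\s*(\w+)(.*?)end\s+type",
                      fortran_source, re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no bind(c) derived type found")
    name, body = match.group(1), match.group(2)
    fields = []
    for line in body.splitlines():
        if not line.strip():
            continue  # skip blank lines inside the type definition
        decl, _, var = line.partition("::")
        ctype = FORTRAN_TO_C[decl.strip().lower().replace(" ", "")]
        fields.append(f"    {ctype} {var.strip()};")
    return "typedef struct {\n" + "\n".join(fields) + f"\n}} {name};"


particle_type = """
type, bind(c) :: particle
   real(c_double) :: mass
   integer(c_int) :: id
end type particle
"""
c_struct = generate_c_struct(particle_type)
print(c_struct)
```

A generator along these lines would emit one matching struct per user-defined particle type, which is one way a layer between Fortran and a templated C++ core could be produced automatically.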