Full-text search for arXiv

16 article(s) in total. 23 co-authors, from 1 to 14 common article(s). Median position in authors list is 1.5.

[1] oai:arXiv.org:2006.16560 [pdf] - 2125009

PeTar: a high-performance N-body code for modeling massive collisional stellar systems

Wang, Long; Iwasawa, Masaki; Nitadori, Keigo; Makino, Junichiro

Comments: 20 pages, 17 figures, accepted for MNRAS

Submitted: 2020-06-30

Numerical simulations of massive collisional stellar systems, such as globular clusters (GCs), are very time-consuming. Until now, only a few realistic million-body simulations of GCs with a small fraction of binaries (5%) have been performed, using the NBODY6++GPU code. Such models took half a year of computational time on a GPU-based supercomputer. In this work, we develop a new N-body code, PeTar, by combining the Barnes-Hut tree, the Hermite integrator and slow-down algorithmic regularization (SDAR). The code can accurately handle an arbitrary fraction of multiple systems (e.g. binaries, triples) while maintaining high performance through hybrid parallelization with MPI, OpenMP, SIMD instructions and GPU. A few benchmarks indicate that PeTar and NBODY6++GPU agree very well on the long-term evolution of the global structure, binary orbits and escapers. On a highly configured GPU desktop computer, a million-body simulation with all stars in binaries runs 11 times faster with PeTar than with NBODY6++GPU. Moreover, on the Cray XC50 supercomputer, PeTar scales well as the number of cores increases. The ten-million-body problem, which covers the region of ultra-compact dwarfs and nuclear star clusters, becomes solvable.

[2] oai:arXiv.org:1907.02290 [pdf] - 2046222

Accelerated FDPS --- Algorithms to Use Accelerators with FDPS

Iwasawa, Masaki; Namekata, Daisuke; Nitadori, Keigo; Nomura, Kentaro; Wang, Long; Tsubouchi, Miyuki; Makino, Junichiro

Comments:

Submitted: 2019-07-04

In this paper, we describe the algorithms we implemented in FDPS to make efficient use of accelerator hardware such as GPGPUs. We have developed FDPS to make it possible for many researchers to develop their own high-performance parallel particle-based simulation programs without spending a large amount of time on parallelization and performance tuning. The basic idea of FDPS is to provide a high-performance implementation of parallel algorithms for particle-based simulations in a "generic" form, so that researchers can define their own particle data structure and interparticle interaction functions and supply them to FDPS. FDPS compiled with a user-supplied data type and interaction function provides all necessary functions for parallelization, and using those functions researchers can write their programs as though they were writing a simple non-parallel program. It has been possible to use accelerators with FDPS by writing an interaction function that uses the accelerator. However, the efficiency was limited by the latency and bandwidth of communication between the CPU and the accelerator, and also by the mismatch between the degree of parallelism available in the interaction function and that of the hardware. We have modified the interface of the user-provided interaction function so that accelerators are used more efficiently. We also implemented new techniques which reduce the amount of work on the CPU side and the amount of communication between the CPU and accelerators. We have measured the performance of N-body simulations on a system with an NVIDIA Volta GPGPU using FDPS, and the achieved performance is around 27% of the theoretical peak. We have constructed a detailed performance model, and found that the current implementation can achieve good performance on systems with much smaller memory and communication bandwidth.

[3] oai:arXiv.org:1907.02289 [pdf] - 1910783

Implementation and Performance of Barnes-Hut N-body algorithm on Extreme-scale Heterogeneous Many-core Architectures

Iwasawa, Masaki; Namekata, Daisuke; Sakamoto, Ryo; Nakamura, Takashi; Kimura, Yasuyuki; Nitadori, Keigo; Wang, Long; Tsubouchi, Miyuki; Makino, Jun; Liu, Zhao; Fu, Haohuan; Yang, Guangwen

Comments:

Submitted: 2019-07-04

In this paper, we report the implementation and measured performance of our extreme-scale global simulation code on Sunway TaihuLight and two PEZY-SC2 systems: Shoubu System B and Gyoukou. The numerical algorithm is the parallel Barnes-Hut tree algorithm, which has been used in many large-scale astrophysical particle-based simulations. Our implementation is based on our FDPS framework. However, the extremely large numbers of cores of the systems used (10M on TaihuLight and 16M on Gyoukou) and their relatively poor memory and network bandwidth pose new challenges. We describe the new algorithms introduced to achieve high efficiency on machines with low memory bandwidth. The measured performance is 47.9 PF, 10.6 PF, and 1.01 PF on TaihuLight, Gyoukou and Shoubu System B, respectively (efficiency 40%, 23.5% and 35.5%). The current code is developed for the simulation of planetary rings, but most of the new algorithms are useful for other simulations, and are now available in the FDPS framework.

[4] oai:arXiv.org:1903.03138 [pdf] - 1868141

A Mean-Field Approach to Simulating the Merging of Collisionless Stellar Systems Using a Particle-Based Method

Hozumi, Shunsuke; Iwasawa, Masaki; Nitadori, Keigo

Comments: 16 pages, 13 figures (14 figure files), accepted for publication in ApJ

Submitted: 2019-03-07

We present a mean-field approach to simulating merging processes of two spherical collisionless stellar systems. This approach is realized with a self-consistent field (SCF) method in which the full spatial dependence of the density and potential of a system is expanded in a set of basis functions for solving Poisson's equation. In order to apply this SCF method to a merging situation where two systems are moving in space, we assign the expansion center to the center of mass of each system, the position of which is followed by a massless particle placed at that position initially. Merging simulations over a wide range of impact parameters are performed using both an SCF code developed here and a tree code. The results of each simulation produced by the two codes show excellent agreement in the evolving morphology of the merging systems and in the density and velocity dispersion profiles of the merged systems. However, comparing the results generated by the tree code to those obtained with the softening-free SCF code, we have found that in large impact parameter cases, a softening length of the Plummer type introduced in the tree code has the effect of advancing the orbital phase of the two systems in the merging process at late times. We demonstrate that the faster orbital phase originates from the larger convergence length to the pure Newtonian force. Other application problems suitable to the current SCF code are also discussed.

[5] oai:arXiv.org:1810.11970 [pdf] - 1774889

PENTACLE: Parallelized Particle-Particle Particle-Tree Code for Planet Formation

Iwasawa, Masaki; Oshino, Shoichi; Fujii, Michiko S.; Hori, Yasunori

Comments: 12 pages, 14 figures, published in PASJ

Submitted: 2018-10-29

We have newly developed a Parallelized Particle-Particle Particle-tree code for Planet formation, PENTACLE, which is a parallelized hybrid $N$-body integrator executed on a CPU-based (super)computer. PENTACLE uses a 4th-order Hermite algorithm to calculate gravitational interactions between particles within a cutoff radius and a Barnes-Hut tree method for gravity from particles beyond. It also uses an open-source library designed for fully automatic parallelization of particle simulations, FDPS (Framework for Developing Particle Simulator), to parallelize the Barnes-Hut tree algorithm for a distributed-memory supercomputer. These allow us to handle $1-10$ million particles in a high-resolution $N$-body simulation on CPU clusters for collisional dynamics, including physical collisions in a planetesimal disc. In this paper, we show the performance and the accuracy of PENTACLE in terms of $\tilde{R}_{\rm cut}$ and a time-step $\Delta t$. It turns out that the accuracy of a hybrid $N$-body simulation is controlled through $\Delta t / \tilde{R}_{\rm cut}$, and $\Delta t / \tilde{R}_{\rm cut} \sim 0.1$ is necessary to accurately simulate the accretion process of a planet for $\geq 10^6$ years. For all those who are interested in large-scale particle simulations, PENTACLE customized for planet formation will be freely available from https://github.com/PENTACLE-Team/PENTACLE under the MIT license.
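
As a rough illustration of the 4th-order Hermite algorithm named in the abstract above, the following minimal sketch integrates one test particle around a fixed unit point mass (G = M = 1) with a predictor-corrector step. It is not taken from PENTACLE: the function names `acc_jerk`, `hermite_step` and `relative_energy_error`, the fixed shared timestep, and the single-particle setup are all assumptions made for brevity.

```python
import math


def acc_jerk(r, v):
    """Acceleration and jerk due to a fixed unit point mass at the origin."""
    r2 = sum(x * x for x in r)
    r3 = r2 * math.sqrt(r2)
    rv = sum(x * u for x, u in zip(r, v))
    acc = [-x / r3 for x in r]
    # d/dt(-r/|r|^3) = -v/|r|^3 + 3 (r.v) r / |r|^5
    jerk = [-(u / r3 - 3.0 * rv * x / (r3 * r2)) for x, u in zip(r, v)]
    return acc, jerk


def hermite_step(r, v, dt):
    """One predict-evaluate-correct step of the 4th-order Hermite scheme."""
    a0, j0 = acc_jerk(r, v)
    # Predictor: Taylor expansion using acceleration and jerk.
    rp = [x + u * dt + a * dt**2 / 2 + j * dt**3 / 6
          for x, u, a, j in zip(r, v, a0, j0)]
    vp = [u + a * dt + j * dt**2 / 2 for u, a, j in zip(v, a0, j0)]
    # Evaluate the force and its derivative at the predicted state.
    a1, j1 = acc_jerk(rp, vp)
    # Corrector: velocity first, then position using the corrected velocity.
    v1 = [u + (a + b) * dt / 2 + (j - k) * dt**2 / 12
          for u, a, b, j, k in zip(v, a0, a1, j0, j1)]
    r1 = [x + (u + w) * dt / 2 + (a - b) * dt**2 / 12
          for x, u, w, a, b in zip(r, v, v1, a0, a1)]
    return r1, v1


def energy(r, v):
    """Specific orbital energy of the test particle."""
    return 0.5 * sum(u * u for u in v) - 1.0 / math.sqrt(sum(x * x for x in r))


def relative_energy_error(steps=1000, dt=0.01):
    """Integrate a circular orbit and return the relative energy drift."""
    r, v = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
    e0 = energy(r, v)
    for _ in range(steps):
        r, v = hermite_step(r, v, dt)
    return abs((energy(r, v) - e0) / e0)
```

Calling `relative_energy_error()` gives a quick check that the energy drift stays small for the chosen timestep. A production code would use individual or block timesteps and sum the force over neighbours inside the cutoff radius rather than over a single central mass.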

[6] oai:arXiv.org:1804.08935 [pdf] - 1705276

Fortran interface layer of the framework for developing particle simulator FDPS

Namekata, Daisuke; Iwasawa, Masaki; Nitadori, Keigo; Tanikawa, Ataru; Muranushi, Takayuki; Wang, Long; Hosono, Natsuki; Nomura, Kentaro; Makino, Junichiro

Comments: 10 pages, 10 figures; accepted for publication in PASJ; a typo in author name is corrected

Submitted: 2018-04-24, last modified: 2018-04-25

Numerical simulations based on particle methods have been widely used in various fields including astrophysics. To date, simulation software has been developed by individual researchers or research groups in each field, with a huge amount of time and effort, even though the numerical algorithms used are very similar. To improve the situation, we have developed a framework, called FDPS, which enables researchers to easily develop massively parallel particle simulation codes for arbitrary particle methods. Until version 3.0, FDPS provided an API only for the C++ programming language. This limitation comes from the fact that FDPS is developed using the template feature of C++, which is essential to support arbitrary particle data types. However, there are many researchers who use Fortran to develop their codes. Thus, the previous versions of FDPS required such people to invest much time in learning C++, which is inefficient. To cope with this problem, we newly developed a Fortran interface layer in FDPS, which provides an API for Fortran. In order to support arbitrary particle data types in Fortran, we design the Fortran interface layer as follows. Based on a given Fortran derived data type representing a particle, a Python script provided by us automatically generates a library that manipulates the C++ core part of FDPS. This library is seen as a Fortran module providing the FDPS API from the Fortran side and uses C programs internally to interoperate Fortran with C++. In this way, we have overcome several technical issues in emulating "template" in Fortran. By using the Fortran interface, users can develop all parts of their codes in Fortran. We show that the overhead of the Fortran interface part is sufficiently small and that a code written in Fortran shows performance practically identical to that of one written in C++.

[7] oai:arXiv.org:1612.06984 [pdf] - 1580972

Unconvergence of Very Large Scale GI Simulations

Hosono, Natsuki; Iwasawa, Masaki; Tanikawa, Ataru; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro

Comments: Accepted to PASJ, an animation is available at https://vimeo.com/194156367

Submitted: 2016-12-21

The giant impact (GI) is one of the most important hypotheses both in planetary science and geoscience, since it is related to the origin of the Moon and also the initial condition of the Earth. A number of numerical simulations have been done using the smoothed particle hydrodynamics (SPH) method. However, the GI hypothesis is currently in a crisis. The "canonical" GI scenario failed to explain the identical isotope ratio between the Earth and the Moon. On the other hand, little has been known about the reliability of the results of GI simulations. In this paper, we discuss the effect of the resolution on the results of the GI simulations by varying the number of particles from $3 \times 10^3$ to $10^8$. We found that the results do not converge, but show oscillatory behaviour. We discuss the origin of this oscillatory behaviour.

[8] oai:arXiv.org:1601.03138 [pdf] - 1422207

Implementation and performance of FDPS: A Framework Developing Parallel Particle Simulation Codes

Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro

Comments: 22 pages, 27 figures, accepted for publication in PASJ. The FDPS package is available at https://github.com/fdps/fdps

Submitted: 2016-01-13, last modified: 2016-04-24

We present the basic idea, implementation, measured performance and performance model of FDPS (Framework for developing particle simulators). FDPS is an application-development framework which helps researchers develop particle-based simulation programs for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, redistribution of particles, and gathering of particle information for interaction calculation. Also, even if distributed-memory parallel computers are not used, algorithms such as the Barnes-Hut tree method should be used for long-range interactions in order to reduce the amount of computation. For short-range interactions, some method to limit the calculation to neighbor particles is necessary. FDPS provides all of these necessary functions for efficient parallel execution of particle-based simulations as "templates", which are independent of the actual data structure of particles and the functional form of the interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speedup was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N=10^7) to 300 ms (N=10^9). These are currently limited by the time for the calculation of the domain decomposition and the communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
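
To make concrete what a "simple, sequential and unoptimized program of O(N^2) calculation cost" looks like, here is a minimal sketch of a direct-summation gravity kernel in Python. It does not use FDPS or its API; the function name `direct_accelerations`, the softening parameter `eps`, and the choice G = 1 are assumptions for illustration only.

```python
import math


def direct_accelerations(pos, mass, eps=0.0):
    """Gravitational accelerations by direct O(N^2) summation (G = 1).

    pos  : list of [x, y, z] positions
    mass : list of particle masses
    eps  : Plummer softening length (0 gives the pure Newtonian force)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-interaction
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(x * x for x in d) + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += mass[j] * d[k] * inv_r3
    return acc
```

For two particles of mass 1 and 2 separated by a distance of 2 along x, `direct_accelerations([[0, 0, 0], [2, 0, 0]], [1.0, 2.0])` returns accelerations of 0.5 and -0.25 along x, so the total momentum change is zero. The double loop is the part that a tree method replaces for long-range interactions.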

[9] oai:arXiv.org:1506.04553 [pdf] - 1176045

GPU-Enabled Particle-Particle Particle-Tree Scheme for Simulating Dense Stellar Cluster System

Iwasawa, Masaki; Zwart, Simon Portegies; Makino, Junichiro

Comments:

Submitted: 2015-06-15

We describe the implementation and performance of the ${\rm P^3T}$ (Particle-Particle Particle-Tree) scheme for simulating dense stellar systems. In ${\rm P^3T}$, the force experienced by a particle is split into short-range and long-range contributions. Short-range forces are evaluated by direct summation and integrated with the fourth-order Hermite predictor-corrector method with block timesteps. For long-range forces, we use a combination of the Barnes-Hut tree code and the leapfrog integrator. The tree part of our simulation environment is accelerated using graphical processing units (GPU), whereas the direct summation is carried out on the host CPU. Our code gives excellent performance and accuracy for star cluster simulations with a large number of particles even when the core size of the star cluster is small.
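
The split of the force into short-range and long-range contributions described above can be sketched with a smooth switching function. The sketch below is only an illustration under stated assumptions: the cubic smoothstep used in `switch`, the radii `r_in` and `r_cut`, and the function names are hypothetical and are not the cutoff function of the paper.

```python
def switch(r, r_in, r_cut):
    """Smooth switch: 0 for r <= r_in, 1 for r >= r_cut (cubic smoothstep)."""
    if r <= r_in:
        return 0.0
    if r >= r_cut:
        return 1.0
    x = (r - r_in) / (r_cut - r_in)
    return x * x * (3.0 - 2.0 * x)


def split_force(r, r_in=0.1, r_cut=1.0):
    """Split the 1/r^2 attraction between two unit masses (G = 1).

    Returns (short_range, long_range) force magnitudes whose sum is the
    full Newtonian force at every separation r.
    """
    total = 1.0 / (r * r)
    k = switch(r, r_in, r_cut)
    return (1.0 - k) * total, k * total
```

With this kind of split, the short-range part vanishes beyond `r_cut`, so it only needs to be summed over nearby neighbours with an accurate integrator, while the smooth long-range part can be handed to a tree code with a larger timestep.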

[10] oai:arXiv.org:1011.4017 [pdf] - 1042028

Eccentric evolution of SMBH binaries

Iwasawa, Masaki; An, Sangyong; Matsubayashi, Tatsushi; Funato, Yoko; Makino, Junichiro

Comments: 10 pages, 5 figures

Submitted: 2010-11-17

In recent numerical simulations \citep{matsubayashi07,lockmann08}, it has been found that the eccentricity of supermassive black hole (SMBH) - intermediate-mass black hole (IMBH) binaries grows toward unity through interactions with the stellar background. This increase of eccentricity reduces the merging timescale of the binary through gravitational radiation to a value well below the Hubble time. It also gives a theoretical explanation for the existence of eccentric binaries such as that in OJ287 \citep{lehto96, valtonen08}. In self-consistent $N$-body simulations, this increase of eccentricity is always observed. On the other hand, the results of scattering experiments between SMBH binaries and field stars \citep{quinlan96} indicated no increase of eccentricity. This discrepancy leaves the high eccentricity of the SMBH binaries in $N$-body simulations unexplained. Here we present a stellar-dynamical mechanism that drives the increase of the eccentricity of an SMBH binary with a large mass ratio. There are two key processes involved. The first is the Kozai mechanism under a non-axisymmetric potential, which effectively randomizes the angular momenta of surrounding stars. The other is the selective ejection of stars with prograde orbits. Through these two mechanisms, field stars extract the orbital angular momentum of the SMBH binary. Our proposed mechanism causes the increase in the eccentricity of most SMBH binaries, resulting in rapid merger through gravitational wave radiation. Our result gives a definite solution to the "last-parsec problem".

[11] oai:arXiv.org:1003.4125 [pdf] - 1025850

The origin of S-stars and a young stellar disk: distribution of debris stars of a sinking star cluster

Fujii, Michiko; Iwasawa, Masaki; Funato, Yoko; Makino, Junichiro

Comments: 10 pages, 5 figures, accepted for ApJL

Submitted: 2010-03-22, last modified: 2010-05-21

Within a distance of 1 pc from the Galactic center (GC), more than 100 young massive stars have been found. The massive stars at 0.1-1 pc from the GC are located in one or two disks, while those within 0.1 pc from the GC, the S-stars, have an isotropic distribution. How these stars formed is not well understood, especially for the S-stars. Here we propose that a young star cluster with an intermediate-mass black hole (IMBH) can form both the disks and the S-stars. We performed a fully self-consistent $N$-body simulation of a star cluster near the GC. Stars that escaped from the tidally disrupted star cluster were carried to the GC due to a 1:1 mean motion resonance with the IMBH formed in the cluster. In the final phase of the evolution, the eccentricity of the IMBH becomes very high. In this phase, stars carried by the 1:1 resonance with the IMBH were dropped from the resonance and their orbits were randomized by a chaotic Kozai mechanism. The mass function of these carried stars is extremely top-heavy within 10". The surface density distribution of young massive stars has a slope of -1.5 within 10" from the GC. The distribution of stars in the most central region is isotropic. These characteristics agree well with those of stars observed within 10" from the GC.

[12] oai:arXiv.org:0807.2818 [pdf] - 314996

Trojan Stars in the Galactic Center

Fujii, Michiko; Iwasawa, Masaki; Funato, Yoko; Makino, Junichiro

Comments: 17 pages, 9 figures, accepted for publication in ApJ

Submitted: 2008-07-17, last modified: 2009-01-15

We performed, for the first time, a simulation of the spiral-in of a star cluster formed close to the Galactic center (GC) using a fully self-consistent $N$-body model. In our model, the central super-massive black hole (SMBH) is surrounded by stars and the star cluster. Not only are the orbits of stars and the cluster stars integrated self-consistently, but the stellar evolution, collisions and merging of the cluster stars are also included. We found that an intermediate-mass black hole (IMBH) is formed in the star cluster and that stars that escaped from the cluster are captured into a 1:1 mean motion resonance with the IMBH. These "Trojan" stars are brought close to the SMBH by the IMBH, which spirals into the GC due to dynamical friction. Our results show that, once the IMBH is formed, it brings the massive stars to the vicinity of the central SMBH even after the star cluster itself is disrupted. Stars carried by the IMBH form a disk similar to the observed disks, and the core of the cluster including the IMBH has properties similar to those of IRS13E, which is a compact assembly of several young stars.

[13] oai:arXiv.org:0708.3719 [pdf] - 4313

Evolution of Star Clusters near the Galactic Center: Fully Self-consistent N-body Simulations

Fujii, M.; Iwasawa, M.; Funato, Y.; Makino, J.

Comments: 19 pages, 19 figures, accepted for publication in ApJ

Submitted: 2007-08-28, last modified: 2008-07-08

We have performed fully self-consistent $N$-body simulations of star clusters near the Galactic center (GC). Such simulations have not been performed before because it is difficult to perform fast and accurate simulations of such systems using conventional methods. We used the Bridge code, which integrates the parent galaxy using the tree algorithm and the star cluster using the fourth-order Hermite scheme with individual timesteps. The interaction between the parent galaxy and the star cluster is calculated with the tree algorithm. Therefore, the Bridge code can handle both the orbital and internal evolution of star clusters correctly at the same time. We investigated the evolution of star clusters using the Bridge code and compared the results with previous studies. We found that 1) the inspiral timescale of the star clusters is shorter than that obtained with "traditional" simulations, in which the orbital evolution of star clusters is calculated analytically using the dynamical friction formula, and 2) the core collapse of the star cluster increases the core density and helps the cluster survive. The initial conditions of star clusters are not as severe as previously suggested.

[14] oai:arXiv.org:0801.0859 [pdf] - 8675

Evolution of Massive Blackhole Triples II -- The effect of the BH triples dynamics on the structure of the galactic nuclei

Iwasawa, Masaki; Funato, Yoko; Makino, Junichiro

Comments: Submitted to ApJ

Submitted: 2008-01-06

In this paper, we investigate the structures of galaxies which either have or have had three BHs using $N$-body simulations, and compare them with those of galaxies with binary BHs. We found that the cusp region of a galaxy which has (or had) triple BHs is significantly larger and less dense than that of a galaxy with binary BHs of the same mass. Moreover, the size of the cusp region depends strongly on the evolution history of the triple BHs, while in the case of binary BHs, the size of the cusp is determined by the mass of the BHs. In galaxies which have (or had) three BHs, there is a region with significant radial velocity anisotropy, while such a region is not observed in galaxies with binary BHs. These differences come from the fact that with triple BHs the energy deposited in the central region of the galaxy can be much larger due to multiple binary-single BH scatterings. Our result suggests that we can discriminate galaxies which experienced triple BH interactions from those which did not, through observable signatures such as the cusp size and velocity anisotropy.

[15] oai:arXiv.org:0706.2059 [pdf] - 1000421

BRIDGE: A Direct-tree Hybrid N-body Algorithm for Fully Self-consistent Simulations of Star Clusters and their Parent Galaxies

Fujii, M.; Iwasawa, M.; Funato, Y.; Makino, J.

Comments: 12 pages, 13 figures, Accepted for PASJ

Submitted: 2007-06-14, last modified: 2007-07-27

We developed a new direct-tree hybrid N-body algorithm for fully self-consistent N-body simulations of star clusters in their parent galaxies. In such simulations, star clusters need high accuracy, while galaxies need a fast scheme because of the large number of particles required to model them. In our new algorithm, the internal motion of the star cluster is calculated accurately using the direct Hermite scheme with individual timesteps, and all other motions are calculated using the tree code with a second-order leapfrog integrator. The direct and tree schemes are combined using an extension of the mixed variable symplectic (MVS) scheme. Thus, the Hamiltonian corresponding to everything other than the internal motion of the star cluster is integrated with the leapfrog, which is symplectic. Using this algorithm, we performed fully self-consistent N-body simulations of star clusters in their parent galaxy. The internal and orbital evolution of the star cluster agreed well with those obtained using the direct scheme. We also performed a fully self-consistent N-body simulation for large-N models ($N=2\times 10^6$). In this case, the calculation was seven times faster than it would have been if the direct scheme were used.
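
The second-order leapfrog integrator mentioned above can be sketched in its kick-drift-kick form. This is a minimal illustration for a single test particle around a fixed unit point mass (G = M = 1), not the BRIDGE scheme itself, which additionally nests a Hermite integration of the cluster's internal motion between the kicks. The names `kepler_acc`, `kdk_step` and `leapfrog_energy_error` are assumptions.

```python
import math


def kepler_acc(r):
    """Acceleration from a fixed unit point mass at the origin (G = M = 1)."""
    r2 = sum(x * x for x in r)
    r3 = r2 * math.sqrt(r2)
    return [-x / r3 for x in r]


def kdk_step(r, v, dt):
    """One kick-drift-kick leapfrog step (second order, symplectic)."""
    a = kepler_acc(r)
    v_half = [u + 0.5 * dt * ai for u, ai in zip(v, a)]   # half kick
    r_new = [x + dt * u for x, u in zip(r, v_half)]       # full drift
    a_new = kepler_acc(r_new)
    v_new = [u + 0.5 * dt * ai for u, ai in zip(v_half, a_new)]  # half kick
    return r_new, v_new


def leapfrog_energy_error(steps=2000, dt=0.01):
    """Largest relative energy error seen along a circular orbit."""
    r, v = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]

    def energy(r, v):
        return 0.5 * sum(u * u for u in v) - 1.0 / math.sqrt(sum(x * x for x in r))

    e0 = energy(r, v)
    worst = 0.0
    for _ in range(steps):
        r, v = kdk_step(r, v, dt)
        worst = max(worst, abs((energy(r, v) - e0) / e0))
    return worst
```

Because the scheme is symplectic, the energy error stays bounded over many orbits instead of growing steadily, which is the property the abstract relies on for everything other than the cluster's internal motion.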

[16] oai:arXiv.org:astro-ph/0511391 [pdf] - 77749

Evolution of Massive Blackhole Triples I -- Equal-mass binary-single systems

Iwasawa, Masaki; Funato, Yoko; Makino, Junichiro

Comments: 20 pages, 12 figures

Submitted: 2005-11-14

We present the results of $N$-body simulations of the dynamical evolution of triple massive blackhole (BH) systems in galactic nuclei. We found that in most cases two of the three BHs merge through gravitational wave (GW) radiation on a timescale much shorter than the Hubble time, before ejecting one BH through a slingshot. In order for a binary BH to merge before ejecting the third one, it has to become highly eccentric, since the gravitational wave timescale would be much longer than the Hubble time unless the eccentricity is very high. We found that two mechanisms drive the increase of the eccentricity of the binary. One is the strong binary-single BH interaction resulting in the thermalization of the eccentricity. The second is the Kozai mechanism, which drives the cyclic change of the inclination and eccentricity of the inner binary of a stable hierarchical triple system. Our result implies that many supermassive blackholes are binaries.
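
The dependence of the gravitational wave timescale on eccentricity noted above can be illustrated with the standard formulas of Peters (1964): the circular-orbit merger time is a^4 / (4 beta) with beta = (64/5) G^3 m1 m2 (m1 + m2) / c^5, and for eccentricity close to 1 it is shortened by roughly (768/425) (1 - e^2)^(7/2). The sketch below is an order-of-magnitude illustration only; the masses (1e8 solar masses each), the 1 pc separation, the rounded constants and the function names are assumptions, not values from the paper.

```python
# Rounded physical constants in SI units (assumed values for illustration).
G = 6.674e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8          # speed of light [m/s]
MSUN = 1.989e30      # solar mass [kg]
PC = 3.086e16        # parsec [m]
YR = 3.156e7         # year [s]
HUBBLE_TIME_YR = 1.4e10  # approximate Hubble time [yr]


def circular_merger_time_yr(m1, m2, a):
    """Peters (1964) merger time of a circular binary, in years."""
    beta = (64.0 / 5.0) * G**3 * m1 * m2 * (m1 + m2) / C**5
    return a**4 / (4.0 * beta) / YR


def eccentric_merger_time_yr(m1, m2, a, e):
    """High-eccentricity approximation (only meaningful for e close to 1)."""
    factor = (768.0 / 425.0) * (1.0 - e * e) ** 3.5
    return circular_merger_time_yr(m1, m2, a) * factor


# Hypothetical example: two 1e8 solar-mass BHs with a 1 pc semi-major axis.
m = 1.0e8 * MSUN
t_circ = circular_merger_time_yr(m, m, PC)
t_ecc = eccentric_merger_time_yr(m, m, PC, 0.99)
```

For these assumed parameters the circular binary takes far longer than the Hubble time to merge, while the same binary at e = 0.99 merges well within it, which is why a mechanism that raises the eccentricity matters.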