
2 articles in total. 6 co-authors, each sharing 1 to 2 articles. Median position in the author list is 5.5.

[1] oai:arXiv.org:astro-ph/9710212 - 98971

Smooth Particle Hydrodynamics: Models, Applications, and Enabling Technologies

Hut, Piet; Hernquist, Lars; Lake, George; Makino, Jun; McMillan, Steve; Sterling, Thomas

Comments: 12 pages, TeX, report of a workshop, held on June 18-19, 1997, at the Institute for Advanced Study, Princeton

Submitted: 1997-10-20

We present the results from a two-day study in which we discussed various implementations of Smooth Particle Hydrodynamics (SPH), one of the leading methods used across a variety of areas of large-scale astrophysical simulations. In particular, we evaluated the suitability of designing special hardware extensions, to further boost the performance of the high-end general purpose computers currently used for those simulations. We considered a range of hybrid architectures, consisting of a mix of custom LSI and reconfigurable logic, combining the extremely high throughput of Special-Purpose Devices (SPDs) with the flexibility of reconfigurable structures, based on Field Programmable Gate Arrays (FPGAs). The main findings of our workshop consist of a clarification of the decomposition of the computational requirements, together with specific estimates for cost/performance improvements that can be obtained at each stage in this decomposition, by using enabling hardware technology to accelerate the performance of general purpose computers.

[2] oai:arXiv.org:astro-ph/9704183 - 97145

GRAPE-6: A Petaflops Prototype

Hut, Piet; Arnold, Jeffrey M.; Makino, Junichiro; McMillan, Stephen L. W.; Sterling, Thomas L.

Comments: LaTeX, 16 pages, to appear in the proceedings of the 1997 Petaflops Algorithms Workshop (PAL'97), held on April 13-18, 1997 in Williamsburg, Virginia

Submitted: 1997-04-17

We present the outline of a research project aimed at designing and constructing a hybrid computing system that can be easily scaled up to petaflops speeds. As a first step, we envision building a prototype which will consist of three main components: a general-purpose, programmable front end, a special-purpose, fully hardwired computing engine, and a multi-purpose, reconfigurable system. The driving application will be a suite of particle-based large-scale simulations in various areas of physics. The prototype system will achieve performance in the $\sim 50 - 100$ teraflops range for a broad class of applications in this area. The combination of a hardwired petaflops-class computational engine and a front end with sustained speed on the order of 10 gigaflops can produce extremely high performance, but only for the limited class of problems in which there exists a single bottleneck with computing cost dominating the total. While the calculation for which the Grape-4 (our system's immediate predecessor) was designed is a prime example of such a problem, in many other applications the primary computational bottleneck, while still related to an inverse-square (gravitational, Coulomb, etc.) force, requires less than 99% of the computing power. Although the remainder of the CPU time is typically dominated by just one secondary bottleneck, its nature varies greatly from problem to problem. It is not cost-effective to attempt to design custom chips for each new problem that arises. FPGA-based systems can restore the balance, guaranteeing scalability from the teraflops to the petaflops domain, while still retaining significant flexibility. (abbreviated abstract)
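The bottleneck argument in this abstract can be read as an instance of Amdahl's law. The sketch below is illustrative only and is not taken from the paper. It assumes that the share of computing power a stage needs equals its share of run time on the front end. The 10-gigaflops and petaflops figures come from the abstract. The function name `hybrid_speedup`, the 0.9%/0.1% split of the remaining work, and the FPGA speedup factor of 100 are hypothetical values chosen for illustration.

```python
from typing import Iterable, Tuple


def hybrid_speedup(stages: Iterable[Tuple[float, float]]) -> float:
    """Amdahl-style overall speedup for work split across stages.

    Each stage is (fraction_of_work, speedup_factor), where the factor is
    relative to running that stage on the front end alone.
    """
    stages = list(stages)
    total = sum(fraction for fraction, _ in stages)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("stage fractions must sum to 1")
    if any(factor <= 0 for _, factor in stages):
        raise ValueError("speedup factors must be positive")
    # Run time relative to the front end is the sum of fraction / factor.
    return 1.0 / sum(fraction / factor for fraction, factor in stages)


FRONT_END_FLOPS = 1e10  # "on the order of 10 gigaflops" (from the abstract)
ENGINE_FLOPS = 1e15     # "petaflops-class" hardwired engine (from the abstract)
ENGINE_FACTOR = ENGINE_FLOPS / FRONT_END_FLOPS  # 1e5

# Case A: 99% of the work on the hardwired engine, the rest on the front end.
engine_only = hybrid_speedup([(0.99, ENGINE_FACTOR), (0.01, 1.0)])

# Case B: same, but a secondary bottleneck is moved to reconfigurable logic.
# The 0.9% share and the factor of 100 are HYPOTHETICAL, not from the paper.
with_fpga = hybrid_speedup([(0.99, ENGINE_FACTOR), (0.009, 100.0), (0.001, 1.0)])

for label, speedup in (("engine only", engine_only), ("engine + FPGA", with_fpga)):
    effective = speedup * FRONT_END_FLOPS
    print(f"{label}: speedup {speedup:.1f}x, effective {effective:.3g} flops")
```

Under these assumptions, the engine-only system is capped at roughly 100 times the front-end speed, however fast the hardwired engine is, because the unaccelerated 1% dominates the run time. Accelerating the secondary bottleneck raises that cap, which is the balancing role the abstract assigns to FPGA-based systems.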