Sterling, Thomas L.

Normalized to: Sterling, T.

2 articles in total. 6 co-authors, each sharing 1 to 2 articles. Median position in the author list is 5.5.

[1]  oai:arXiv.org:astro-ph/9710212 - 98971
Smooth Particle Hydrodynamics: Models, Applications, and Enabling Technologies
Comments: 12 pages, TeX, report of a workshop, held on June 18-19, 1997, at the Institute for Advanced Study, Princeton
Submitted: 1997-10-20
We present the results from a two-day study in which we discussed various implementations of Smooth Particle Hydrodynamics (SPH), one of the leading methods used across a variety of areas of large-scale astrophysical simulations. In particular, we evaluated the suitability of designing special hardware extensions, to further boost the performance of the high-end general purpose computers currently used for those simulations. We considered a range of hybrid architectures, consisting of a mix of custom LSI and reconfigurable logic, combining the extremely high throughput of Special-Purpose Devices (SPDs) with the flexibility of reconfigurable structures, based on Field Programmable Gate Arrays (FPGAs). The main findings of our workshop consist of a clarification of the decomposition of the computational requirements, together with specific estimates for cost/performance improvements that can be obtained at each stage in this decomposition, by using enabling hardware technology to accelerate the performance of general purpose computers.
[2]  oai:arXiv.org:astro-ph/9704183 - 97145
GRAPE-6: A Petaflops Prototype
Comments: LaTeX, 16 pages, to appear in the proceedings of the 1997 Petaflops Algorithms Workshop (PAL'97), held on April 13-18, 1997 in Williamsburg, Virginia
Submitted: 1997-04-17
We present the outline of a research project aimed at designing and constructing a hybrid computing system that can be easily scaled up to petaflops speeds. As a first step, we envision building a prototype which will consist of three main components: a general-purpose, programmable front end, a special-purpose, fully hardwired computing engine, and a multi-purpose, reconfigurable system. The driving application will be a suite of particle-based large-scale simulations in various areas of physics. The prototype system will achieve performance in the $\sim 50 - 100$ teraflops range for a broad class of applications in this area. The combination of a hardwired petaflops-class computational engine and a front end with sustained speed on the order of 10 gigaflops can produce extremely high performance, but only for the limited class of problems in which there exists a single bottleneck with computing cost dominating the total. While the calculation for which the Grape-4 (our system's immediate predecessor) was designed is a prime example of such a problem, in many other applications the primary computational bottleneck, while still related to an inverse-square (gravitational, Coulomb, etc.) force, requires less than 99% of the computing power. Although the remainder of the CPU time is typically dominated by just one secondary bottleneck, its nature varies greatly from problem to problem. It is not cost-effective to attempt to design custom chips for each new problem that arises. FPGA-based systems can restore the balance, guaranteeing scalability from the teraflops to the petaflops domain, while still retaining significant flexibility. (abbreviated abstract)