PrimeurLive! from the Heidelberg Supercomputer Conference, ISC2002, June 2002

The Mannheim Supercomputer Seminar is the main HPCN event in Europe. This year we publish two live issues from the event:


Rick Stevens, Argonne National Laboratory and the University of Chicago, discussed different approaches to reaching petaflops performance

Heidelberg 22 jun 2002 In the early 1990s discussions began in the United States to consider the technology paths that could lead to the development of petaflops computer systems capable of sustaining over one quadrillion (10^15) floating-point operations per second. The results of these discussions were well documented in a book and in a series of workshop reports (PetaFLOPS Workshop Series 1994-1999).

Several issues were suggested:

  • Applications: the application domains have a variety of memory and turnaround requirements
  • Petaflops-capable machines are expected within 20 years (~2015); rapid paths could be envisioned that achieve petaflops as early as 2007
  • Multiple architectural paths could lead to petaflops capability
  • According to Little's law, any petaflops system requires significant concurrency (10^6-10^9)
  • Systems software (e.g., compilers, operating systems and libraries) is a serious barrier to achieving sustained performance and widespread use
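The concurrency range quoted from Little's law can be illustrated with a small calculation. The sketch below applies the law in its usual form (items in flight = throughput x latency) to a sustained rate of 10^15 operations per second. The latency values are assumptions chosen purely for illustration, not figures from the talk, and the function names are hypothetical.

```python
# Little's law: concurrency = throughput * latency.
PETAFLOPS = 1e15  # sustained operations per second, as defined in the article


def required_concurrency(throughput_ops_per_s, latency_s):
    """Operations that must be in flight to sustain the given throughput."""
    return throughput_ops_per_s * latency_s


# Assumed per-operation latencies (illustrative only, not from the talk).
ASSUMED_LATENCIES_S = {"1 ns": 1e-9, "100 ns": 1e-7, "1 us": 1e-6}


def concurrency_table():
    """Map each assumed latency label to the concurrency it implies."""
    return {
        label: required_concurrency(PETAFLOPS, latency)
        for label, latency in ASSUMED_LATENCIES_S.items()
    }


if __name__ == "__main__":
    for label, in_flight in concurrency_table().items():
        print(f"latency {label}: about {in_flight:.0e} operations in flight")
```

Under these assumed latencies the result spans roughly 10^6 to 10^9 concurrent operations, the same order of magnitude as the range given in the list above.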

First, he discussed the applications for petaflops, from science (including computer science), engineering, policy, national security, business, and entertainment. Their requirements include memory and cache footprints, the degree of data reuse, the scaling of application kernels, and high memory demands. There are also high I/O requirements, including secondary storage for intermediate results or checkpoints. The application must expose an appropriate amount of concurrency, and its communications requirements must be met. Often, the applications analysis needs to understand the memory bandwidth requirements for kernel algorithms and the scaling properties of these core kernels.

A major factor pacing the adoption of future large-scale computing systems is the movement of existing work to new platforms. Experience shows that a decade is needed to embrace a new programming model (e.g., vectors, massively parallel processors). If petaflops systems require new programming models, then there will be a substantial lag while new codes are developed or implementations of existing codes are modified to incorporate the new programming models. Most communities in the United States abandoned vector processing in exchange for message passing with superscalar/cache-based kernels.

Occasionally these kernels have multithreaded versions for mixed-mode systems, or clusters of SMPs. But even on such systems, many (if not most) applications will use a pure message-passing model. To reduce the adoption lag, it was proposed to support existing message-passing programming models on petascale systems so that MPI codes can be ported in a straightforward way. Another way is to use scalability testbeds to test codes virtually before the hardware is available.

The support of existing programming models on petascale systems may not be possible, but it is worth understanding when current programming models prove inadequate. Scalability testbeds may prove to be critical in reducing the adoption lag. Argonne is developing such a scalability testbed designed to support systems software development and applications software development up to 100,000 virtual nodes. This testbed will be available to the community for experimentation, with the goal of accelerating the ability of the community to exploit high-scale machines in the future.

Rick Stevens mentioned that many potential applications for petaflops and petaops computing systems can be imagined. But some communities are well prepared to adopt new architectures and others are much less prepared. In the aggressive category he placed those who have worked for some time at the frontiers of high-end computing (e.g., astrophysics, cosmology, QCD). Other users belong to the high-throughput category. They have well-defined computational and data analysis problems from fields such as electronic circuit design, bioinformatics, MCAD, ECAD, design optimisation, chemical engineering, and medical imaging.

Often the high-throughput users lack highly scalable algorithms, or even implementations that are available for ongoing development. They can benefit from accelerating existing problems by several orders of magnitude, which might dramatically alter the pattern of development or problem solving. On the other hand, petaflops technology may enable inexpensive teraflops for these applications.

He presented the exploratory computing category, which includes users with applications for rapid prototyping of ideas or algorithms, data visualisation, proof finding by automated deduction, data mining in sociology, interactive programming, and analysis using tools like Matlab and Mathematica. Ironically, although this category is only barely making use of existing large-scale computers, many of the future scientific and social impacts of computing are likely to come from this segment of the community.

For them, the benefits of petaflops/petaops range from increasing the scale at which interactive computational experiments can be conducted to reducing the time from prototype development to widespread use. For example, if a petaflops system can be used to enable a very-high-level language-interpreted environment to perform at the same rate as a dedicated teraflops system, then it can be used to rapidly test ideas at full performance levels that then could be deployed on lower-cost platforms. The ability to support time-shared access to petascale systems is an important new design consideration.
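The interpreted-environment example amounts to an overhead budget. As a rough sketch, assuming the interpreter's slowdown can be modelled as a single constant factor (a simplification that is not from the talk), the ratio of the two peak rates gives the largest slowdown that still matches a dedicated teraflops system. The function name is hypothetical.

```python
PETAFLOPS = 1e15  # host system rate, operations per second
TERAFLOPS = 1e12  # rate the interpreted environment should still deliver


def overhead_budget(host_ops_per_s, target_ops_per_s):
    """Largest constant slowdown factor that still meets the target rate."""
    return host_ops_per_s / target_ops_per_s


if __name__ == "__main__":
    budget = overhead_budget(PETAFLOPS, TERAFLOPS)
    print(f"tolerable interpreter slowdown: up to {budget:.0f}x")
```

Under this simple model, an interpreted environment could be up to three orders of magnitude slower than compiled code and still match a dedicated teraflops machine.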

He believes that the development of petaflops-capable systems is possible within the next five to six years, very likely by the end of the decade, and a certainty by 2015. While there are many unsolved technical problems associated with the approach of simply scaling today's terascale systems to this level of performance, most of these problems appear solvable.

A major concern for achieving broadly usable petaflops systems is scaling existing systems software environments to the 10^5 to 10^6 node scale. Researchers are investigating approaches that reduce the need to support a significant fraction of today's programming environment, but such approaches must be evaluated in the context of applications requirements. Programming models for petascale systems also remain a major concern because of the long adoption curve for applications communities and the cost of retargeting applications codes. If historical trends hold for the future, it probably will take a decade or more for a significant fraction of the applications to be adapted to a new programming model for petaflops systems, assuming that model is not a continuous or straightforward extension of today's programming models.


Uwe Harms
