Applications with petaflop/s needs

Heidelberg, 15 July 2002. In the last three articles, I focused mainly on hardware developments for petaflop/s computing and the network Grid infrastructure. In this article I briefly report on the requirements of applications in need of petaflop/s. (Chris Lazou)

Two trends seem to be converging at present. Vendors are busily extending hardware capability towards petaflop/s, and the Grid infrastructure is trying both to extend capability and to tap the "potentially" many petaflop/s of cycles that are lost because computers sit idle part of the time.

As I reported two weeks ago from Dona Crawford's presentation, terascale-plus simulation plays a key role in virtually every LLNL programme. It is used in physics and biology, materials modelling, drug design, environmental global climate and groundwater flow, where experiments are impractical; in nuclear weapons stockpile stewardship, radiation transport and hydrodynamics, where experiments are prohibited by international treaties; and in engineering structural dynamics, electromagnetic applications, laser and energy combustion, where experiments are too expensive. In all these areas simulation challenges theory and experiment, to refine quality and accuracy.

In his presentation "Preparing for Petaflop/s", Rick Stevens, from the Argonne National Laboratory, University of Chicago, listed fifteen different application areas that could benefit from petaflop/s and beyond, including nano-scale devices, virtual reality, modelling complex scenes, cryptography, digital signal processing, symbolic and experimental mathematics, general societal problems, and modelling of complex transportation, communication, business operations and economic systems.

He went on to say: "In the early 1990s discussions began in the United States (and perhaps elsewhere) to consider the technology paths that could lead to the development of petaflop/s computers, systems capable of sustaining over one quadrillion floating-point operations per second. The essential results of these discussions suggest that many applications domains could effectively use petaflop/s capabilities, and these applications domains have a variety of memory and turnaround requirements. Petaflop/s-capable machines were likely to result from the natural course of Moore's law within 20 years (~2015); however, rapid paths could be envisioned that might achieve petaflop/s as early as 2007.

By the mid-nineties, it was recognised that a number of architectural paths could lead to petaflop/s capability; however, it was not possible to predict in detail which architectures were most likely to succeed. It was also easy to envisage that, to get there, any computing system based on known technology would require a significant degree of concurrency, of at least a million processors. This is reflected in the IBM Blue Gene project.
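As a rough check on the million-processor figure, the arithmetic can be sketched as follows. The per-processor sustained rate of one gigaflop/s is an assumption chosen for illustration, not a number from the presentation, and the function name is hypothetical.

```python
PETAFLOP = 10**15  # one quadrillion floating-point operations per second


def processors_needed(target_flops, sustained_flops_per_processor):
    """Smallest processor count that reaches the target aggregate rate."""
    # Integer ceiling division: rounds up when the target is not an exact multiple.
    return -(-target_flops // sustained_flops_per_processor)


# Assumption for illustration: each processor sustains 1 gigaflop/s.
ASSUMED_SUSTAINED_PER_PROCESSOR = 10**9

count = processors_needed(PETAFLOP, ASSUMED_SUSTAINED_PER_PROCESSOR)
print(f"{count:,} processors")
```

A slower assumed per-processor rate pushes the count above a million, which is consistent with the "at least" in the estimate.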

The main requirements for petaflop/s computing were identified as memory and cache footprints, i.e. the amount of memory required at each level of the memory hierarchy; the degree of data reuse associated with the core kernels of the application; the scaling of those kernels; and the associated estimate of the memory bandwidth required at each level of the memory hierarchy. Also important are the instruction mix required by the application, the I/O speed and secondary storage needed for intermediate results or checkpoints, the concurrency available in the application, and the communications requirements, such as bisection bandwidth, latency and fast synchronisation patterns.

In many cases the application analysis can be reduced to understanding the memory bandwidth requirements of the kernel algorithms and the scaling properties of these core kernels, i.e. how memory capacity and bandwidth requirements scale as problem size increases. Traditional scientific application areas such as general circulation models, quantum chromodynamics and fluid dynamics in astrophysics have relatively well understood requirements.
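The link between a kernel's data reuse and its memory bandwidth requirement can be sketched with a simple ratio: the bandwidth needed is the sustained flop rate divided by the flops performed per byte moved. This is a minimal illustration, not the analysis used in the presentation; the function name and the high-reuse figure of 10 flops per byte are assumptions. The DAXPY count follows from its definition in 8-byte doubles.

```python
PETAFLOP = 1e15  # flop/s


def required_bandwidth(sustained_flops, flops_per_byte):
    """Memory bandwidth in bytes/s needed to keep a kernel fed at the given rate."""
    return sustained_flops / flops_per_byte


# DAXPY, y[i] = a * x[i] + y[i], in 8-byte doubles:
# 2 flops per element; 2 loads + 1 store = 24 bytes per element, no reuse.
daxpy_intensity = 2 / 24

# Hypothetical kernel with heavy cache reuse (assumed value, for contrast only).
reuse_intensity = 10.0

daxpy_bw = required_bandwidth(PETAFLOP, daxpy_intensity)
reuse_bw = required_bandwidth(PETAFLOP, reuse_intensity)

print(f"DAXPY-like kernel: {daxpy_bw:.2e} bytes/s")
print(f"High-reuse kernel: {reuse_bw:.2e} bytes/s")
```

Under these assumptions the streaming kernel needs roughly two orders of magnitude more bandwidth than the high-reuse one at the same flop rate, which is why data reuse appears among the main requirements.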

These application areas also have significant scalability, with computational complexity growing faster than memory requirements as the problem scales, implying that sustained petaflop/s performance could be supported by a memory system that is significantly smaller than one petabyte. Other application areas such as data mining and decision support have a much higher need for memory. Thus, the relative importance of different types of application modalities is an important consideration in determining feasible design points for petaflop/s systems.
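A toy model shows why computation can outgrow memory. Assume, purely for illustration, an explicit time-stepping code on a 3D grid with n points per side, where the number of time steps also grows in proportion to n. Memory then scales as n^3 and work as n^4, so at a fixed runtime memory grows only as the 3/4 power of the flop rate. All constants and function names below are hypothetical.

```python
def grid_side(sustained_flops, runtime_s, flops_per_point_step):
    """Grid side n that exhausts the flop budget, with work = n**3 points * n steps."""
    return (sustained_flops * runtime_s / flops_per_point_step) ** 0.25


def memory_bytes(n, bytes_per_point):
    """Memory footprint of an n x n x n grid."""
    return n ** 3 * bytes_per_point


# Assumed values, for illustration only.
RUNTIME_S = 3600.0
FLOPS_PER_POINT_STEP = 100.0
BYTES_PER_POINT = 80.0

n_tera = grid_side(1e12, RUNTIME_S, FLOPS_PER_POINT_STEP)
n_peta = grid_side(1e15, RUNTIME_S, FLOPS_PER_POINT_STEP)

memory_growth = (memory_bytes(n_peta, BYTES_PER_POINT)
                 / memory_bytes(n_tera, BYTES_PER_POINT))

# A 1000-fold increase in flop rate needs only 1000**0.75, about 178 times, more memory.
print(f"memory growth factor: {memory_growth:.1f}")
```

Real codes have different exponents, but any application whose work grows faster than its data shows the same effect: the memory needed to keep a petaflop/s machine busy grows more slowly than the machine's speed.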

Understanding the memory bandwidth and communications requirements (bisection bandwidth and latency) of existing codes is possible by instrumenting codes and by analysing the algorithms and their implementation. Also important is the degree of concurrency available both for scaled versions of existing applications and for new applications. Of particular interest are the types of concurrency that might be available to a compiler and what might be explicitly expressed via a particular programming model.

Another important area is systems software (e.g., compilers, operating system and libraries). For petaflop/s-capable systems this was and still remains a serious barrier to achieving both sustained performance and widespread use.

The traditional supercomputing application scenarios assume that the primary use of petaflop/s systems will be for larger, more complex runs of existing applications; turnaround is assumed to be relatively constant. These include molecular dynamics, quantum chemistry, bio-molecular modelling, computational geophysics, and climate modelling. The fundamental equations of quantum chemistry have been known for most of a century, yet only limited use has been made of them in real-world studies because of their immense complexity. Computational quantum chemistry is now feasible. This brings biological phenomena, which span immense scales of time, from femtoseconds to billions of years, within our sights. But this must be tempered by the knowledge that even exaflop/s only scratches the surface of the potential for biological modelling of, say, protein folding.

An alternative scenario involves accelerating fixed-size problems from non-real-time to real-time or faster than real-time status. These models may then form the core of human-in-the-loop applications for solving complex design problems. Examples here include data visualisation, proof finding by automated deduction, data mining in sociology, interactive programming, and analysis using tools like Mathematica and Matlab. The key attribute of this category is that the human is in the loop and the problem-solving pattern involves a substantial amount of human interaction with the computer and the algorithms used.

Although this category is barely making use of existing large-scale computers, many of the future scientific and social impacts of computing are likely to come from this segment of the community. The potential benefit of petaflop/s computing to this category of user is immense, ranging from increasing the scale at which interactive computational experiments can be conducted to reducing the time from prototype development to widespread use.

Supporting existing programming models on petascale systems may not be possible, but it is worth understanding at what scale or performance regime current programming models prove inadequate. In some cases existing programming models can be supported on proposed petascale hardware, but with some loss of optimality. Scalability test-beds may prove to be critical in reducing the adoption lag. Argonne is developing such a scalability test-bed designed to support systems software development and applications software development up to 100,000 virtual nodes. This test-bed will be available to the community for experimentation, with the goal of accelerating the ability of the community to exploit high-scale machines in the future.

He concluded by saying: "Petaflop/s systems are likely to be available in the next decade, and some of these systems will hold for another two or three orders of magnitude of performance during the second decade of the twenty-first century. Unlike today's situation with petaflop/s, however, where we can imagine extending teraflop/s architectures and methodologies to address the requirements and challenges of petaflop/s, extensions of the techniques developed for petaflop/s will not enable us to effectively use exaflop/s systems. Exaflop/s require fundamentally new ways of thinking about computing systems, largely because of their enormous scale and complexity, in expressing and managing the required concurrency."

Another fascinating presentation on applications was given by Hans-Christian Hege, ZIB / Indeed-Visual Concepts, Berlin, Germany, titled "Visualisation and Supercomputing: from Atoms to Galaxy Clusters".

Supercomputers allow simulations that cover long time scales in nature, complementing or even substituting for real measurements. To provide real insight, huge amounts of simulation data have to be presented in a comprehensible way, by exploiting the highly developed human sensory and perceptual capabilities. Visualisation has therefore become extremely important, especially in supercomputing.

The development of effective and efficient visualisation techniques is a major challenge in scientific visualisation. Revealing interesting aspects of data often requires a perfect blend of techniques from data analysis, computer graphics, and visual perception. In supercomputing the focus is on techniques that allow an exploration of large data sets.

In this presentation, stunning visualisation examples were shown from application fields ranging from atomic physics, chemistry, biochemistry and medicine to fluid dynamics, geo- and astrophysics - a virtual trip through many fields. Examples included the brain of a fruit fly, osteoporosis bone structure, facial surgery bone cuts, liver tumour removal, crystalline structures, and the relativistic birth of the first star. It went on to show protein formation and its surrounding electrical field and energy levels, as well as investigations of many of its other properties.

The interactive presentation employed the 3D visualisation system AmiraVR and stereoscopic projection. The emphasis was on state-of-the-art visualisation techniques that utilise graphics hardware for interactive display of very large data sets.

Next week I'll report on one of the most thought-provoking presentations, titled "Genetic engineering - the utopian idea of perfect human beings", given by Professor Jens G. Reich, Max-Delbrück Center of Molecular Medicine, Humboldt University Medical Faculty (Charité) Department of Bio-informatics, Berlin, Germany. The ethical and philosophical ramifications of genetics will be discussed.


Chris Lazou
