EnterTheGrid - Primeur Live!

EnterTheGrid - Primeur is the premier Grid and Supercomputing information source in the world. With Primeur Live! it brings you live reports from Europe's main Supercomputing and Grid events.

Issue 27 June 2003
Off-the-shelf supercomputing is a dead end
Heidelberg 26 June 2003 Currently most US supercomputers are built from off-the-shelf components. Evidence piled up in 2002 that this is a dead end, explained Horst Simon, director of NERSC, at ISC2003 in Heidelberg. First there was the Earth Simulator, which acted like a "Computenik" on the US supercomputing community; then there was a workshop on Petarange supercomputing, which showed that no progress had been made in supercomputing architecture during the past five years; and then there were the poor benchmark results on the newly available supercomputers. Supercomputers need to be designed with scientific applications in mind. The first examples of co-operation along these lines are "Red Storm" and "Blue Planet".

NERSC, the US National Energy Research Scientific Computing Center, is one of the large research centres in the USA. It houses the number 5 supercomputer in the world and serves the whole US Department of Energy research community with a focus on large scale computing.

Simon explained that the Earth Simulator should be a real catalyst for fundamental change in US science policy. It represents not just a single fast machine but the commitment of the Japanese government to invest in science-driven computing. As Kahaner explained in his talk at the same conference, the Japanese are indeed focusing on science and applications, not on computer architecture and systems per se.

What is very important, according to Simon, is to realise that the Earth Simulator is not a special-purpose machine. Hence, all US scientific computing communities are now potentially at a handicap of a factor of 10 to 100 in available computing capability. The importance of the Earth Simulator lies not in its impressive peak performance, but in its sustained performance on real scientific problems. That allows Japanese science policy to build strategic partnerships in climate, nanoscience and fusion, moving to dominate simulation and modelling in many disciplines, and not just in climate modelling as the name Earth Simulator might imply.

The lesson learned here, said Simon, is that to optimise architectures for scientific computing, it is necessary to establish feedback between scientific applications and computer design over multiple generations of machines.

Designing new supercomputers and testing new architectures is out of fashion in the US. In the early 1990s Simon counted some 50 supercomputing-relevant projects at American universities. Now, only a few are left. Virtually no one is looking at parallel languages and tools these days. Grid middleware and tools are getting all of the attention and resources.

As an example of the lack of progress, Simon mentioned the latency numbers for interconnects. The latency currently available is worse than that of the T3E in 1997! The bandwidth has hardly improved either.

The Power-4 chip for the new supercomputer at NERSC also reveals what Simon dubs the "Divergence problem". The chip performs only 7% better than the Power-3 on scientific applications. The problem is that the chip was not designed for scientific applications, and the requirements for commercial and scientific computing are diverging. In particular, the memory latency did not improve between the generations, and the Power-4 does not scale well beyond 16 scientific tasks.

In Simon's view we have pursued the logical extreme of the commodity parts path. In the beginning of off-the-shelf computing, the commodity building block was the microprocessor - today it is an entire SMP server. Communications and memory bandwidth did not scale with processor power. And the size of systems is ever increasing: we have arrived at near football-field size computers consuming megawatts of electricity.

Simon stressed again that the requirements of high performance computing for science and engineering and the requirements of the commercial market are diverging. The commercial cluster-of-SMPs approach does not provide the highest level of performance due to:

  • Lack of memory bandwidth
  • High interconnect latency
  • Lack of interconnect bandwidth
  • Lack of high performance parallel I/O
  • High cost of ownership for large scale systems
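
To illustrate why the latency and bandwidth items in this list matter together, here is a minimal sketch of the common first-order model for sending one message: transfer time equals a fixed latency plus message size divided by bandwidth. The model is a standard textbook approximation, not something presented in Simon's talk, and the function names and the numbers in the usage example are hypothetical, chosen only for illustration.

```python
# First-order model of point-to-point message cost (an assumption for
# illustration, not a formula taken from the talk):
#   time = latency + message_size / bandwidth

def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Seconds needed to deliver one message under the simple model."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def effective_bandwidth(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Bytes per second actually achieved for a message of the given size."""
    return message_bytes / transfer_time(
        message_bytes, latency_s, bandwidth_bytes_per_s
    )


def half_performance_size(latency_s, bandwidth_bytes_per_s):
    """Message size at which half of the nominal bandwidth is achieved."""
    return latency_s * bandwidth_bytes_per_s


if __name__ == "__main__":
    # Hypothetical interconnect: 5 microseconds latency, 1 GB/s bandwidth.
    latency = 5e-6
    bandwidth = 1e9
    for size in (64, 1024, 65536, 1048576):
        achieved = effective_bandwidth(size, latency, bandwidth)
        print(f"{size:>8} bytes: {achieved / bandwidth:6.1%} of nominal bandwidth")
```

Under this model, small messages are dominated by latency, so an application that exchanges many short messages gains little from higher nominal bandwidth alone.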

How should the problem be solved? Not the same way as in the 1990s with a lot of companies sparked by DARPA. The world has changed too much. But there is still a significant scientific market for high performance computing also outside of supercomputer centres. "For this new environment, we need a new, sustainable strategy for the future of scientific computing", Simon stated.

The ingredients of this new strategy are: application teams that help design new architectures; the use of current components and research prototypes in new architectures; and the continuous redesign and testing of prototypes in a vendor partnership to create new scientific computers. Development co-operations should bring together the interested computer vendors and scientists. The result should be a feedback cycle between science and computer design that lasts for generations of machines.

A first example of such a co-operative development is "Red Storm". Sandia National Laboratories and Cray are the partners in the project. Red Storm is an MIMD parallel supercomputer, a true MPP design with distributed memory. The machine will have 108 compute node cabinets with 10,368 AMD compute node processors and a 20 Tflop/s peak. It will consume less than 2 MW of power.
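
As a rough sanity check on the Red Storm figures quoted above, the following sketch derives the implied peak per processor and the upper bound on power per processor. It uses only the three numbers given in the text (10,368 processors, 20 Tflop/s peak, less than 2 MW); the function names are hypothetical, and the power figure covers the whole machine divided evenly, not the processor chip alone.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# Inputs are the article's numbers; everything else is derived.

def peak_per_processor_gflops(peak_tflops, processors):
    """Implied peak per processor in Gflop/s."""
    return peak_tflops * 1000.0 / processors


def power_per_processor_watts(total_power_mw, processors):
    """Total system power divided evenly over the processors, in watts."""
    return total_power_mw * 1e6 / processors


if __name__ == "__main__":
    processors = 10368      # compute node processors, from the article
    peak_tflops = 20.0      # quoted peak performance
    max_power_mw = 2.0      # quoted upper bound on power consumption

    gflops = peak_per_processor_gflops(peak_tflops, processors)
    watts = power_per_processor_watts(max_power_mw, processors)
    print(f"Implied peak per processor: {gflops:.2f} Gflop/s")
    print(f"System power per processor: under {watts:.0f} W")
```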

Another example is Blue Planet, a collaboration of Simon's NERSC, ANL and IBM. The design goals of Blue Planet are based on scientific applications. The design will extend the Power technology and include "Virtual Vector Processing". Blue Planet is addressing the key barriers to effective scientific computing, including memory bandwidth and latency, interconnect bandwidth and latency, and programmability for scientific applications.

For IBM this is a departure from its former strategy. For instance, Virtual Vectors were not planned, and neither was decreasing the switch latency. Blue Planet also requires a radical redesign of the company's software stack, Simon said.

Simon concluded by stressing again that business as usual will not preserve US leadership in advanced scientific computing. New computer architectures optimised for scientific computing are critical to enable 21st century science. In his view, US science requires a strategy to create cost-effective, science-driven computer architectures.
Ad Emmen

EnterTheGrid - Primeur

James Stewartstraat 248

1325 JN Almere

The Netherlands

http://EnterTheGrid.com

mailto:primeur@hoise.com

© EnterTheGrid - Primeur Live!