New supercomputer marks the beginning of a new HPC era in Northern Germany

Berlin, 22 March 2002. The contract for the new Northern German supercomputer has now been signed in Berlin as well. The supercomputer, with a total of 4 Tflop/s, consists of two parts: one half in Hannover, for which the contract was signed last week, and one half in Berlin at the Zuse Institute Berlin (ZIB). The machine is financed by a collaboration between the six federal states in the north of Germany. Primeur/EnterTheGrid talked to Alexander Reinefeld, director of ZIB. In his view, this machine marks the end of the old traditional supercomputers and the transition to an era in which there is not much difference between supercomputer hardware and that of powerful standard computers. With two software projects in collaboration with the manufacturer, which will connect the two parts of the machine tightly and seamlessly, ZIB is helping to speed up this transition in software as well. Although the machine will also be used for testing new Grid technologies, its main purpose is to serve as a production facility for researchers in Germany with large, compute-hungry programmes. According to Reinefeld, the main application areas are the various fields of physics, chemistry and engineering, climate modelling, and life sciences.

The whole acquisition process took a long time and was rather complicated. The distribution over two sites posed legal challenges, and the choice of one manufacturer was challenged in court by another. The least of the problems was the co-operation between the six federal states. Although the machine is physically installed in only two of them, the six work together very well, Reinefeld says.

The whole acquisition process will be explained at the Heidelberg Supercomputer Conference, and at that time Primeur/EnterTheGrid will devote an article to it.

The 4 Tflop/s machine, with 768 processors, consists of 24 SMP nodes of 166 Gflop/s each. Each node already has one third of the performance of the Cray T3E, which is currently the largest supercomputer in Northern Germany. The network interconnect of the Cray is, however, still faster. Only after an upgrade next year will the new machine be able to match that.
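The figures quoted above can be cross-checked with a little arithmetic. The following is only a sketch: the constants are the numbers given in the article, the function names are chosen for illustration, and the Cray T3E estimate simply takes the "one third" statement literally, so it is an approximation rather than a published figure.

```python
# Figures as quoted in the article.
NODES = 24               # SMP nodes across both sites
GFLOPS_PER_NODE = 166    # peak performance per node, in Gflop/s
PROCESSORS = 768         # total processors in the machine


def aggregate_peak_tflops(nodes: int, gflops_per_node: float) -> float:
    """Total peak performance in Tflop/s (1 Tflop/s = 1000 Gflop/s)."""
    return nodes * gflops_per_node / 1000.0


def processors_per_node(processors: int, nodes: int) -> int:
    """Processors in each SMP node, assuming all nodes are identical."""
    return processors // nodes


def gflops_per_processor(gflops_per_node: float, procs_per_node: int) -> float:
    """Peak performance of a single processor in Gflop/s."""
    return gflops_per_node / procs_per_node


total_tflops = aggregate_peak_tflops(NODES, GFLOPS_PER_NODE)
procs_per_node = processors_per_node(PROCESSORS, NODES)
per_processor = gflops_per_processor(GFLOPS_PER_NODE, procs_per_node)

# Assumption: "one third of the performance of the Cray T3E" read literally.
implied_t3e_gflops = 3 * GFLOPS_PER_NODE

print(f"Aggregate peak: {total_tflops} Tflop/s")
print(f"Processors per node: {procs_per_node}")
print(f"Peak per processor: {per_processor} Gflop/s")
print(f"Implied Cray T3E performance: about {implied_t3e_gflops} Gflop/s")
```

Under these assumptions the 24 nodes add up to just under 4 Tflop/s, with 32 processors per node and roughly 5.2 Gflop/s per processor.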

Most programmes do not need all of the processors. But, of course, it only makes sense to buy such a large single system if you have codes that can make use of it. In Amsterdam, for instance, it took a great deal of effort and time before programmes were identified that could run on all 1000+ processors of the supercomputer there. For Berlin/Hannover, Reinefeld sees applications in quantum chromodynamics, quantum chemistry, and climate and ocean modelling that could use the machine to its full extent. In the beginning, there will also be time for testing, for instance in the DataGrid Quest. But, Reinefeld stresses, the main purpose of the machine is to run large jobs that are too large for smaller local machines. The current policy in Germany of having several supercomputers with diverse system architectures, for instance also in Jülich, Stuttgart and Munich, is a good one. There is no real need for one single, much bigger national machine.

Making the two parts work as one is a main concern. A dedicated 2 Gbit/s line connects the two machines. There will be a joint project with the vendor on scheduling: the job management systems at both sites will be connected so that there is a global view of the system.
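To get a feeling for what a 2 Gbit/s link means for jobs that span both sites, the sketch below estimates the ideal time to move a block of data between Berlin and Hannover. It is a back-of-the-envelope calculation only: the function name and the example data size are hypothetical, and protocol overhead, latency and competing traffic are ignored, so real transfers would take longer.

```python
LINK_GBIT_PER_S = 2.0  # dedicated line between the two sites, as quoted in the article


def ideal_transfer_seconds(num_bytes: float, link_gbit_per_s: float = LINK_GBIT_PER_S) -> float:
    """Lower bound on transfer time: payload bits divided by the raw line rate.

    Ignores protocol overhead, latency and contention (simplifying assumptions).
    """
    bits = num_bytes * 8
    return bits / (link_gbit_per_s * 1e9)


# Hypothetical example: 1 GB (10**9 bytes) of data exchanged between the sites.
one_gigabyte = 10**9
seconds = ideal_transfer_seconds(one_gigabyte)
print(f"1 GB over a {LINK_GBIT_PER_S} Gbit/s line: at least {seconds} s")
```

Even in this idealised case, moving one gigabyte takes about four seconds, which suggests why scheduling and communication software that is aware of the two-site layout matters for tightly coupled jobs.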

The next part of the project will be to modify the MPI implementation, in much the same way as PACX-MPI is used for tightly connecting Cray systems. At the end of the project, the software will be integrated into the standard vendor offerings and made available to other supercomputer sites too.

One problem with such a big machine, or pair of machines, is the licences. Software licences are expensive, and the normal pricing strategy of paying per processor in the machine does not really work in this case. Negotiations with ISVs are underway to find acceptable solutions.

This machine marks the end of supercomputers as we know them. The good old days in which one computer ran one copy of the operating system are over. On the new super, a copy of the operating system runs on each partition. From the technical point of view, the new Berlin/Hannover machine is a cluster of SMP machines. From the usage point of view, there is, of course, still a big gap between clusters and supercomputers.

The new super can be used as a node in the Grid, Reinefeld said. Globus is installed on it. But it is a production machine, and as such much too expensive for that. The large US labs can afford to give supercomputer time away for free. Here in Europe, we cannot.

Reinefeld sees clusters and Grids as a natural marriage. Clusters are ideal computing engines in the Grid. They have the same kind of utilisation scenarios, and the software fits together naturally.

In Berlin, apart from the new supercomputer, they are also investing in clusters. For instance, there is a plan to install a cluster for seven bioinformatics groups collaborating in the Berlin centre for genome based research. That cluster will be a node in the Grid right from the start.

Of course, that is also why Reinefeld is organising the ccGrid conference in Berlin this year. At the Heidelberg Supercomputer Conference in June, he is also organising a special tutorial on Clusters and Grids.


Ad Emmen
