Computer contract signed for High-Performance Center Bavaria - first TFlop computer in Europe

Munich 02 Nov 99 On October 29, the contract between the Bavarian Academy of Sciences (founded in 1759) - Leibniz-Rechenzentrum (LRZ) - and Hitachi was signed in Munich in the rooms of the Academy. Primeur had the chance to participate in this historic event. A first two-node system, used for training, will be delivered this year. The SR8000 F1 with 112 nodes will be shipped in the first quarter of 2000; the last stage, with 168 nodes, will be realised in 2002. In its first stage the system will deliver 1.34 TFlop/s peak performance, and in 2002 2 TFlop/s. Based on benchmark experience, LRZ expects an application performance of more than 400 GFlop/s in the first stage and more than 600 GFlop/s in 2002. Some details of the configuration are summarised below.

Signing the Contract

On October 29 at 7 p.m., Professor Heinrich Noeth, President of the Bavarian Academy of Sciences, Professor H.-G. Hegering, Head of the Leibniz-Rechenzentrum, and Hiroaki Nakanishi, President of Hitachi Europe, signed the contract for a Hitachi SR8000 F1 system for LRZ. This is the new workhorse for the High-Performance Computing Center Bavaria (HLRB), which complements the centers in Juelich (John von Neumann Center for Computing) and Stuttgart (HLRS). The next German center is scheduled for Northern Germany.

Hardware SR8000 F1

Compared to the current SR8000 with 8 GFlop/s peak performance per node, the F1 delivers 12 GFlop/s per node - a 50% increase - or 1.5 GFlop/s per processor. A node physically contains 9 processors, although only 8 are used by the compiler for automatic parallelisation. With its 112 nodes the system has a peak performance of 1.344 TFlop/s, which makes it the first TFlop computer in Europe. In 2002, 2 TFlop/s using 168 nodes are scheduled.
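The peak figures follow directly from the per-processor rate and the node counts quoted above. A minimal sketch of that arithmetic, in which the constant and function names are illustrative and not taken from any Hitachi documentation:

```python
# Peak performance from the figures in the article.
GFLOPS_PER_PROCESSOR = 1.5        # SR8000 F1, per processor
COMPUTE_PROCESSORS_PER_NODE = 8   # 9 are present, 8 are used for computation


def peak_gflops(nodes: int) -> float:
    """Theoretical peak of a system with the given number of nodes, in GFlop/s."""
    return nodes * COMPUTE_PROCESSORS_PER_NODE * GFLOPS_PER_PROCESSOR


for nodes in (112, 168):
    print(f"{nodes} nodes: {peak_gflops(nodes) / 1000:.3f} TFlop/s")
```

For 112 nodes this gives 1.344 TFlop/s and for 168 nodes 2.016 TFlop/s, matching the rounded values in the text.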

A node has 8 GByte of memory, of which about 6.5 GByte can be used by a user. Four nodes of the whole system have 16 GByte each. This sums up to 0.928 TByte. In the last stage 1.344 TByte will be available with the same node constraints. The nodes contain Hitachi's proprietary RISC processors, developed by the company itself.
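The first-stage memory total can be checked with a short calculation. This is only a sketch of the sum described above; the function and parameter names are made up for illustration:

```python
def total_memory_gbyte(nodes: int, large_nodes: int = 4,
                       standard_gbyte: int = 8, large_gbyte: int = 16) -> int:
    """Total memory when `large_nodes` of the nodes carry the larger memory size."""
    return (nodes - large_nodes) * standard_gbyte + large_nodes * large_gbyte


first_stage = total_memory_gbyte(112)
print(f"First stage: {first_stage} GByte = {first_stage / 1000:.3f} TByte")
```

With 108 nodes of 8 GByte and 4 nodes of 16 GByte the first stage comes to 928 GByte.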

The disk space sums up to 7.4 TByte (5.3 TByte user) in the first stage and then to 10 TByte (7.1 TByte user).

As the SR8000 is a homogeneous system with the same architecture in both phases, no program modifications are necessary in phase 2. Reliability is another important issue, as Hitachi officials mentioned during the contract ceremony. The target is a mean time between failures of 180 000 hours per node, i.e. about 20 years per failure. For the total system this means about 2 months. The mean time to repair is targeted at one hour for node replacement.
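The step from the per-node figure to the system figure can be reproduced with a simple model. The sketch below assumes that nodes fail independently at a constant rate and that any single node failure counts as a system failure; neither assumption is stated in the article, and the function names are illustrative:

```python
HOURS_PER_YEAR = 24 * 365


def hours_to_years(hours: float) -> float:
    return hours / HOURS_PER_YEAR


def system_mtbf_hours(node_mtbf_hours: float, nodes: int) -> float:
    """System MTBF under the assumption of independent, constant-rate node failures."""
    return node_mtbf_hours / nodes


node_mtbf = 180_000  # target per node, in hours
print(f"Per node: {hours_to_years(node_mtbf):.1f} years")
for nodes in (112, 168):
    days = system_mtbf_hours(node_mtbf, nodes) / 24
    print(f"{nodes} nodes: {days:.0f} days between failures")
```

Under these assumptions the 112-node system would see a failure roughly every 67 days, which agrees with the "about 2 months" quoted above. The interval shortens as nodes are added in 2002.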

The nodes are connected via a three-dimensional crossbar with a bidirectional bandwidth of 1 GByte/s between nodes and a latency of 19 microseconds.
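To get a feeling for what these two numbers mean for message passing, one can use the common latency-plus-bandwidth model for the time of a single message. This is a rough sketch, not a measured characteristic of the SR8000: it assumes that the full 1 GByte/s is available to a one-way message and that 1 GByte is 10^9 bytes.

```python
LATENCY_S = 19e-6              # from the article
BANDWIDTH_BYTES_PER_S = 1e9    # assumption: 1 GByte/s usable in one direction


def transfer_time_s(message_bytes: float) -> float:
    """Estimated time to send one message: start-up latency plus payload time."""
    return LATENCY_S + message_bytes / BANDWIDTH_BYTES_PER_S


def half_performance_bytes() -> float:
    """Message size at which latency and payload time are equal."""
    return LATENCY_S * BANDWIDTH_BYTES_PER_S


for size in (8, 10**4, 10**6):
    print(f"{size:>8} bytes: {transfer_time_s(size) * 1e6:.1f} microseconds")
```

In this model, messages much smaller than about 19 000 bytes are dominated by the latency, while larger ones are dominated by the bandwidth.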

Programming models

The innovative architecture of the SR8000 allows it to be used both as a vector processor and under the scalar SMP-cluster programming model within one machine. A vectorisable operation is distributed over the floating-point units of the 8 processors within a node. Hitachi calls this COMPAS, Cooperative Micro Processors in single Address Space. The compiler distributes the data; the synchronisation is realised in hardware. Another feature is PVP, Pseudo Vector Processing: the data needed in a loop is prefetched and stored in the cache, which reduces the access time.
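The idea of spreading one vectorisable loop over the 8 processors of a node can be illustrated with a small sketch. It is purely conceptual: on the SR8000 the distribution is done by the compiler and the synchronisation by hardware, whereas the code below runs sequentially, and the function names are hypothetical.

```python
def split_range(n: int, parts: int = 8) -> list[tuple[int, int]]:
    """Split the index range [0, n) into `parts` contiguous, nearly equal chunks."""
    base, extra = divmod(n, parts)
    bounds = []
    start = 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def chunked_dot(x: list, y: list, parts: int = 8):
    """Dot product computed as one partial sum per chunk, then combined."""
    partials = [
        sum(x[i] * y[i] for i in range(lo, hi))  # work of one "processor"
        for lo, hi in split_range(len(x), parts)
    ]
    return sum(partials)  # combining step after all chunks have finished


print(split_range(20))
print(chunked_dot(list(range(20)), [1] * 20))
```

Each chunk corresponds to the share of the loop that one processor would handle; the final sum stands in for the point at which the processors synchronise.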

Software and tools

The operating system is HI-UX/MPP, a POSIX 1003.2-based Unix. Compilers are available for C, Fortran 77, Fortran 90 and C++ (as a precompiler to C; the native compiler will be available in the first quarter of 2002). TotalView is the debugger.

Tools and libraries for parallel programming are OpenMP 1.0, supported by Fortran and C++; MPI 2 in a full implementation, with parallel I/O and dynamic process generation between nodes as well as intra-node (MPP); PVM 3.3.10; and High-Performance Fortran Version 2.0 and Linda, mid 2000.

A very important tool for profiling MPI programs, VAMPIR by Pallas GmbH in Bruehl, is scheduled for the third quarter of 2000. Furthermore, Hitachi will deliver versions of BLAS, LAPACK, ScaLAPACK and NAG optimised for the SR8000 F1, and the Hitachi proprietary library MATRIX/MPP, subroutines for linear algebra, fast Fourier transforms and random numbers. Huge sparse matrices will be supported.

A two-node system for training purposes will be delivered this year.

 


Uwe Harms
