Primeur: Can you please give a short description of your configuration?
Prof. Lippert: We use 128 Compaq (now Hewlett-Packard) DS10 single-processor workstations. The Alpha 21264 EV67 is clocked at 616 MHz, with 256 MB of ECC memory. The system has a peak performance of 157 GFlop/s. The Linpack performance was 99 GFlop/s, which gave rank 210 in the Top500 list of November 2000; one year later it was 108 GFlop/s, at rank 369. The workstations are connected via the high-speed Myrinet network.
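The peak figure quoted above can be checked with a little arithmetic. The following is a minimal sketch, not part of the interview: it assumes each processor completes two floating-point operations per clock cycle, a value the interview does not state, and the function names are purely illustrative.

```python
# Figures quoted in the interview.
NODES = 128          # single-processor workstations
CLOCK_HZ = 616e6     # 616 MHz

# Assumption (not stated in the interview): two floating-point
# operations per clock cycle per processor.
FLOPS_PER_CYCLE = 2


def peak_gflops(nodes, clock_hz, flops_per_cycle):
    """Theoretical peak performance in GFlop/s."""
    return nodes * clock_hz * flops_per_cycle / 1e9


def efficiency(linpack_gflops, peak):
    """Fraction of the theoretical peak reached by a Linpack run."""
    return linpack_gflops / peak


peak = peak_gflops(NODES, CLOCK_HZ, FLOPS_PER_CYCLE)
for label, linpack in (("November 2000", 99.0), ("one year later", 108.0)):
    share = efficiency(linpack, peak)
    print(f"{label}: {linpack:.0f} of {peak:.1f} GFlop/s = {share:.0%}")
```

Under that assumption the peak comes out at about 157.7 GFlop/s, matching the quoted 157 GFlop/s, and the two Linpack results correspond to roughly 63% and 68% of peak.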
Primeur: Professor Lippert, can you give a short overview of the usage of ALiCE and your experience with this cluster?
Prof. Lippert: We have a very heterogeneous user profile. On the one hand there is production work, with programmes from particle physics, materials science and scientific computing/numerical mathematics. On the other hand, a lot of students use the cluster in their computer lab courses. We therefore developed the concept of a virtual machine: the virtual partitions for the parallel jobs are distributed dynamically over the physical processors. During the daytime up to 10 working groups run their programmes in parallel, so automatic control of the machine is an absolute must. In this environment it is not possible to clean up the remains of killed or crashed jobs by hand.
Primeur: Do you use open systems for this purpose?
Prof. Lippert: From the beginning, we concentrated on the ParaStation Cluster Middleware, which was developed by the Institut für Programmstrukturen und Datenorganisation (Institute for Program Structures and Data Organisation) at the University of Karlsruhe (Professor Tichy). Today this software has been commercialised and is distributed by ParTec in Munich. ParaStation has components for optimal interprocessor communication and for cluster management. We use OpenPBS as the portable batch system, which interfaces directly with ParaStation.
Primeur: You use the high-speed interconnect Myrinet, which comes with its own communication software, GM. Why do you use ParaStation instead, and what are the benefits?
Prof. Lippert: In the past, the generic Myrinet software was not always stable on big systems like ours. ParaStation offers a very stable and optimised MPI (Message Passing Interface) communication library for Fortran, C and C++. Compared with GM the bandwidth is higher, in particular for packets of 8 to 16 KByte, and ParaStation also has a very low latency.
Another aspect is much more important with regard to availability and reliability: in addition to the error correction within the transmitted packets themselves, ParaStation ensures that the packets really arrive at the receiving node. If a packet is lost, it is automatically resent. As we work in asynchronous mode, overlapping computation and communication, a packet loss is critical. In that case the program might use the wrong packets and thus the wrong data, delivering wrong results.
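The idea of guaranteed delivery described above, where a lost packet is resent until it arrives, can be illustrated with a small simulation. This is a hedged sketch of a generic acknowledge-and-retransmit scheme, not ParaStation's actual protocol; `LossyLink`, `send_reliably` and the `drop_attempts` parameter are hypothetical names introduced only for this example.

```python
class LossyLink:
    """Hypothetical link that drops the transmissions listed in drop_attempts."""

    def __init__(self, drop_attempts):
        self.drop_attempts = set(drop_attempts)  # 1-based transmission numbers to drop
        self.attempt = 0
        self.delivered = []                      # what the receiving node actually got

    def transmit(self, seq, payload):
        """Return True if the packet arrived, i.e. an acknowledgement comes back."""
        self.attempt += 1
        if self.attempt in self.drop_attempts:
            return False                         # packet lost, no acknowledgement
        self.delivered.append((seq, payload))
        return True


def send_reliably(link, packets, max_retries=5):
    """Send packets in order, resending each one until it is acknowledged."""
    retransmissions = 0
    for seq, payload in enumerate(packets):
        for _ in range(max_retries + 1):
            if link.transmit(seq, payload):
                break                            # acknowledged, move to next packet
            retransmissions += 1
        else:
            raise RuntimeError(f"packet {seq} lost after {max_retries} retries")
    return retransmissions


link = LossyLink(drop_attempts={2, 3})           # second and third transmissions are lost
resent = send_reliably(link, ["a", "b", "c"])
print(resent, link.delivered)
```

The sketch treats a missing acknowledgement as a lost packet. A real protocol would also have to detect duplicates, for the case where the packet arrives but its acknowledgement is lost.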
Primeur: How do you manage the cluster?
Prof. Lippert: Here the stability is guaranteed by the ParaStation middleware. With a single command, we can see which applications and which processes are running on the cluster. From each node we can stop or terminate processes and applications. If a hardware problem occurs, the failing processor is automatically excluded. ParaStation reinitialises virtual sub-partitions of processors without any influence on the rest of the system. It automatically clears the partition or nodes when a process terminates, whether regularly or through a crash. This allows us to manage a cluster of this size stably, both in multi-user operation and for user groups with dedicated partitions.
Primeur: This means the selection of middleware is crucial for the operation of a cluster?
Prof. Lippert: An integrated system with optimal communication plus scheduling plus control and supervision of the cluster is absolutely necessary. Most newcomers to cluster computing do not realise that this, and only this, allows stable multi-user operation.
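The management behaviour described above (partitions allocated on healthy nodes, a failing node excluded, and the affected partition cleared without touching the rest) can be modelled in a few lines. This is a toy model under stated assumptions, not ParaStation's implementation or interface; the `Cluster` class and all of its methods are hypothetical.

```python
class Cluster:
    """Toy model of partition management; all names here are hypothetical."""

    def __init__(self, node_count):
        self.healthy = set(range(node_count))
        self.partitions = {}   # partition name -> set of node ids
        self.processes = {}    # node id -> list of process names

    def allocate(self, name, size):
        """Reserve `size` free, healthy nodes as a named partition."""
        busy = set().union(*self.partitions.values())
        free = sorted(self.healthy - busy)
        if len(free) < size:
            raise RuntimeError("not enough free nodes")
        self.partitions[name] = set(free[:size])
        return self.partitions[name]

    def start(self, name, process):
        """Start one process on every node of the partition."""
        for node in self.partitions[name]:
            self.processes.setdefault(node, []).append(process)

    def clear(self, name):
        """Remove leftover processes and release the partition's nodes."""
        for node in self.partitions.pop(name):
            self.processes.pop(node, None)

    def exclude(self, node):
        """Take a failed node out of service and clear the partition that used it."""
        self.healthy.discard(node)
        for name, nodes in list(self.partitions.items()):
            if node in nodes:
                self.clear(name)


cluster = Cluster(8)
cluster.allocate("physics", 4)
cluster.allocate("students", 2)
cluster.start("physics", "simulation")
cluster.start("students", "lab_exercise")
cluster.exclude(1)   # node 1 fails: only the "physics" partition is cleared
print(sorted(cluster.partitions), sorted(cluster.healthy))
```

After the failure, the "students" partition and its processes are untouched, and a new allocation simply skips the excluded node.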
Primeur: At the beginning you mentioned that you increased your Top500 Linpack performance from 99 GFlop/s to 108 GFlop/s. Did you optimise and hand-code the Linpack?
Prof. Lippert: The improvement was achieved with the new ParaStation3; the old result was obtained running ParaStation2. The improvement of about 10% was achieved by the communication software alone.
Primeur: What about crashes of Linux or the Alpha processors?
Prof. Lippert: As usual, we had minor problems at the beginning of operations. But today Linux and Alpha are not an issue. More than half of the machine has been running without a reboot for a year. This is true for the communication too. With GM we had to reboot the system if a communication error occurred.
Primeur: You want to acquire ALiCEnext, the Advanced-Linux-Cluster-Engine - next generation, with 1024 processors. What are your plans?
Prof. Lippert: The cluster should consist of 1024 processors, Intel Xeon or Pentium 4. With double precision our programmes are extremely fast, as we can use the SIMD (Single Instruction Multiple Data) instructions. I expect a factor of 2 to 3 compared to 64-bit processors like Itanium 2. The system will be connected via switched Gbit Ethernet. We plan to install an additional 2-D mesh for fast nearest-neighbour communication. ParaStation supports Gbit Ethernet. In our applications we achieve a latency and bandwidth comparable to Myrinet. In the very near future ParaStation will support InfiniBand. The new cluster unites two different paradigms. On the one hand, the cluster is a high-performance system for about 60 to 70% of its capacity. On the other hand, it is operated as a high-throughput system in a Grid connection for data-intensive problems in experimental and theoretical particle physics. Both operation modes can be realised efficiently. The data storage and fast I/O required for high-throughput operation will be achieved with a parallel file system. Here we expect a total of 200 TByte. This means one can view the cluster as a Storage Area Network, a SAN. The single nodes with their disks can be combined into a huge, coherent disk storage. This is an interesting new aspect in cluster computing.
Primeur: Thank you, Professor Lippert, for this discussion.
ALiCE
ParaStation at Univ. Karlsruhe
Commercial ParaStation