Matthias Troyer of ETH Zuerich explained the whole acquisition process at the Supercomputer 2000 Conference in Mannheim. The Physics Department in Zuerich is used to consuming a lot of computing power. Troyer had also spent a longer period in Japan, where he had access to one of the largest supercomputers in the world, a Hitachi machine. Used to having supercomputer access, he learned on returning to Zuerich that ETH planned to shut down the old Paragon parallel supercomputer it was still operating, and had no plans for a replacement. However, the ETH computer centre would support users in buying their own machines as much as possible.
Troyer was not the only physicist who needed computing power. Hence the idea was born to look for a Beowulf cluster. A Linux system with Ethernet interconnects would be able to fulfil their computational needs, and because they were only a small group, and physicists at that, it would not be much of a problem to operate. Calculations showed, however, that a big cluster would need special cooling. That was no problem at ETH, where an air-conditioned computer room is available at the computing centre.
With Linux clusters, you can choose between the Alpha and the Intel chip. ETH ran some benchmarks and noted that Alphas are in general faster, but Intel offers better price/performance. Hence, if network speed is not important and Ethernet suffices, Intel processors will do. Their applications, namely Monte Carlo simulations, series expansions and education, would work well with an Intel solution. From the benchmarks they also noted that the Linpack benchmark, used for the TOP500, is irrelevant to their applications.
Because of the size of the budget and because ETH is a publicly funded organisation, they learned very soon that the procurement had to follow strict European regulations. Hence some legal advice was needed, which they got. A first step in the process, after writing a request for proposals, was publishing the tender officially in the Swiss "Handelsamtsblatt". This newspaper is not one that fast-moving twenty-year-old IT guys like the Dallman brothers normally read. Hence they would have missed it if someone had not drawn their attention to it. They got the paperwork and saw they could build a cluster as required by ETH. Two weeks later they delivered their offer, which they had written with the help of their German software partner SuSe.
A number of the other companies, small ones from other countries, did not qualify because they did not have support within the Swiss borders. Hence the only competitors of DalCo were the big computer companies. After several presentations and refinements of the offer, DalCo was awarded the contract on October 14, 1999. On November 8, 1999, the contract was signed, and six weeks later the computer had already been delivered to ETH. Before the end of the year the machine was up and running. In March, it was accepted. This machine consisted of 192 compute nodes with dual 500 MHz Pentium II chips and 1 Gigabyte of memory. In fact, the largest part of the cost of the machine is the memory. Unfortunately for ETH, prices went up considerably during the whole acquisition phase. But overall, Troyer said, they did get a larger machine for their money than they anticipated. The best performance of this machine is 29.6 Gflop/s on one of their codes. The whole system, including the SuSe cluster software, runs smoothly, he said. The only bottleneck is the I/O file server.
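To put the quoted figures side by side, here is a minimal arithmetic sketch in Python. It uses only the numbers mentioned in this report (192 dual-processor nodes, a best result of 29.6 Gflop/s, an upgrade target of 500 processors). The function names are hypothetical, and the per-processor figure assumes that the 29.6 Gflop/s run used all processors of the machine, which the talk did not state explicitly.

```python
# Figures quoted in the article.
NODES = 192               # compute nodes in the delivered machine
CPUS_PER_NODE = 2         # dual-processor nodes
BEST_GFLOPS = 29.6        # best reported performance on one of the ETH codes
UPGRADE_TARGET_CPUS = 500 # processor count planned for the upgrade


def total_processors(nodes=NODES, cpus_per_node=CPUS_PER_NODE):
    """Number of processors in the machine."""
    return nodes * cpus_per_node


def sustained_mflops_per_processor(gflops=BEST_GFLOPS, processors=None):
    """Average sustained Mflop/s per processor.

    Assumption (not stated in the article): the quoted Gflop/s figure
    was obtained on a run that used every processor.
    """
    if processors is None:
        processors = total_processors()
    return gflops * 1000.0 / processors


def processors_to_add(target=UPGRADE_TARGET_CPUS, current=None):
    """Processors still needed to reach the upgrade target."""
    if current is None:
        current = total_processors()
    return max(target - current, 0)


if __name__ == "__main__":
    print("processors:", total_processors())
    print("Mflop/s per processor: %.1f" % sustained_mflops_per_processor())
    print("processors to add for upgrade:", processors_to_add())
```

Under these assumptions the delivered machine has 384 processors, so the upgrade to 500 processors adds roughly a third more.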
Meanwhile, DalCo has received the order to upgrade the machine to 500 processors by the end of June 2000. Further expansion will probably be needed, as other departments also want to add their part to the Asgard cluster. So, as the 'Dalcon brothers' show, it still takes only a few bright people to produce a supercomputer-class system.