NEC announces world's fastest supercomputer - The SX-6 Series
London 03 October 2001
On October 3rd, NEC announced that their latest supercomputer, "The SX-6 Series," is on sale worldwide. The SX-6 Series is a parallel vector processor system with peak vector performance of up to 8 Teraflop/s, the fastest supercomputer available for technical computing in civilian use. In this contribution, Chris Lazou reviews and comments on the SX-6. The keywords which spring to mind are performance, capacity, reliability, future development path and Teraflop/s for the user.
The high throughput performance of the SX-6 Series is achieved by employing up to 8 TeraBytes of ultra-high-speed Dynamic Random Access Memory (DRAM) with transfer rates of up to 32 TeraByte/s and a 1 TeraByte/s inter-node crossbar switch. This combination of state-of-the-art technology delivers higher sustained performance than scalar computation machines.
As the successor to NEC's SX-5 Series, the SX-6 Series supercomputer inherits its CMOS technology and shared memory architecture. It delivers improved price performance and saves space by implementing one CPU per chip, whereas a CPU of its predecessor required several LSI chips.
Evolutionary path of NEC's SX Series
According to the NEC road map (see Primeur 3 July 2001), the NEC SX-5 is an evolutionary product that uses a scalable parallel architecture with a huge shared memory space. Because its CPU and memory bandwidth throughput are balanced, it delivers efficient parallel processing. To give some detail, an SX-5 CPU has 16 vector pipes, uses 0.25 micron CMOS technology and is rated at 10 Gflop/s when using a 3.2 ns clock. The CPU module measures 225 x 225 mm and needs 32 LSI chips to produce.
The successor SX-6 Series uses 0.15 micron CMOS and incorporates many technology innovations, and the whole CPU now fits on a single chip. This has tremendous implications for cost reductions and price/performance. It is no mean engineering achievement, since issues like heat extraction and the pin numbers needed for off-chip communication, which affect memory bandwidth, cannot be wished away.
Main Hardware Features of SX-6
The main hardware features of the SX-6 Series are as follows. Configured with the maximum of 128 nodes, the SX-6 Series delivers 8 Teraflop/s of peak performance and has up to 8 Terabytes of memory capacity. This translates to 1.6 times the speed of operations and twice the memory capacity of the SX-5. (The only machine ever built with higher peak performance is the IBM ASCI White, at Lawrence Livermore National Laboratory.) An SX-6 node consists of 2 to 8 CPUs and has a peak performance of up to 64 Gigaflop/s and a maximum memory capacity of 64 Gigabytes.
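The headline figures follow directly from the per-node numbers. As a rough check, the short Python sketch below multiplies them out. The constant and function names are illustrative only, and the bytes-per-flop figure is derived here on the assumption that the quoted 32 TeraByte/s memory transfer rate refers to a full 128-node configuration; it is not a number stated by NEC.

```python
# Figures quoted in the article for a fully configured SX-6 Series system.
MAX_NODES = 128
MAX_CPUS_PER_NODE = 8          # a node holds 2 to 8 CPUs
NODE_PEAK_GFLOPS = 64.0        # peak of a fully populated node
NODE_MEMORY_GBYTES = 64.0      # maximum memory per node
TOTAL_MEMORY_BW_TBYTES = 32.0  # aggregate memory transfer rate (assumed full system)


def system_totals(nodes=MAX_NODES):
    """Return (peak Teraflop/s, memory in TeraBytes) for `nodes` full nodes."""
    peak_tflops = nodes * NODE_PEAK_GFLOPS / 1000
    memory_tbytes = nodes * NODE_MEMORY_GBYTES / 1000
    return peak_tflops, memory_tbytes


def cpu_peak_gflops():
    """Peak per CPU, assuming a node's 64 Gigaflop/s is shared by 8 CPUs."""
    return NODE_PEAK_GFLOPS / MAX_CPUS_PER_NODE


def bytes_per_flop():
    """Memory bandwidth available per peak floating-point operation."""
    total_peak_gflops = MAX_NODES * NODE_PEAK_GFLOPS
    return TOTAL_MEMORY_BW_TBYTES * 1000 / total_peak_gflops


peak, memory = system_totals()
print(round(peak, 3), round(memory, 3))  # 8.192 8.192
print(cpu_peak_gflops())                 # 8.0
print(bytes_per_flop())                  # 3.90625
```

Under these assumptions each CPU peaks at 8 Gigaflop/s and the memory system supplies roughly 4 bytes per peak flop, which is the kind of CPU-to-memory balance the article credits for high sustained performance.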
Floor space and power consumption of the SX-6 have been reduced by 80 percent compared to the SX-5. The low power consumption allows all models to be air-cooled. NEC claims that: "these two elements contribute to a great reduction of installation costs and complexities. The usage of highly integrated 0.15 micron CMOS technology has led to a greatly reduced number of components and this in turn led to tremendously improved hardware reliability".
Software and Middleware available
The SX-6 also provides various types of middleware and scientific computing application software, enabling personal computers, UNIX servers and workstations, and the SX-6 Series, connected via a network, to be utilised as a single system image. Tools and libraries for developing parallel multi-node jobs include MPI, the TotalView debugger and the Vampir/SX performance analysis tool.
The operating system and middleware have been extended to support the enhanced multi-node system. The basic software "SUPER-UX" provides an improved SSI (Single System Image), while supporting upward compatibility with the SX-5 Series. MasterScope functionality for total system management of up to 128 nodes and job scheduling in a multi-node system have also been implemented. In addition to C++ and Fortran 90, it provides OpenMP and, for data parallel processing on a distributed memory architecture, HPF V2.
The SX-6 Series offers the Web Supercomputing Environment (WSE) as one of its middleware products. This middleware enables supercomputers, UNIX server workstations and personal computers, when connected via Internet or Intranet, to operate as a single computation machine. In this system, users can launch application software that is distributed among multiple computation machines, manipulate files and execute various commands using simple GUI operations, such as those provided by Windows.
Historical background
In 1983, NEC entered the supercomputer market by launching the "SX-2", whose performance exceeded the world standard of 1 Gigaflop/s, providing a system to meet the needs of ultra high-speed scientific computation. Since then, NEC has continuously provided supercomputers with state-of-the-art technology, such as the SX-3, SX-4 Series and SX-5 Series, and has received over 300 orders across the SX series because their high sustained performance and excellent price performance are highly valued.
NEC plans to start shipment of the SX-6 at the end of December and expects to receive 300 orders (domestic and international) in the next three years.
The two competing architectures
In recent years, the performance of CPUs configured in scalar machines has improved, and very high peak performance has been claimed by vendors. The question is, how do these device speeds translate when they are incorporated into supercomputer systems? High end supercomputing is more than a chip: it also involves memory bandwidth, heat extraction and tight integration of the communication system. Thus, large-scale memory and high data transfer rates between CPUs are very important for obtaining high sustained performance in large-scale computations.
The June 2001 Top500 list is crowded with IBM systems (ASCI derivatives), but when the ratio of R(max) to R(peak) and Hockney's N(half) parameter are examined, they confirm that in scalar systems sustained performance reaches a plateau at a relatively low baseline, which is much lower than that achieved on parallel vector systems such as the SX series. When suppliers talk about peak performance, buyer beware.
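To make the two metrics concrete, here is a small Python sketch. The efficiency ratio is simply R(max) divided by R(peak). The second function uses Hockney's classic two-parameter model, in which the rate achieved at problem size n is r_inf * n / (n + n_half), so that n_half is the size at which half the asymptotic rate is reached. The function names are illustrative, and all numbers in the usage lines are made-up values chosen for easy arithmetic, not Top500 entries.

```python
def efficiency(r_max, r_peak):
    """Fraction of peak performance actually sustained: R(max) / R(peak)."""
    if r_peak <= 0:
        raise ValueError("r_peak must be positive")
    return r_max / r_peak


def hockney_rate(n, r_inf, n_half):
    """Hockney's model: rate achieved at problem size n.

    r_inf  is the asymptotic rate for very large n,
    n_half is the problem size at which half of r_inf is reached.
    """
    return r_inf * n / (n + n_half)


# Hypothetical systems with the same peak but different sustained results.
print(efficiency(r_max=900.0, r_peak=1000.0))      # 0.9
print(efficiency(r_max=500.0, r_peak=1000.0))      # 0.5

# At n == n_half the model returns exactly half the asymptotic rate.
print(hockney_rate(n=200, r_inf=8.0, n_half=200))  # 4.0
```

A system with a small n_half approaches its asymptotic rate on modest problem sizes, while a large n_half means the quoted rate is only reached on very large problems.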
New computing challenges
Supercomputers have been utilised in various fields, such as drug development, energy development in nuclear fusion and atomic energy, aircraft simulation and space development, petroleum resource exploration, automobile and machinery design, and construction and civil engineering design. They are also needed for the great challenges human beings are facing in this century, such as the development of new post-genome drugs and of high function materials, as well as other materials needed in the development of nano-technology.
There have been increasing demands for faster supercomputers with larger memory capacity and easier operation, powerful enough to tackle global environmental issues. Because researchers' results are urgently needed and the problems to be solved have become larger and more complicated, supercomputers that exceed current performance and capacity are needed for these great challenges.
Meeting user needs
To meet these needs, NEC has developed the SX-6, successor to the SX-5 Series, with greatly improved performance and memory capacity, and with prices and energy consumption comparable to scalar computation machines. As Mr. Watanabe, Vice President of NEC Solutions, said: "NEC will continue to support the current SX vector architecture for high end computing. The vector parallel architecture provides very efficient processing capability, particularly for many technical applications, such as meteorology, computational physics and chemistry, and crash analysis. In the new SX Series we have implemented a vector processor on a chip using 0.15 micron CMOS technology, which gives us very significant cost reductions for our high end computing. I expect to reduce this to 0.1 micron by year 2005. In 10 years time this should go down to 0.05 micron but one has to remember that etching is only part of the problem for producing viable chips". (See Primeur, 20 June 2000).
At SC2001, NEC stated that the SX-6 architecture has a future, with two more successor products planned by 2004-5. Incidentally, with a mid-range system of 48 to 64 nodes, the SX-6 should be routinely capable of delivering sustained Teraflop/s performance for large scale applications. With Cray selling NEC SX-6 systems under license, it should be the answer to the prayers of many large scale application users in the USA.
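As a back-of-the-envelope check on the sustained Teraflop/s claim, the Python sketch below works out what fraction of peak a 48- or 64-node system would have to sustain to reach 1 Teraflop/s, using the 64 Gigaflop/s per-node peak quoted earlier. The function name is illustrative, and fully populated 8-CPU nodes are assumed.

```python
NODE_PEAK_GFLOPS = 64.0  # per-node peak quoted in the article (8 CPUs assumed)


def required_efficiency(nodes, target_tflops=1.0):
    """Fraction of peak needed to sustain `target_tflops` on `nodes` nodes."""
    peak_tflops = nodes * NODE_PEAK_GFLOPS / 1000
    return target_tflops / peak_tflops


for nodes in (48, 64):
    print(nodes, round(required_efficiency(nodes), 3))
# 48 0.326
# 64 0.244
```

In other words, an application would need to sustain about a third of peak on 48 nodes, or about a quarter of peak on 64 nodes.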
Chris Lazou