There are a number of important classes of applications whose data sets regularly exceed the ability of memory acceleration techniques, such as caches, to deliver data fast enough to maintain processing speed. Examples of memory-intensive applications include those dominated by the processing of sparse matrix systems, as well as many applications used in industrial environments.
Sparse systems typically do not fit in cache memories and are therefore difficult to solve at high speed: the cache becomes ineffective, and processing approximates the performance of main memory more closely than the performance of the CPU. The solution speed is limited simply by the ability of memory to deliver operands. The STREAM benchmark measures this performance for a number of operations.
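The four operations STREAM times are Copy, Scale, Add and Triad. The sketch below is an illustration only, not the official benchmark source: it writes each kernel as a plain loop and shows how the bytes moved are turned into a bandwidth figure. The names `run`, `bytes_moved` and `mb_per_second` are hypothetical, the elements are assumed to be 64-bit floating-point values, and the word counts follow the STREAM convention of counting each array read or written once per iteration. Because the loops run in the Python interpreter, the numbers it prints reflect interpreter overhead and not the memory system.

```python
import time
from array import array

BYTES_PER_WORD = 8  # assumption: 64-bit floating-point elements

# Arrays read or written per loop iteration (STREAM counting convention).
WORDS_PER_ITERATION = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def copy(a, b, c, q):
    for i in range(len(a)):
        c[i] = a[i]


def scale(a, b, c, q):
    for i in range(len(a)):
        b[i] = q * c[i]


def add(a, b, c, q):
    for i in range(len(a)):
        c[i] = a[i] + b[i]


def triad(a, b, c, q):
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


KERNELS = {"Copy": copy, "Scale": scale, "Add": add, "Triad": triad}


def bytes_moved(name, n):
    """Bytes counted for one pass of a kernel over arrays of length n."""
    return WORDS_PER_ITERATION[name] * BYTES_PER_WORD * n


def mb_per_second(name, n, seconds):
    """Bandwidth in MB/s, with 1 MB = 10**6 bytes."""
    return bytes_moved(name, n) / seconds / 1e6


def run(n=100_000, q=3.0):
    """Time each kernel once and return (bandwidths, final arrays)."""
    a = array("d", [1.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [0.0]) * n
    results = {}
    for name, kernel in KERNELS.items():
        start = time.perf_counter()
        kernel(a, b, c, q)
        elapsed = time.perf_counter() - start
        results[name] = mb_per_second(name, n, elapsed)
    return results, (a, b, c)


if __name__ == "__main__":
    bandwidths, _ = run()
    for name, value in bandwidths.items():
        print(f"{name:6s} {value:10.1f} MB/s")
```

Each kernel performs little or no arithmetic per element, so on real hardware the time is governed almost entirely by how fast memory can deliver and accept operands, which is the quantity the benchmark is designed to expose.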
High memory bandwidth has traditionally been the province of shared-memory parallel vector (PVP) systems. PVP systems essentially replace cache with a high-performance main memory. The result is that extremely large numeric systems, whether dense or sparse, regular or irregular, can be solved at very high computational rates. Further, PVP CPUs are matched to the performance of main memory and provide significantly higher sustained performance on these difficult classes of applications than other architectures that depend fully on memory acceleration techniques.
The STREAM results for the NEC SX-5/16A, as reported by NEC, are shown below. In keeping with the metrics used by STREAM, the numbers reported are sustainable memory bandwidth in MB (10^6 bytes) per second while processing specific kernels.
                   Computational Operation (MB/s)
Machine   CPUs      Copy     Scale       Add     Triad
SX-5/16A    16    607492    590390    607412    583069
             8    332551    332551    371160    366690
             4    168486    168509    189555    189517
             2     84853     84853     95352     95328
             1     42545     42546     47780     47779
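The table also shows how bandwidth scales with CPU count. As a minimal sketch, the snippet below takes the Triad column from the SX-5/16A results above and derives per-CPU bandwidth, speedup over one CPU, and parallel efficiency. The function name `scaling` and the choice of Triad as the representative kernel are assumptions made for illustration.

```python
# Triad results (MB/s) for the NEC SX-5/16A, copied from the table above.
TRIAD_MB_PER_S = {1: 47779, 2: 95328, 4: 189517, 8: 366690, 16: 583069}


def scaling(results):
    """Return rows of (cpus, MB/s per CPU, speedup, efficiency)."""
    base = results[1]  # single-CPU bandwidth is the reference point
    rows = []
    for cpus in sorted(results):
        speedup = results[cpus] / base
        rows.append((cpus, results[cpus] / cpus, speedup, speedup / cpus))
    return rows


if __name__ == "__main__":
    print("CPUs  MB/s per CPU  Speedup  Efficiency")
    for cpus, per_cpu, speedup, efficiency in scaling(TRIAD_MB_PER_S):
        print(f"{cpus:4d}  {per_cpu:12.0f}  {speedup:7.2f}  {efficiency:10.1%}")
```

On these figures Triad bandwidth scales almost linearly up to 8 CPUs (about 96 percent efficiency), while 16 CPUs deliver roughly 12.2 times the single-CPU bandwidth, or about 76 percent efficiency.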
The other "top 10" machines in this benchmark category have the following performance (MB/s):
Machine               CPUs       Copy      Scale        Add      Triad
Cray_T932_321024-3E     32   310721.0   302182.0   359841.0   359270.0
Cray_C90                16   105497.0   104656.0   101736.0   103812.0
Cray_Y-MP                8    19291.6    19294.2    26588.9    26802.2
SGI_Origin_2000-300    128    23846.0    23437.0    26365.0    26729.0
SGI_Origin_2000_195    128    21857.6    23351.7    24459.5    22913.6
NEC_SX3-44               1    16941.0    15640.7    22436.5    21972.2
Cray_J932               32    19007.0    18944.1    19993.9    18870.4
Cray_T94                 1    11341.0    10717.0    14783.0    13920.0
For more information, see the STREAM web site: www.cs.virginia.edu/stream