|
Three external factors led up to the Columbia installation. First, there was the success of Japan's Earth Simulator; second, a revitalization effort emerged from the Task Force for high-end computing; and third, Intel and SGI were highly motivated to demonstrate the effectiveness of their technologies.
There were four internal factors. First, high-end computing (HEC) had an impact on the work of the Columbia Accident Investigation Board (CAIB). Second, the Shuttle Return to Flight programme required HEC support in the form of continuous computing. Third, NASA had witnessed the success of the world's first Altix 3700, "Kalpana". Fourth, NASA's existing computers were unable to serve its engineering and science mission needs.
Dr. Brooks went on to say that co-operation was required at a national level, at NASA, in vendor technology, and for the NAS programme. The goal of this success through partnerships is an integrated modelling and simulation environment that enables NASA and its industry and academic partners to accelerate design cycle time, conduct extensive parameter studies of multiple mission scenarios, and increase safety during the entire life cycle of exploration missions, while satisfying the tight time constraints of fast-paced NASA exploration system design and acquisition cycles.
The leadership systems in the USA cited by Dr. Brooks are Red Storm, the Blue Gene architecture, and the SGI architecture. There are also many proposals encouraging people to start using these systems for mission-critical applications such as aeronautics research, space operations, and exploration systems.
Integrated support for high-performance modelling and simulation depends on the NASA scientists and engineers. The NAS software experts use tools to parallelize and optimize their codes, dramatically increasing simulation performance while decreasing turnaround time. This in turn requires supercomputers, storage, and networks. Finally, NAS experts apply advanced data analysis and visualization techniques to help scientists explore and understand large data sets.
Columbia is one of the world's fastest supercomputers, providing 61 TFlops. Dr. Brooks told the audience that it was conceived, designed, built, and deployed in just 120 days. Columbia is a 20-node supercomputer built on proven 512-processor nodes, with over 10,000 Intel Itanium 2 processors. The SGI cluster provides the largest node size incorporating commodity parts (512 processors) and the largest shared-memory environment (2,048 processors). Its 88% efficiency tops the scalar systems on the Top500 list. The system shows high reliability and immediate availability. Its capability is excellent, with a capacity of 10,240 processors.
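The processor count and the quoted peak figure can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the node and processor counts come from the talk, but the 1.5 GHz clock rate and the four floating-point operations per cycle are assumptions about the Itanium 2 processors, not figures given by Dr. Brooks.

```python
# Back-of-the-envelope peak estimate for Columbia.
# From the talk: 20 nodes of 512 processors each.
# ASSUMED (not stated in the talk): 1.5 GHz clock and
# 4 floating-point operations per cycle per processor.
NODES = 20
PROCESSORS_PER_NODE = 512
CLOCK_GHZ = 1.5        # assumption
FLOPS_PER_CYCLE = 4    # assumption


def total_processors(nodes=NODES, per_node=PROCESSORS_PER_NODE):
    """Total processor count across all nodes."""
    return nodes * per_node


def peak_tflops(processors, clock_ghz=CLOCK_GHZ,
                flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in TFlops: processors x GHz x flops/cycle / 1000."""
    return processors * clock_ghz * flops_per_cycle / 1000.0


if __name__ == "__main__":
    p = total_processors()
    print(p, "processors,", round(peak_tflops(p), 2), "TFlops peak")
    # 10240 processors, 61.44 TFlops peak
```

Under these assumptions the estimate lands close to the 61 TFlops quoted for the full system.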
Benchmarks are important for comparison and diagnostics, explained Dr. Brooks. Columbia therefore underwent several benchmark tests: Linpack; the Stream benchmark, which measures sustainable memory bandwidth; and the NAS Parallel Benchmarks (NPB), which mimic the computation and data movement in computational fluid dynamics applications.
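To give a feel for what a memory-bandwidth measurement of this kind involves, here is a minimal sketch of a Stream-style Copy kernel in Python. It is not the Stream benchmark itself and has nothing to do with the code run on Columbia; the function name `copy_bandwidth_gbs` and the default array size are illustrative assumptions. A meaningful measurement needs arrays much larger than the processor caches.

```python
import time
from array import array


def copy_bandwidth_gbs(n=2_000_000, repeats=5):
    """Best-of-`repeats` bandwidth (GB/s) for a Copy kernel: dst[i] = src[i]."""
    src = array("d", [1.0]) * n
    dst = array("d", [0.0]) * n
    # The Copy kernel reads one element and writes one element per index.
    bytes_moved = 2 * n * src.itemsize
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy carried out in C, not a Python loop
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-12)  # guard against a zero reading on coarse timers
    assert dst[0] == 1.0 and dst[-1] == 1.0  # check the copy really happened
    return bytes_moved / best / 1e9


if __name__ == "__main__":
    print(f"copy bandwidth: {copy_bandwidth_gbs():.2f} GB/s")
```

Taking the best of several repetitions, rather than the average, follows the usual practice of reporting the least-disturbed run.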
Dr. Brooks elaborated on the single-zone and multi-zone versions and their programming paradigms. There are four benchmarks from NPB-MPI and NPB-OMP. Columbia contains three different types of Altix node, and performance varies with their bandwidth. The speaker also covered the hybrid MPI+OpenMP versions of NPB, including the multi-zone versions, hybrid parallelization, single 512-processor node performance, and multiple-node performance.
The Cart3D OpenMP/MPI scaling study uses synthetic stress tests to characterize system performance by looking at its building blocks. The results help explain the performance realized on applications.
Expanding on the Space Shuttle Return to Flight programme, Dr. Brooks explained that the objectives are to produce CFD solutions to study airloads and flowfields for debris transport; to support and improve the Debris Analysis Tools; to develop a new six-degrees-of-freedom (6-DOF) unsteady CFD debris tool; to build debris aerodynamics models; and to develop a turnkey system for Cart3D unsteady 6-DOF debris and Space Shuttle Launch Vehicle (SLV) calculations.
To this end, NASA has to supply an infrastructure to run hundreds of ascent CFD cases with the Overflow flow solver and to assist in running the cases. This involves automating the pre- and post-processing, for which all NAS computers are used. Support for the Debris Transport analysis tool covers computing the ballistic trajectory of debris through the steady-state CFD ascent flowfield, producing a collision detection code, and post-processing the results. Furthermore, the engineers have to develop aerodynamic models of debris pieces using the Cart3D Cartesian unstructured flow solver.
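The idea of a ballistic debris trajectory through a steady flowfield can be illustrated with a very small sketch. This is not NASA's Debris Transport tool: it assumes a point mass with a constant drag coefficient, replaces the CFD flowfield with a single uniform flow velocity, neglects gravity, and uses a simple explicit time step. The function name `integrate_debris` and all parameter values are hypothetical.

```python
import math


def integrate_debris(mass, area, cd, rho, flow_velocity,
                     x0=(0.0, 0.0), v0=(0.0, 0.0),
                     dt=1e-4, t_end=0.05):
    """Integrate a drag-only point-mass trajectory in a uniform 2-D flow.

    mass [kg], area [m^2], cd [-], rho [kg/m^3], velocities [m/s].
    Returns a list of (t, x, y, vx, vy) samples.
    ASSUMPTION: a uniform flow stands in for the CFD ascent flowfield.
    """
    x, y = x0
    vx, vy = v0
    ux, uy = flow_velocity
    k = 0.5 * rho * cd * area / mass  # drag factor, units 1/m
    steps = int(round(t_end / dt))
    path = [(0.0, x, y, vx, vy)]
    for i in range(1, steps + 1):
        # Velocity of the flow relative to the debris piece.
        rx, ry = ux - vx, uy - vy
        speed = math.hypot(rx, ry)
        # Drag acceleration acts along the relative velocity.
        ax, ay = k * speed * rx, k * speed * ry
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
        path.append((i * dt, x, y, vx, vy))
    return path


if __name__ == "__main__":
    # Hypothetical debris piece released at rest in a 700 m/s flow.
    path = integrate_debris(mass=0.05, area=0.01, cd=1.0, rho=0.3,
                            flow_velocity=(700.0, 0.0))
    t, x, y, vx, vy = path[-1]
    print(f"t={t:.3f} s  x={x:.2f} m  vx={vx:.1f} m/s")
```

A production tool would interpolate the local flow velocity and density from the CFD solution at every step and test each new position against the vehicle surface, which is where the collision detection comes in.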
The accomplishments in the programme so far are as follows. The NAS systems have computed over 100 Overflow steady-state cases. There are significant improvements to the debris-trajectory computations. Engineers have computed hundreds of unsteady 6-DOF debris flights and are now planning experimental work for validation, as Dr. Brooks explained.
A further objective in the Return to Flight programme was to demonstrate a new rapid aerothermal CFD analysis capability. Dr. Brooks stated that the new capability will permit near-real-time analysis of observed Orbiter damage during flight. As such, it would provide an alternative high-fidelity evaluation of the local heating bump factors calculated from engineering codes. The required capability covers 10 damage sites at 10 trajectory points.
In this demonstration, eight 512-processor nodes of the Columbia supercomputer were used for 24 hours. Twelve different computational meshes were generated for 10 different damage/repair sites. Ten trajectory points were calculated for each damage/repair site. More than 100 high-fidelity Navier-Stokes calculations were performed.
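The scale of this demonstration follows directly from the figures quoted. The sketch below simply multiplies them out. The even split of processor-hours across cases is an assumption made for illustration; since the talk reports "more than 100" calculations, the per-case figure is an upper bound rather than a measured cost.

```python
# Figures quoted in the talk.
DAMAGE_SITES = 10
TRAJECTORY_POINTS = 10
NODES_USED = 8
PROCESSORS_PER_NODE = 512
WALL_CLOCK_HOURS = 24


def campaign_summary(sites=DAMAGE_SITES, points=TRAJECTORY_POINTS,
                     nodes=NODES_USED, per_node=PROCESSORS_PER_NODE,
                     hours=WALL_CLOCK_HOURS):
    """Multiply out the size of the rapid aerothermal analysis run."""
    cases = sites * points             # one case per site and trajectory point
    processors = nodes * per_node
    processor_hours = processors * hours
    return {
        "cases": cases,
        "processors": processors,
        "processor_hours": processor_hours,
        # ASSUMPTION: processor-hours shared evenly across the cases.
        "processor_hours_per_case": processor_hours / cases,
    }


if __name__ == "__main__":
    for name, value in campaign_summary().items():
        print(f"{name}: {value}")
    # cases: 100
    # processors: 4096
    # processor_hours: 98304
    # processor_hours_per_case: 983.04
```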
Some unsteadiness was observed in some tile cavity CFD solutions, and some pre- and post-processing improvements were needed, but on the whole the rapid aerothermal analysis demonstration was a success.
Dr. Brooks ended by telling the audience that the NASA Columbia system provides a new level of computing capability and capacity that enables mission impact. The Columbia system represents a unique partnership between government and industry, including SGI, Intel, and Voltaire, to build one of the world's largest supercomputers. The Columbia system, interconnected via a high-speed InfiniBand fabric and running the Linux operating system, overtook the Japanese Earth Simulator in terms of sustained benchmark performance (51.9 TFlops) and helped put the United States back on the high-end computing technology leadership track.
Furthermore, Columbia increased NASA's science and engineering modelling and simulation capability by a factor of 10. The system was about one-tenth the cost of the Japanese Earth Simulator, required significantly less infrastructure to house, and was installed and operational nearly 10 times faster. The success of Columbia has drawn the attention of other United States government entities, academia, industry, and the international community. |