IBM supplies Indiana University with new supercomputer

Bloomington 17 October 2001 An IBM supercomputer at Indiana University (IU) has been expanded to triple the university's previous computing capacity and will support IU researchers in a broad range of areas, including life sciences, archaeology, astronomy, and computational physics. It will also serve as the backbone for a planned genomics research collaboration with IBM.

Capable of performing one trillion numerical calculations per second, the teraflop system is the university's largest high-performance computing acquisition ever. The supercomputer is part of the information technology (IT) infrastructure needed to support the Indiana Genomics (INGEN) initiative. This initiative was funded by a major grant from the Lilly Endowment.

"This agreement builds on a critical mass of intellectual capital already established through IU's outstanding faculty and our expertise in IT-based research", said IU President Myles Brand. "Our faculty and staff will participate in developing state-of-the-art IT tools and applications for life sciences research, including genomics, which will help us discover new ways of preventing and treating human disease."

The Indiana University Teraflop SP will be tightly connected to IU's massive data storage system that is capable of holding hundreds of trillions of bytes of data. But this processing capability is meaningful only when harnessed to do important work. The Teraflop SP will enhance existing research programmes and make possible new research in many disciplines, including the medical sciences. As the human genome becomes increasingly well understood, one of the critical issues will be identifying what are called Single Nucleotide Polymorphisms (SNPs) - places in the genome where different people display different genetic information. Here scientists expect to find some of the keys to understanding human genetic diseases. The IU Teraflop SP will be used to search for SNPs and to analyse clinical data en route to unraveling the relationships between genes and cancer.

"Indiana University's teraflop system lays the groundwork for IU to become a leading institution for genomic research", said Irving Wladawsky-Berger, IBM vice president, Technology & Strategy, IBM Server Group. "Our expanded collaboration will open doors to new discovery and enable both organisations to draw on complementary strengths, including IBM's extensive research expertise in computational biology and advanced IT solutions."

IU is uniquely positioned to advance life sciences research through INGEN, a collaboration of scientists and physicians who will study the information that makes up the human genome and its function in human health. INGEN combines the strength of the IU School of Medicine, research programmes in biology and chemistry, and IU's leadership in high performance computing. IBM is the primary provider of supercomputing technology for INGEN.

"The sequencing of the human genome launched a new era of research in the life sciences", said Michael McRobbie, IU vice president for information technology and chief information officer. "The unprecedented amounts of genomic data require advanced computational resources. The teraflop supercomputer is a key first component of INGEN's IT infrastructure and will provide a major boost to scientific progress at IU in this area."

The expanded IBM SP supercomputer will provide the computational and data management power required to make advances in many important areas of genomic science. Biomedical and biological sciences present a tremendous wealth of data. Supercomputers are required to analyse these massive data stores and to create the linkages among different types of data, for example, clinical records and genetic information, that will enable new breakthroughs in health care.

"The transformation of life sciences research has brought numerous challenges to the scientific community", said George Strawn, acting assistant director for computer and information science and engineering at the National Science Foundation. "NSF recognises that the data- and compute-intensive nature of the research requires instruments that US scientists working nationally and internationally can share. IU, in collaboration with IBM, is working to meet these challenges and to make advanced computational resources remotely accessible to the broader research community."

The Teraflop SP will enable types of research that are currently neither practical nor possible. It will be the first system at IU capable of holding in memory all of the information required to simulate clusters of hundreds of thousands of stars - a capability that will enable new understanding of the formation and evolution of our own Milky Way galaxy. Geologists will use the SP to study the finely detailed structure of the earth's crust. Analysing the evolutionary relationships of hundreds of organisms - work that might take five years on a personal computer - can be done in a matter of weeks on the Teraflop SP. In all instances, the SP does more than speed the pace of research: it will open new doors to scientific discovery in dozens of fields. IU researchers and UITS staff will demonstrate a few of the research applications of IU's Teraflop SP, described below.

N-body gravitational simulation: Star cluster modelling

Astronomers would like to understand the way star clusters form and develop. The equations of motion have been known since Newton's time, but they cannot be solved analytically, so we turn to simulations. A globular cluster with a hundred thousand stars has ten billion gravitational interactions to compute at each time step. The Teraflop SP will make calculations of this kind possible, allowing researchers to model the complex dynamics within evolving star clusters. Such simulations provide a basis for understanding the formation and evolution of the extraordinary X-ray emitting binary systems that are now being found in large numbers in the cores of globular star clusters by the Earth-orbiting Chandra X-ray telescope. IU Astronomy Department researchers are working with colleagues at Harvard University to study the properties of these X-ray binaries, which contain highly collapsed white dwarf and neutron stars.
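
The "ten billion interactions" figure follows from direct summation, in which every star feels the pull of every other star, so the work per time step grows as N squared. The short Python sketch below illustrates that idea only; it is not the code used at IU. The function names, the simulation units (G = 1) and the softening term are assumptions made for the example.

```python
import math

G = 1.0  # gravitational constant in simulation units (assumed for illustration)


def accelerations(positions, masses, softening=1e-3):
    """Acceleration on each body by direct pairwise summation (O(N^2))."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            # Vector from body i to body j
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            # Softened squared distance avoids division by zero in close encounters
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc


def interactions_per_step(n_stars):
    """Number of ordered pairs (i, j), i != j, evaluated in one time step."""
    return n_stars * (n_stars - 1)


if __name__ == "__main__":
    # Two unit masses one unit apart pull on each other equally and oppositely.
    print(accelerations([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0], softening=0.0))
    # A hundred-thousand-star cluster needs roughly 10^10 evaluations per step.
    print(interactions_per_step(100_000))
```

Production codes typically reduce this cost with tree or hierarchical methods, or exploit the symmetry of each pair, but the quadratic growth shown here is why a cluster of this size calls for a supercomputer.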

fastDNAml: Inferring evolutionary relationships

Many important questions in science and medicine come down to one: What are the evolutionary relationships among a group of organisms? It is now possible to infer the evolutionary relationships among organisms based on DNA sequences. However, this process takes tremendous amounts of computation. For example, analysing the evolutionary relationships of 100 animals on one microcomputer might take as long as five years. Indiana University is collaborating with researchers at other institutions to create a parallel (supercomputer) version of fastDNAml, a popular package for inferring evolutionary relationships. fastDNAml has been used at Indiana University to better understand the evolutionary origins of the Microsporidia, an economically and medically important group of parasites. A better understanding of the evolutionary origins of these disease-causing organisms will point to better methods for treating the diseases that Microsporidia cause.

The IU massive data storage system: Making terabytes accessible from desktops

Delivered using the High Performance Storage System (HPSS) software, the massive data storage system (MDSS) is intended for users with projects that need large-scale, near-line storage. Since HPSS works best with large files, optimal applications will store data in files that are typically larger than 50 MB, including large collections of high resolution aerial maps, digitized art work, sound and animation files, astronomical images, and the like. IU is the only large academic site in the world that provides a diverse user base with ubiquitous (and easy) access to nearly 200 Terabytes (1 Terabyte = 1,000 Gigabytes = 1,000,000 Megabytes) of data storage capacity in the central, tape-based, massive data storage system. IU presents two demonstrations of this system: user access via the Web to the data stored in their MDSS area, and a video clip stored in the MDSS as a large file and displayed in real time on a Windows desktop machine using the convenient Distributed File System (DFS) front-end.

XMView/CMView: Scalable molecular visualisation

The Indiana University Molecular Structure Center's (IUMSC) on-line database of molecular structures is a valuable resource for chemistry researchers and students from all over the world. Thanks to a collaboration with the UITS Advanced Visualization Laboratory, chemists now have a tool for studying these molecular structures - XMView/CMView, a scalable visualisation system for molecular chemistry. This application offers a pair of programs, each with similar functionality and a common file format, providing researchers the convenience and flexibility of working at their desktops, as well as the power, visual complexity, and ease of interaction offered by working in the CAVE. Chemists can download data from the IUMSC's Web-accessible database, grow crystal structures interactively, perform precise measurements on those structures, and then move data files to an immersive, virtual environment for more detailed examination.

IBM digital displays and advanced imaging

Indiana University has acquired one of IBM's newest advanced digital displays, the IBM T220, which has a 22.2-inch diagonal screen with 9.2 million pixels, yielding a resolution of 204 DPI (dots per inch). This resolution is so fine that some detail is visible only with a magnifying glass. Two types of images are demonstrated: satellite images and medical images. Some satellite image data are collected in such a fashion that they require considerable computing power to convert raw data to an actual image. The display shows several images of the Bloomington area, collected via satellite and converted into images on the IBM SP supercomputer. Also on display are biomedical images that demonstrate the utility these advanced displays offer to biomedical researchers and clinicians.
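
The quoted pixel density can be checked from the diagonal and the pixel count alone, provided the screen's shape is known. The Python sketch below assumes a 16:10 aspect ratio, which the article does not state, and the function name is illustrative.

```python
import math


def pixels_per_inch(pixel_count, diagonal_inches, aspect_ratio=16 / 10):
    """Estimate pixel density from total pixels, diagonal size and aspect ratio.

    aspect_ratio is width / height; 16:10 is an assumption, not a figure
    taken from the article.
    """
    # width * height = pixel_count and width = aspect_ratio * height
    height_px = math.sqrt(pixel_count / aspect_ratio)
    width_px = aspect_ratio * height_px
    # Pixels along the diagonal, by Pythagoras
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / diagonal_inches


if __name__ == "__main__":
    # 9.2 million pixels on a 22.2-inch diagonal
    print(round(pixels_per_inch(9.2e6, 22.2)))
```

Under that assumption the estimate rounds to the 204 DPI quoted above.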

BioSifter: An intelligent biological information management system

Advances in biomedical research have led to tremendous growth in the amounts of data, in a variety of formats, that biomedical researchers must store and manipulate. This has created an even more critical need for innovative information management and knowledge discovery tools that can sift through these volumes of data. Intelligent software systems that can seamlessly integrate information resources and data analysis tools will enable biomedical researchers to integrate existing information in the various subtasks of their research activities. In this research we present a general model for an information management system that is adaptable and scalable, followed by a detailed design and implementation of one component of the model. The prototype, called BioSifter, was applied to problems in the area of bioinformatics. The results show BioSifter as a powerful tool with which biological researchers can automatically retrieve relevant text documents from biological literature based on their interest profile.


Ad Emmen
