TGen selected the SGI Altix 4700 system, with over half a terabyte of shared memory, so that researchers at the Phoenix, Arizona, institute can search across multiple chromosomes entirely in memory without breaking the problem into smaller pieces. This lets them look at the whole instead of the sum of the parts.
While custom in-house code will be written for these large data searches, TGen reports that benchmarks were run on three applications: the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST), an algorithm for comparing primary biological sequence information; ClustalW, a general-purpose multiple sequence alignment programme for DNA or proteins; and NAMD, a parallel molecular dynamics code for large biomolecular systems. Testing showed performance improvements of up to 50 percent on the 64-bit SGI Altix system compared with TGen's existing 32-bit system architecture.
"Technology, and micro-arrays specifically, have allowed the size of bioinformatics data sets to become so large that it became vital to acquire a large memory, 64-bit computational infrastructure to be able to manipulate and analyse those files at the level our researchers required", stated James Lowey, Director of High Performance Bio-Computing at the Translational Genomics Research Institute. "Conventional 32-bit computer architectures cannot address memory above 4GB. This limitation poses sub-optimal analytical approaches due to the prohibitively protracted computer analysis time needed for optimal mathematical models and computational algorithms."
The molecular profile datasets being analysed at TGen cover malignant myeloma, melanoma, Alzheimer's, autism and pancreatic, prostate, colon, and breast cancers. With micro-arrays, each individual file can be up to 150 MB. To search and compare genetic patterns, researchers take hundreds of copies of such a file. TGen's code will harness the compute power of the SGI Altix system's global shared memory to run micro-array analyses, searching for variations as minute as a change on one protein and trying to determine what effect that change has across the entire spectrum of what is being observed.
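As a rough illustration of why file sizes like these outgrow a 32-bit address space, the arithmetic can be sketched as follows. This is a back-of-envelope calculation, not TGen's code: it uses the 150 MB per-file figure and the 576 GB memory size reported for this system, assumes decimal units (1 MB = 10^6 bytes), ignores operating system and application overhead, and takes 300 files as a hypothetical stand-in for "hundreds".

```python
# Back-of-envelope memory arithmetic using figures quoted in the article.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).
MB = 10**6
GB = 10**9

ADDRESS_LIMIT_32BIT = 2**32   # bytes addressable with 32-bit addresses (4 GiB)
FILE_SIZE = 150 * MB          # upper bound per micro-array file (from the article)
ALTIX_MEMORY = 576 * GB       # memory of the Altix 4700 at TGen (from the article)


def files_that_fit(capacity_bytes, file_bytes=FILE_SIZE):
    """Return how many whole files of file_bytes fit in capacity_bytes."""
    return capacity_bytes // file_bytes


# Hypothetical workload: the article says "hundreds"; 300 is an assumed example.
EXAMPLE_FILE_COUNT = 300
workload = EXAMPLE_FILE_COUNT * FILE_SIZE

print("Files that fit in a 32-bit address space:", files_that_fit(ADDRESS_LIMIT_32BIT))
print("Files that fit in 576 GB:", files_that_fit(ALTIX_MEMORY))
print("Example workload exceeds the 32-bit limit:", workload > ADDRESS_LIMIT_32BIT)
print("Example workload fits in 576 GB:", workload <= ALTIX_MEMORY)
```

Under these assumptions, fewer than thirty maximum-size files fit within a 32-bit address space, while several thousand fit in the shared memory of the new system.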
"The success of TGen scientists to date has come at the sacrifice of time", stated Dr. Edward Suh, CIO of TGen. "However, individuals affected with the disease do not have the luxury of time. The 64-bit SGI computing system will optimize TGen researchers' ability to meet their data analysis needs efficiently, hopefully leading to timely and effective discovery for improved human health."
The system will be housed at Arizona State University (ASU) in Tempe, Arizona, with operational support provided by ASU's Fulton High Performance Computing Initiative (HPCI). "The shared memory capabilities of the new SGI system provide a welcome addition to the HPC portfolio in Arizona, and will enable researchers at TGen and their collaborators at ASU to address problems we've been unable to tackle in the past", stated Dan Stanzione, the Director of the Fulton HPCI.
Purchased through James River Technical Inc., SGI's designated partner for higher education and research, and installed in late May 2007, the SGI Altix 4700 system at TGen is equipped with 576 GB of memory and 48 Intel Itanium 2 cores.
"We are very excited about the award of the NIH grant to TGen, and the subsequent installation of the SGI Altix system", stated Tom Mountcastle, President of James River Technical (JRT). "Genomics is a strategic focus for JRT and SGI and the management and processing of large data sets and work flow are ideal fits for the Altix 4700. We are pleased that the system has met the expectations of TGen and we are proud to be a part of the support infrastructure to facilitate this very important research that affects all of our lives."
"TGen's use of SGI technology is another example of SGI's ability to deliver solutions for demanding compute and data-intensive bioscience work flows", stated Deepak Thakkar, biosciences segment manager for SGI. "The Altix system's scalability, flexibility and reliability, coupled with its interoperability, provide the best combination of compute, memory and I/O elements, matching the diverse needs of the TGen lab environment."
More SGI bioinformatics news can be found in this VMW issue's article "SGI introduces the SGI BioCluster life sciences work flow solution".