Sanger Centre's scientists speed the search for the secret of life with decoding of human chromosome

Cambridge, 03 December 1999. An international team of scientists has passed a milestone by deciphering, for the first time, the complete genetic code of a human chromosome and revealing the existence of hundreds of genes previously unknown in humans. Researchers from the Wellcome Trust-funded Sanger Centre at Hinxton Hall, Keio University in Japan and US laboratories at the University of Oklahoma and Washington University, St. Louis, succeeded in writing down the 34 million "letters" which make up the entire sequence containing all the protein-coding genes of chromosome 22. This is the first human chromosome sequence to be completed. It has revealed 679 genes, and improvements in gene-finding software will allow even more to be found. The work gives scientists a real insight into the way genes are arranged along a strand of DNA and how they can be controlled, paving the way for huge advances in medical diagnosis and treatment.


The researchers who recently broke the code of chromosome 22 have invested $3.2 million in additional Compaq Alpha technology to prepare for the next major discovery. In fact, the Sanger Centre stated that without the existing clustered network of 250 Alpha systems running Compaq software, this triumph would never have been possible. Helped by the exponential growth of modern computing power, the Sanger Centre and its international collaborators have so far decoded one-third of the 3 billion base pairs of human DNA and will produce the final finished sequence by the year 2003.

"Without Compaq Computer's Alpha technology and Tru64 UNIX software, the efforts to analyse and decode chromosome 22 would not have been successful. Today, the information and biological revolutions are being merged in the life sciences business. We are stretching the boundaries of computational science", said Richard Durbin, Assistant Director at the Sanger Centre.

From the start, the Sanger Centre's architectural concept embraced Compaq workstations, server-based compute farms, and system-independent storage. "We required a scalable system that could run continually and handle faults without crashing or interrupting our calculations", said Phil Butcher, Head of Information Technology at the Sanger Centre. "We wanted to be able to create loosely coupled, logical clusters to distribute the application load."

To handle the burgeoning quantities of data, the Sanger Centre has built a storage architecture that is hierarchical and system-independent. "We constructed one of the first storage area networks (SANs)", said Mr. Butcher. "We wanted to make sure that all storage and computing elements were separate on the network in order to be able to scale them both independently. Network storage enables Sanger to add modular elements as needed."

Compaq StorageWorks RAID systems make up the bulk of the 6 terabytes of disk storage, alongside 300-gigabyte Network Appliances RAID subsystems. "Unlike the smaller RAID system which is bounded and has only single components, the Compaq StorageWorks systems are expandable and consist of dual heads, controllers and power supplies. In this way, there is no single point of failure", Mr. Butcher added.

Compaq's technology is scalable and flexible enough to be upgraded continually to meet the Sanger Centre's increasing need for computing power. For instance, the mapping and sequencing of the Sanger Centre's part of the Human Genome Project, which comprises one third of the entire map, currently uses some 3.5 terabytes of storage annually. It should be kept in mind that chromosome 22 is one of the smallest human chromosomes, so the storage required to generate the entire map will be an order of magnitude greater.

Previous research has already revealed that chromosome 22 is implicated in the workings of the immune system, congenital heart disease, schizophrenia, mental retardation and several cancers including leukaemia. The availability of the DNA sequence will revolutionise future research on these diseases and on the function of other genes on chromosome 22. The next ambitious task is to decode the remaining 2 billion base pairs of DNA which comprise the rest of the human genome, as well as to map the other twenty-two human chromosomes. If you know the context of the genes, you can find out if there are particular regions of chromosomes associated with a disease. It has also now become possible to compare the whole genomes of different organisms.

Gerard van de Aast, Vice President of the Enterprise Solutions & Services Group (ESSG) at Compaq, commented that bioinformatics has leapt forward as much because of advances in computing as because of excellence in scientific techniques. Compaq's Alpha technology is scalable, powerful and reliable, and it is the natural choice of scientists and corporations that need competitive, cutting-edge technology. Mr. van de Aast states that Compaq is proud to have played such a vital role in what has been hailed as the discovery of the century, and the company is even prouder to have been selected as the platform for the analysis and decoding of the next set of data. The new purchase involves Compaq AlphaServer ES40 and DS10 systems.

Leslie Versweyveld
