In June of last year, the international consortium of the Human Genome Project completed a map of the human genome sequence. This important milestone is, however, just the first step in identifying the genes hidden in the sequence and understanding their function, a process known as annotation. Through a novel combination of data resources, and with the computational power provided by OSC, researchers at OSU have identified thousands of genes and obtained clues to their function. LabBook's genomic discovery system displays this data in an intuitive and interactive environment, enabling the researcher to extract meaning from the sequence.
The annotation project at OSU was directed by Dr. Bo Yuan, Head of Bio-informatics in the Human Cancer Genetics Programme. "We have combined accurate, non-fragmented and non-redundant whole genome mapping of expressed genes with comprehensive annotation", stated Dr. Yuan. "Now we can truly perform hypothesis-driven queries of the human genome, which was not previously possible."
"Annotation is what makes the genome useful", agreed Dr. Fred Wright of Human Cancer Genetics. "We have drawn on several unique resources to discover the genes and how they work. The OSC contribution was critical, as this great job would have literally taken years without their computing power while LabBook's discovery system enables researchers to visualise this information in a beautifully integrated environment." Dr. Wright expects that the LabBook software will accelerate efforts underway at OSU to identify genes involved in numerous diseases.
The process of annotation is computationally intensive, involving millions of automated searches and comparisons along the 3-billion-letter code of the human DNA sequence. The necessary hardware is beyond the capabilities of most university or industry laboratories. The supercomputing cluster machines at OSC proved ideal for the job. "The generation of this database required a tremendous amount of computer power", said Al Stutz of OSC. "We were fortunate to have the Silicon Graphics 1400 cluster which provided us with this capability."
While genomic annotation provides a solid basis for genetic research, its usefulness remains limited without powerful software to perform queries and visualise the results. The Human Genome Project has been accompanied by an explosion of data on human genes, along with modern high-throughput technologies for analysing their functions. This expanding collection of information for drug discovery holds the key to treatments for a broad range of human diseases, but the data sources are dispersed and of limited use to biologists who are not trained in bio-informatics.
Utilising the data effectively requires integrating these disparate data types in a unified Web environment. To solve this problem, OSU, LabBook, and OSC are providing the OSU Human Genome Database within LabBook's genomic discovery space. "The OSU Human Genome Database and the way we deliver it to the scientist represent an important advance in extracting meaningful information from human genome data", stated Dr. Shawn Green, CEO of LabBook. "This meticulously assembled, comprehensively analysed, extensively annotated, and carefully integrated research platform enables utilisation of human genomic data in ways that have not been possible until now."
LabBook's genomic discovery space derives its power from the synthesis of four complementary information technologies: an exhaustively analysed and comprehensively annotated human gene database; a uniquely effective, discovery-based query engine; a dynamic XML-based data visualisation interface through LabBook's Genomic XML Browser; and a functionally integrated Web-based information management system, the eLabBook. The browser maximises the value of bio-informatics information by delivering it as "live", reusable documents in a highly visual and interactive discovery environment.
The combination of an open XML standard for bio-informatics with a biology-smart browser creates the ideal bio-informatics "front end", which enables dynamic integration and annotation of diverse data. "To fully exploit the data derived from the sequencing of the human genome for the advancement of drug discovery, the data has to be accessible to virtually all biologists and not just the bio-informatics specialist", stated Dr. Adel Mikhail, LabBook's Vice President of Corporate Development. "Our approach is to integrate and simplify bio-informatic information so that all interested biologists can effectively utilise this information."
LabBook Inc. is an XML-powered life science informatics and information provider for the biotechnology, pharmaceutical, and academic life science researcher. LabBook's enabling software, such as the LabBook Genomic XML Browser, queries, manages, and visualises heterogeneous genomic data types while retaining their underlying associations, and then intelligently communicates targeted information to the researcher. LabBook's mission is to solve the life science industry's need for rapid access to targeted data and new discoveries. LabBook Inc. offices are located in McLean, Virginia and Columbus, Ohio.
A second gene identification project has been launched by the United States Department of Energy. For the detailed story, we refer you to this month's VMW article "Sandia, Celera and Compaq aim at petacruncher level in proteomics research".