Database of gene sequences expressed in the human genome goes public

Santa Fe, 12 March 1998. Scientists trying to establish a unified view of all the genes to be discovered in the human genome, and of their possible relationship to disease, can now benefit from a publicly available tool called STACK. The National Center for Genome Resources (NCGR) and the South African National Bioinformatics Institute (SANBI), based at the University of the Western Cape near Cape Town, have worked closely together to launch the Sequence Tag Alignment and Consensus Knowledgebase (STACK), a DNA sequence database. STACK is being made publicly available through NCGR's Genome Sequence DataBase (GSDB), which is hosted in Santa Fe. Pangea Systems Company is offering a commercial version of the data.

SANBI director Winston Hide and researcher Robert Miller have designed an innovative method to process a database of publicly available human Expressed Sequence Tags (ESTs). Their colleagues have developed portable tools for use with a system that was produced at SANBI and runs on a powerful Silicon Graphics Origin2000 multiprocessor server. The system clusters the individual sequences by generating alignments and consensus sequences from them, and the result is the STACK database. According to Hide, this independent information resource for the analysis of disease gene candidates can readily be applied to questions about gene expression, gene hunting and polymorphisms.

Since only a tiny fraction of the more than 50,000 human genes has been fully sequenced so far, STACK can be of great service in processing gene fragments, in detecting their errors and, of course, in creating carefully joined sets of consensus sequences for each gene. The expressed gene sequences are organised by tissue, and each gene is represented by alignments of its expressed fragments. The algorithms applied to create the database include efficient error compensation methods in order to generate longer and more accurate consensus sequences.
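
To give a rough idea of how overlapping fragments can yield a consensus that is both longer and more accurate than any single read, here is a minimal Python sketch. It is not the SANBI method, whose algorithms are not described here. It assumes the fragments have already been aligned, so that each one comes with a known starting column, and it simply takes a majority vote per column. The function name `consensus`, the `(offset, sequence)` input format and the example sequences are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def consensus(fragments, gap="-"):
    """Majority-vote consensus over pre-aligned fragments (illustrative only).

    Each fragment is an (offset, sequence) pair, where offset is the
    alignment column at which the fragment starts. Columns covered by
    no fragment are reported as "N".
    """
    length = max(offset + len(seq) for offset, seq in fragments)
    columns = [Counter() for _ in range(length)]

    # Count how often each base is observed in every alignment column.
    for offset, seq in fragments:
        for i, base in enumerate(seq):
            if base != gap:
                columns[offset + i][base] += 1

    # Pick the most frequent base per column; an isolated error is outvoted.
    return "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )


# Hypothetical fragments; the third one carries a single error (A for T).
fragments = [
    (0, "ACGTAC"),
    (2, "GTACGGA"),
    (2, "GAACGG"),
    (5, "CGGATT"),
]
print(consensus(fragments))
```

In this toy case the erroneous base is outvoted by the two fragments that agree, and the consensus spans 11 columns although the longest fragment has only 7 bases. A real pipeline would also have to compute the alignment itself and handle ties, insertions and deletions, which this sketch leaves out.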

Database and biology experts at NCGR, a non-profit genetic services organisation, have assisted the SANBI scientists in making STACK publicly available via their Genome Sequence DataBase in Santa Fe. They have also provided custom computer tools to access, view and analyse the STACK data, so that researchers can better compare newly discovered sequences. GSDB Manager Carol Harger states that STACK has the potential to differentiate between various members of the same gene family or between alternative products of a single gene.

International as the project may be, STACK is essentially based on African technology for research into the nature of genes that came out of Africa in the first place, as Winston Hide proudly observes. The gene sequences can be viewed at the SANBI Web site. The Genome Sequence DataBase, in turn, offers specialised access to STACK with advanced query capabilities.


Leslie Versweyveld
