In a study that appears in the online edition of Science, the researchers combined chemical and computer analyses to examine the non-coding regions of the genome for areas likely to play a key role in human biological function. To do this, they developed a method that incorporates information about the structure of DNA to compare genome sequences from humans and 36 mammalian species, including the mouse, chimpanzee, elephant and rabbit.
By examining the shapes, grooves, turns and bumps of the DNA that makes up the human genome, the team discovered that 12 percent of the human genome appears to be constrained by evolution. That is double the 6 percent detected by simply comparing the linear order of DNA nucleotides: A, T, G and C, the familiar letters that make up the genome. The increase stems from finding DNA sequences that differ in the order of nucleotides but have very similar topographical shapes, and so may perform similar functions.
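The idea that two stretches of DNA can differ in their letters yet resemble each other in shape can be illustrated with a small sketch. This is not the researchers' algorithm: the two sequences and their per-position shape values below are invented for illustration, and the function names `sequence_identity` and `profile_correlation` are hypothetical. The sketch simply compares letter-by-letter identity with the correlation between two numeric shape profiles.

```python
from math import sqrt


def sequence_identity(a: str, b: str) -> float:
    """Fraction of aligned positions that carry the same nucleotide."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def profile_correlation(p: list[float], q: list[float]) -> float:
    """Pearson correlation between two equal-length numeric profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    n = len(p)
    mean_p = sum(p) / n
    mean_q = sum(q) / n
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(p, q))
    var_p = sum((x - mean_p) ** 2 for x in p)
    var_q = sum((y - mean_q) ** 2 for y in q)
    if var_p == 0 or var_q == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_p * var_q)


# Invented example data: these are NOT measured values.
seq_a = "ACGTTGCA"
seq_b = "TCGATGGA"
shape_a = [0.2, 0.5, 0.9, 0.4, 0.3, 0.8, 0.6, 0.1]
shape_b = [0.3, 0.5, 0.8, 0.4, 0.2, 0.9, 0.6, 0.2]

identity = sequence_identity(seq_a, seq_b)
shape_similarity = profile_correlation(shape_a, shape_b)
print(f"sequence identity: {identity:.3f}")
print(f"shape correlation: {shape_similarity:.3f}")
```

In this made-up case the two sequences match at only five of eight positions, while their shape profiles track each other closely. That is the kind of pattern a sequence-only comparison would miss.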
They went on to show that the topographically-informed constrained regions correlate with functional non-coding elements better than constrained regions identified by nucleotide sequence alone. Structural features that have been preserved across many species are likely to play important roles in how the human body functions, while those that have changed over the course of evolution may play a less central role or no role at all.
"By considering the three-dimensional structure of DNA, you can better explain the biology of the genome", stated Thomas D. Tullius, Boston University professor of chemistry who has spent more than 20 years developing ways to map the structure of the human genome. "For this achievement Stephen Parker, a Boston University graduate student, deserves much of the credit for his development of the algorithm that incorporated DNA structure into evolutionary analysis."
Bringing a molecular biologist's point of view and expertise in comparing the genomes of different species was Elliott Margulies, an investigator at the National Human Genome Research Institute's (NHGRI) Genomic Technology Branch. "Proteins that influence biological function by binding to DNA recognize more than just the sequence of bases", he stated. "These binding proteins also see the surface of the DNA molecule and are looking for a shape that allows a lock-and-key fit."
In their Science paper, the researchers also explored how small genetic changes, or variations, known as single nucleotide polymorphisms (SNPs) could prompt structural changes that might lead to disease. Using a database of 734 non-coding SNPs associated with diseases such as cystic fibrosis, Alzheimer's disease and heart disease, they found that disease-associated SNPs produced larger changes in the shape of DNA than SNPs not associated with a disease.
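One simple way to picture "a larger change in the shape of DNA" is as a distance between the shape profile of the reference sequence and that of the variant. The sketch below assumes such profiles are already available as lists of numbers. The values are invented, the function name `shape_change` is hypothetical, and Euclidean distance is chosen here for illustration rather than taken from the paper.

```python
from math import sqrt


def shape_change(reference: list[float], variant: list[float]) -> float:
    """Euclidean distance between reference and variant shape profiles."""
    if len(reference) != len(variant):
        raise ValueError("profiles must have the same length")
    return sqrt(sum((r - v) ** 2 for r, v in zip(reference, variant)))


# Invented profiles for a short window around a SNP: NOT measured values.
reference_profile = [0.4, 0.5, 0.6, 0.5, 0.4]
variant_small = [0.4, 0.5, 0.7, 0.5, 0.4]  # shape barely moves
variant_large = [0.4, 0.8, 0.1, 0.9, 0.4]  # shape shifts at several positions

small = shape_change(reference_profile, variant_small)
large = shape_change(reference_profile, variant_large)
print(f"small-effect variant: {small:.3f}")
print(f"large-effect variant: {large:.3f}")
```

Under these assumptions, a variant of the second kind would score as producing the larger structural change.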
The many topographical levels of a chromosome. Genomic DNA is shown streaming out from a chromosome (left), progressively unfolding as chromatin, the 30-nm filament, nucleosomes, the DNA double helix, and finally the letters representing the nucleotide sequence. Although it is the molecular topography of the DNA helix that is recognized by proteins, current methods of genome analysis mostly focus on the order of nucleotides. Photo: Courtesy of NHGRI.
"This new approach is an exciting advance that will speed our efforts to identify functional elements in the genome, which is one of the major challenges facing genomic researchers today", stated NHGRI Scientific Director Eric Green, M.D., Ph.D. "Coupled with continued innovations in DNA sequencing, this topography-informed approach will expand our ongoing efforts to use genomic information to improve human health."
The sequence of the 3 billion DNA base pairs that make up the human genome holds the answers to many questions pertaining to human development, health and disease. Consequently, much research aimed at understanding the genome has focused on establishing the information encoded by the linear order of DNA bases. In the new study, however, researchers focused on how those bases chemically interact with each other to coil and fold the DNA molecule into a variety of shapes.
In 2003, an international team of researchers finished a reference sequence of the human genome, an achievement that greatly sped efforts to find genes, which make up the approximately 2 percent of the genome that codes for proteins. At one time, the remaining 98 percent of the genome was referred to as junk DNA. Researchers now know that this non-coding DNA contains elements that carry out important biological functions, such as turning genes off or on. However, little information exists about where these non-coding functional elements are located and how they work.
The new research findings on evolutionary conservation of DNA structure stem from recent progress in analysing the functional elements in a representative fraction of the human genome. That analysis, a study known as the ENCyclopedia of DNA Elements (ENCODE) and organized by the National Human Genome Research Institute, challenged the traditional view of the human genetic blueprint as a collection of independent genes. Instead, researchers found a complex network of genes, regulatory elements, and other DNA sequences that do not code for proteins.
The study determined, for the first time, where many types of functional elements are located, how they are organized, and how pervasively the genome is transcribed into RNA. The current research on genome structure and function is based on some of the ENCODE findings, noted Thomas D. Tullius, whose work in developing the new technology was funded through the ENCODE project.
In addition to Thomas D. Tullius and Elliott Margulies, the other authors of the Science paper, "Local DNA Topography Correlates with Functional Noncoding Regions of the Human Genome", are Stephen C.J. Parker and Loren Hansen, both Boston University graduate students in bioinformatics, and Hatice Ozel Abaan, a technician in Elliott Margulies' laboratory.
The NHGRI is one of the 27 institutes and centres at the National Institutes of Health (NIH). The NHGRI Division of Intramural Research develops and implements technology to understand, diagnose and treat genomic and genetic diseases.
Founded in 1839, Boston University is an internationally recognized institution of higher education and research. With more than 30,000 students, it is the fourth-largest independent university in the United States. Boston University consists of 17 colleges and schools along with a number of multi-disciplinary centres and institutes which are central to the school's research and teaching mission.