Harnessing the web and supercomputers to track pathogens as they evolve

Columbus, 12 April 2010 - A new web-based application powered by supercomputers has the potential to inform public health decisions by visualizing genetic and evolutionary information about the spread of infectious diseases across time, geography, host animals and humans. In a journal article published in the April 2010 on-line issue of Cladistics, Daniel Janies, Ph.D., explains how Supramap was created to track the avian influenza virus (H5N1) and, more recently, to monitor the H1N1 virus. Cladistics refers to the scientific classification of living organisms, based on common ancestry, into evolutionary trees. Evolutionary trees are used by many researchers studying infectious diseases to understand the geographic and host origins of pathogens and how the pathogens change over time. Supramap puts phylogenies in a geographic context as well.

Pathogens can now be easily tracked in time and space as they evolve, an advance that could revolutionize public health and inform national security in the fight against infectious diseases. Developed by a team that includes scientists at the American Museum of Natural History, Supramap is a new, powerful, web-based application that maps genetic mutations, like those among the different strains of avian influenza, onto the globe.

"The integration of our core phylogenetic reconstruction codes with Supramap has allowed an entirely new way to view linked evolutionary and geographic information", stated Ward Wheeler, a co-author of the article and curator-in-charge of scientific computing at the American Museum of Natural History (AMNH). "The Supramap tool set has broad utility not only in tracking human disease in time and space, but also in historical patterns of biodiversity and global biotic changes."

Daniel Janies, an associate professor of Biomedical Informatics at the Ohio State University (OSU), Ward Wheeler and several colleagues created Supramap to calculate and project evolutionary trees in on-line geographic information systems, such as Google Earth. The resulting visualizations have been described as "weather maps for disease" that allow public health officials to see when and where pathogens spread, jump from animals to humans and evolve to resist drugs.

"Currently, we are investigating H1N1 cases from around the world - and Ohio - by building evolutionary trees that discover how this strain came to be assembled and jumped from animals to humans. We are also monitoring specific viral genes for mutations that confer resistance to drugs", stated Daniel Janies, an expert in computational genomics. "Using parallel programming on high performance computing systems at the Ohio Supercomputer Center (OSC) greatly improves the efficiency and accuracy of our work."

Daniel Janies and his colleagues used a small cluster computer at OSU to beta-test the Supramap application, which has been developed through a grant from the Defense Advanced Research Projects Agency (DARPA). The research team also adapted the Supramap code to function smoothly on OSC's flagship IBM Cluster 1350 "Glenn" system, which features 9500 cores and 24 terabytes of memory. They are now working with the Center's staff to finish development of a web interface that will give scientists and public health officials easy Internet access to the application.

"OSC is at the forefront of emerging computational methods and their use by the next generation of health care researchers and providers - researchers like Dr. Janies", stated Ashok Krishnamurthy, interim co-executive director of OSC. "The use of high performance computing to help bring research results to community-accessible points-of-care will contribute significantly to the health and well being of people here and abroad."

Daniel Janies' vision for Supramap involves receiving a steady stream of genomic and geographic data on pathogens from sources around the planet, analyzing the raw data nightly and updating maps available to public health officials each morning. Policymakers, he believes, could then make better-informed decisions on crucial issues, such as determining global hotspots for the emergence of dangerous pathogens and identifying where and when antiviral drugs are useful or not.

"Supramap does more than put points on a map - it is tracking a pathogen's evolution", stated Daniel A. Janies, first author of the paper. "We package the tools in an easy-to-use web-based application so that you don't need a Ph.D. in evolutionary biology and computer science to understand the trajectory and transmission of a disease."

"This tool also has a lot of predictive power", stated co-author Ward Wheeler, curator in the Division of Invertebrate Zoology at the American Museum of Natural History. "If the movement of a pathogen is related to bird flyways, and those routes are shifting because of something like climate change, we can predict where the disease might logically emerge next."

In recent years, the collection of genomic sequences of the coronavirus that causes Severe Acute Respiratory Syndrome (SARS) and of various strains of the influenza A virus has become a vital part of fighting outbreaks of these infectious diseases. The initial jump of a pathogen into humans has become increasingly important to understand because of growing human-animal contact and global travel. Researchers now know, for example, that SARS has a deep evolutionary origin in bats. Another recent use of genetics, geography, and the phylogenetic trees that map the evolutionary relationships among different strains of pathogens is to predict hotspots of disease re-emergence.

Running as parallel programs on high-performance computing systems at Ohio State University and the Ohio Supercomputer Center, Supramap takes the use of genetic information in studying infectious outbreaks a step further. The application integrates genetic sequences of pathogens with geographic information so that researchers can track the spread of a disease among different hosts and follow the emergence of key mutations across time and space. With Supramap, users can submit raw genetic sequences and obtain a phylogenetic tree of strains of pathogens. The resulting tree is then projected onto the globe by Supramap and can be viewed with Google Earth. Each branch in the evolutionary tree is geo-located and time-stamped. Pop-up windows and the colour of branches show how pathogen strains mutate over space and time and infect new hosts.
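To make the idea of a geo-located, time-stamped tree concrete, the following is a minimal sketch in Python, not Supramap's actual code. It assumes a tiny hand-written tree (the node names, coordinates and dates are hypothetical) and writes each branch as a KML line with a time stamp, KML being the file format that Google Earth reads. The function name `tree_to_kml` is likewise an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical toy tree: every node knows its parent, a location and a date.
# None of these values come from the Supramap study.
NODES = {
    "root": {"parent": None, "lat": 23.1, "lon": 113.3, "when": "2005-01"},
    "A": {"parent": "root", "lat": 55.0, "lon": 82.9, "when": "2005-07"},
    "B": {"parent": "A", "lat": 30.0, "lon": 31.2, "when": "2006-02"},
}


def tree_to_kml(nodes):
    """Return a KML string with one time-stamped line per tree branch."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, node in nodes.items():
        parent = nodes.get(node["parent"])
        if parent is None:
            continue  # the root has no incoming branch to draw
        mark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(mark, f"{{{KML_NS}}}name").text = f"{node['parent']} -> {name}"
        # The branch is stamped with the date of its descendant node.
        stamp = ET.SubElement(mark, f"{{{KML_NS}}}TimeStamp")
        ET.SubElement(stamp, f"{{{KML_NS}}}when").text = node["when"]
        # KML coordinates are written as longitude,latitude,altitude.
        line = ET.SubElement(mark, f"{{{KML_NS}}}LineString")
        ET.SubElement(line, f"{{{KML_NS}}}coordinates").text = (
            f"{parent['lon']},{parent['lat']},0 {node['lon']},{node['lat']},0"
        )
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    print(tree_to_kml(NODES))
```

Saving the printed text to a file with a .kml extension should let a viewer such as Google Earth draw the branches and step through them by date. A real application would also need to infer the tree from the sequences, which this sketch leaves out.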

Daniel Janies, Ward Wheeler, and colleagues tested Supramap's capability by entering genetic and geographic data on recent isolates of avian influenza (H5N1). The diversity of viral strains from birds and mammals in China, Russia, the Middle East, Africa, and Europe is represented as they spread westward over four years. The evolutionary tree, based on 239 sequences of a specific gene, polymerase basic 2, shows that host shifts are highly correlated with a specific mutation (E627K) that allows avian viruses to adapt to mammalian hosts.
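One simple way to picture such a correlation is to walk the branches of a tree and count how often a host shift and the mutation occur on the same branch. The Python sketch below does this for a hypothetical six-node tree; the node names, hosts and residues are invented for illustration, and the tally is a plain co-occurrence count rather than the statistical method used in the article. E627K denotes a change from glutamic acid (E) to lysine (K) at position 627 of the polymerase basic 2 protein.

```python
from collections import Counter

# Hypothetical toy tree: node -> (parent, host, residue at position 627).
# These values are illustrative only and are not data from the study.
TREE = {
    "n0": (None, "avian", "E"),
    "n1": ("n0", "avian", "E"),
    "n2": ("n0", "mammal", "K"),
    "n3": ("n1", "mammal", "K"),
    "n4": ("n1", "avian", "E"),
    "n5": ("n2", "mammal", "K"),
}


def tally_branch_changes(tree):
    """Count branches by (host shifted avian->mammal, residue changed E->K)."""
    counts = Counter()
    for child, (parent, host, residue) in tree.items():
        if parent is None:
            continue  # the root has no incoming branch
        _, parent_host, parent_residue = tree[parent]
        host_shift = parent_host == "avian" and host == "mammal"
        mutation = parent_residue == "E" and residue == "K"
        counts[(host_shift, mutation)] += 1
    return counts


if __name__ == "__main__":
    for (host_shift, mutation), n in sorted(tally_branch_changes(TREE).items()):
        print(f"host shift={host_shift}, E627K={mutation}: {n} branch(es)")
```

In this toy tree every avian-to-mammal branch also carries the E-to-K change, which is the kind of pattern the sentence above describes. With real data the ancestral hosts and residues would first have to be reconstructed on the tree, a step this sketch assumes has already been done.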

"There are many efforts by governments and non-governmental organisations to encourage sharing of raw genomic information, especially for pathogens", stated Daniel Janies. "But the raw genetic information still needs interpretation, and we are sharing our know-how and even our computers so that this can happen. We aim for our tools to inform decisions about potential global hotspots for the emergence of diseases from animals and areas of drug resistance."

"Biogeography and phylogeny, or the study of evolutionary and geographic relationships among organisms, are the core areas of research in the Museum", stated Ward Wheeler. "Our expertise is now being applied to a new, practical set of research questions, the spread of disease and human health. And this can expand to get a handle on other problems like the movement of invasive species."

Daniel Janies was formally trained as a biologist, but as a result of the computational demands of the biological questions he was investigating, he began developing hardware and software. Daniel Janies led the design and construction of a computing cluster during a previous post with AMNH. He earned his doctorate in zoology at the University of Florida and his bachelor's degree in biology at the University of Michigan.

Ward Wheeler's research focuses on the systematic relationships among and within insects, crustaceans, and chelicerates, sequencing DNA and reconstructing evolutionary trees to study their evolution over 500 million years. To analyze those data, he has built some of the largest computing clusters in the world used for phylogenetic research. Ward Wheeler earned his doctorate in zoology at Harvard University and his bachelor's degree in biology at Yale College.

The authors of the article include:

  • Daniel Janies, The Ohio State University, Biomedical Informatics
  • Travis Treseder, The Ohio State University, Biomedical Informatics
  • Boyan Alexandrov, The Ohio State University, Biomedical Informatics
  • Farhat Habib, Indian Institute of Science Education and Research
  • Jennifer Chen, The Ohio State University, Biomedical Informatics
  • Renato Ferreira, Universidade Federal de Minas Gerais, Brazil, Departamento de Ciência da Computação
  • Ümit Çatalyürek, The Ohio State University, Biomedical Informatics
  • Andrés Varón, American Museum of Natural History, Division of Invertebrate Zoology
  • Ward Wheeler, American Museum of Natural History, Department of Invertebrates

The Ohio Supercomputer Center is a catalytic partner of Ohio universities and industries that provides a reliable high performance computing infrastructure for a diverse statewide/regional community. Funded by the Ohio Board of Regents, OSC promotes and stimulates computational research and education in order to act as a key enabler for the state's aspirations in advanced technology, information systems, and advanced industries.

This material is based upon work supported by, or in part by, the U.S. Army Research Laboratory and Office under grant number W911NF-05-1-0271. The Supramap project was also supported by the Defense Advanced Research Projects Agency, Ohio State University, Google.org fund of the Tides Foundation, and the American Museum of Natural History. The article titled "The Supramap project: Linking pathogen genomes with geography to fight emergent infectious diseases" can be found on-line at the Wiley InterScience website. More information on Supramap is available at the Ohio State University website.


Source: Ohio Supercomputer Center
