A paper describing the Cancer Genomics Browser has been published in the April issue of Nature Methods by a team based at the Jack Baskin School of Engineering at UCSC. Co-author David Haussler, professor of biomolecular engineering, said development of the browser was driven by the needs of cancer researchers, who are now using powerful technologies for genome analysis and DNA sequencing in their efforts to understand cancer at the molecular level.
"Each of these tests gives millions of measurements, and the result is a bad case of data overload", David Haussler stated. "We've built the cancer browser so that researchers can upload their data and use a variety of software tools to visualize and interpret their results."
To get a user's perspective on the browser as it took shape, Haussler's team worked closely with Dr. Laura Esserman, professor of surgery and radiology at the University of California, San Francisco (UCSF), and Marc Lenburg, associate professor of pathology and laboratory medicine at Boston University School of Medicine. Esserman and Lenburg, both co-authors of the paper, are involved in the I-SPY Trial, a multi-institutional collaboration aimed at identifying biomarkers to predict the most effective therapies for patients with advanced breast cancer.
"What is amazing about the browser is that it allows us to combine complex molecular data and clinical observations, and provides insights into how we can truly improve treatment and outcomes", stated Laura Esserman, director of the Carol Franc Buck Breast Care Center and associate director of the Breast Oncology Programme at the Helen Diller Family Comprehensive Cancer Center at UCSF.
Cancer genomics involves searching for all of the genes and mutations that contribute to the development of a cancer cell and its progression from a localized cancer to metastatic disease that spreads throughout the body. A genome is an organism's complete set of DNA, and researchers are now able to analyze the alterations that occur throughout the genome of a patient's cancer cells. Recent advances, such as microarray technology and high-throughput DNA sequencing, have made it possible to characterize tumor samples in exquisite detail.
"You can run a micro-array chip that analyses a million points in the genome and can tell you about changes in the DNA, as well as inherited variations that make a person more or less susceptible to cancer", David Haussler stated.
Many different types of genomic changes can have clinical significance, including insertions, deletions, and other changes in the DNA sequence, as well as changes in the number of copies of a gene. Moreover, microarrays and high-throughput methods for measuring proteins make it possible to see how these genomic alterations interfere with the cell's normal workings.
"The Cancer Genomics Browser is fantastic in that it helps users display many different dimensions of clinical and molecular data simultaneously", Marc Lenburg stated. "For example, for a given set of tumour biopsies, it is possible to see which regions of the genome are abnormal, how much of every gene is being expressed, how active various signaling pathways are - all organized by, say, how well each patient responded to a particular drug. As a result, the process of identifying possible connections is really easy."
The browser was developed by a team of scientists at UCSC's Center for Biomolecular Science and Engineering (CBSE), an interdisciplinary center housed in the Baskin School of Engineering and directed by Haussler. Ting Wang, a Helen Hay Whitney postdoctoral fellow, came up with the initial design of the browser and coordinated the team's efforts. The first three authors of the paper, postdoctoral researcher Jingchun Zhu and graduate students Zachary Sanborn and Stephen Benz, did much of the work involved in building the browser, with help from CBSE research scientist James Kent and others.
The public browser site hosts a growing body of publicly available cancer genomic data, and the browser is also being used on confidential, pre-publication data by several groups involved in clinical trials and cancer genomics research, Wang said. The Cancer Genomics Browser is a natural extension of the UCSC Genome Browser, a widely used platform for accessing and visualizing genomic data. Created by Kent as a tool for exploring the human genome, the UCSC Genome Browser now averages one million page requests every week. It displays data and annotations in linear tracks that parallel the DNA sequences of the dozens of genomes in the browser.
But this type of display doesn't work well with clinical data from large numbers of patients, and clinical databases don't handle genomic data very well. The Cancer Genomics Browser is able to integrate these different types of data into a single interactive display. "Large clinical trials that include detailed molecular profiling of patient samples generate a really big mountain of data. Actually, it is more like several big mountains of data," Lenburg said. "The browser creates a way of organizing all this data, and all these different types of data, into a single unified picture."
The Cancer Genomics Browser represents data as "heatmaps," in which colors represent the values of key variables. Genomic and clinical data are displayed side by side, and researchers can group and sort the data on the basis of any feature of interest, such as age, gender, response to therapy, estrogen-receptor status of breast cancers, and so on. Because humans excel at visual pattern recognition, correlations in the data tend to jump out as the user manipulates the browser display.
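To make the idea concrete, here is a minimal sketch of a heatmap sorted by a clinical feature. It is an illustration only, not the browser's actual code: the `Sample` class, the function names, the three-symbol color scale, and the toy measurements are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record: one tumor sample with clinical and genomic data."""
    sample_id: str
    clinical: dict   # e.g. {"er_status": "positive", "age": 52}
    values: list     # one genomic measurement per probe (toy scale, -1 to 1)


def color_for(value):
    """Map a measurement to a coarse color symbol (assumed three-color scale)."""
    if value > 0.3:
        return "R"   # red: signal above normal
    if value < -0.3:
        return "B"   # blue: signal below normal
    return "."       # neutral


def sort_by_feature(samples, feature):
    """Order samples by one clinical feature; ties keep their input order."""
    return sorted(samples, key=lambda s: s.clinical[feature])


def render_heatmap(samples, feature):
    """Build one text row per sample: clinical value beside color-coded data."""
    rows = []
    for s in sort_by_feature(samples, feature):
        cells = "".join(color_for(v) for v in s.values)
        rows.append(f"{s.sample_id} {s.clinical[feature]:<9}{cells}")
    return rows


# Toy data, invented for illustration only.
samples = [
    Sample("T1", {"er_status": "positive", "age": 61}, [0.8, 0.1, -0.5]),
    Sample("T2", {"er_status": "negative", "age": 47}, [-0.6, 0.0, 0.9]),
    Sample("T3", {"er_status": "positive", "age": 52}, [0.7, 0.2, -0.4]),
]

for row in render_heatmap(samples, "er_status"):
    print(row)
```

Sorting on `"er_status"` places the two estrogen-receptor-positive samples in adjacent rows, so their similar color patterns sit next to each other. Passing `"age"` instead reorders the same rows by age.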
"The ideas behind it are simple, but the result is a pretty powerful tool. It makes it a lot easier to see patterns in the data", Ting Wang stated. Standard statistical tools are integrated into the browser so that users can perform quantitative analyses. The browser's developers hope to improve these capabilities in the future. "Now that we have the platform, we want to incorporate state-of-the-art algorithms to get the most out of the data", Ting Wang stated.
In developing the browser, the researchers used pre-publication datasets from the I-SPY Trial (Investigation of Serial Studies to Predict Your Therapeutic Response with Imaging and Molecular Analysis) and The Cancer Genome Atlas (TCGA). The I-SPY study is funded by the National Cancer Institute (NCI) and includes nine cancer centers nationwide. TCGA is a large-scale collaborative effort by NCI and the National Human Genome Research Institute to systematically characterize the genomic changes that occur in cancer. The UCSC team is also working with a related worldwide effort, the International Cancer Genome Consortium.
The co-authors of the Nature Methods paper include UCSC researchers Christopher Szeto, Fan Hsu, Robert Kuhn, Donna Karolchik, and John Archie, in addition to Jingchun Zhu, Zachary Sanborn, Stephen Benz, Marc Lenburg, Laura Esserman, James Kent, David Haussler, and Ting Wang. Funding for this project was provided by the I-SPY consortium, the TCGA consortium, the California Institute for Quantitative Biosciences (QB3), and the National Institutes of Health. David Haussler is a Howard Hughes Medical Institute investigator. The Cancer Genomics Browser is available at the University of California, Santa Cruz website.