Put the protein pieces together with algorithms: Solving 'the mass spec data mess'

San Diego, 17 October 2008 - A new proteomics project promises to revolutionize routine blood tests, vaccine development, cancer diagnostics, and many other important areas of biomedicine. University of California San Diego (UCSD) engineers and scientists have received a five-year, $4.94 million grant from the National Center for Research Resources (NCRR), a part of the National Institutes of Health (NIH), to develop algorithms and software for deciphering all the proteins that are present in biological samples. This "proteomics" work is led by Pavel Pevzner, a UC San Diego Jacobs School of Engineering computer science professor.

The new grant will also support development of the software infrastructure required to share these cutting-edge computational mass spectrometry tools with researchers around the nation and the world. This effort will combat a global computational bottleneck that is currently holding back the field of proteomics, which by definition strives to glean biological insights from looking at all the proteins present in biological samples. While traditional tools can do some of this proteomics work, they are time-consuming and expensive, and they have contributed to the computational bottleneck.

"Unanalysed data from mass spectrometers is piling up in laboratories around the world. Our algorithms can turn much of these 'dark' data into the lists of modified proteins that researchers are looking for", stated Nuno Bandeira, the first executive director of the Center for Computational Mass Spectrometry at UCSD's Jacobs School of Engineering, which is made possible by the new grant.

A wide variety of biomedical research projects will benefit from development of these computational resources including:

  • Elucidation of cancer biomarkers
  • Extensive characterization of changes in aged cataractous lenses
  • Understanding how bacteria adjust to antibiotics and other harsh conditions
  • Addressing the need to constantly reformulate vaccines to keep them effective
  • De novo protein sequencing of antibodies and snake venoms, which has proved instrumental in drug design

UCSD bioinformatics experts have already pioneered computational methods for teasing out exactly what proteins are in biological samples such as blood and cancer tumours, and have published extensively on this work. While they have so far been able to share these tools only with close collaborators, the $4.94 million from NCRR will fund further development of the algorithms, as well as the software and computational infrastructure that will enable the researchers to offer their computational services to researchers at UCSD and around the world through open-access software platforms.

Key collaborators on the new grant are Jacobs School of Engineering computer science professors Vineet Bafna and Ingolf Krueger as well as Steven Briggs, a professor of biology at UCSD's Division of Biological Sciences. Vineet Bafna will oversee the development of algorithms for peptide identification - including modifications, proteogenomics, and protein quantification.

Ingolf Krueger will lead the team that is developing the service-oriented software architecture to enable the robust integration of proteomics research tools into a single public service. Ingolf Krueger's software expertise will allow for synergistic interactions with other available proteomics tools as well as Web services, data, and compute clusters to form a community cyberinfrastructure for proteomics research and applications. Ingolf Krueger directs the "Software & Systems Architecture & Integration" (SAINT) functional area at Calit2.

Nuno Bandeira will develop new algorithms for revealing the modified proteome and its myriad interactions. As executive director of the centre, Nuno Bandeira will also co-ordinate development of the infrastructure to easily transition research-grade software to user-friendly tools accessible to biologists worldwide.

The tools that the researchers are looking to further develop and share with the world have been, and continue to be, developed primarily at UCSD's Jacobs School of Engineering and Calit2, in collaboration with scientists from the UCSD Division of Biological Sciences, the UCSD School of Medicine, and the UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences. Nuno Bandeira, for example, did some of this work while looking for better ways to sequence the proteins in snake venom as a part of his UCSD computer science Ph.D.

The insights that have arisen from the snake venom sequencing and other work at UCSD may even change the way the pharmaceutical industry generates antibody drugs. Today's primary approach to sequencing antibodies relies on low-throughput, labour-intensive Edman degradation techniques. "We are proposing to completely replace this approach with Shotgun Protein Sequencing, which is a combination of software and experimental protocols that capitalizes on fast-developing high-throughput mass spectrometry and automatically sequences mixtures of proteins", stated Nuno Bandeira.

This ability to quickly determine antibody sequences and automatically characterize their diversity has the potential to further accelerate discovery and facilitate engineering and manufacturing processes.

Blood tests could change as well. Today's blood tests generally track just a small number of proteins, even though there are thousands of proteins in any blood sample that could provide important information about a person's health. Moreover, each of these proteins can be modified or simply cut somewhere in the middle, and uncovering these modifications provides important clues about the health of the individual.

Decoding all the proteins in a blood sample is a difficult computational puzzle that still awaits an automated solution - and UC San Diego's computational mass spectrometry experts are working to resolve this bottleneck.

The new centre will be highly interdisciplinary. Important collaborations already exist at UC San Diego, the Burnham Institute, and 16 United States universities, as well as hospitals, biotechnology companies, and foreign research institutions. Further development of robust open-access mass spectrometry software will catalyze exchanges between experimental and computational researchers in proteomics. The researchers will also develop educational activities including short courses, a seminar programme, and an annual conference.


Leslie Versweyveld
