The new process focuses on missense mutations, single-letter changes in a gene that each alter one amino acid in the resulting protein, a tiny variation from the normal pattern. A small percentage of these genetic errors can reduce the activity of proteins that usually suppress tumours, or hyperactivate proteins that make it easier for tumours to grow, thereby allowing cancer to develop and spread. But finding these genetic offenders can be difficult.
"It's very expensive and time-consuming to test a huge number of gene mutations, trying to find the few that have a solid link to cancer", stated Rachel Karchin, an assistant professor of biomedical engineering who supervised the development of the computational sorting approach. "Our new screening system should dramatically speed up efforts to identify genetic cancer risk factors and help find new targets for cancer-fighting medications."
The new computational method is called CHASM, short for Cancer-specific High-throughput Annotation of Somatic Mutations.
Developing this system required a partnership of researchers from diverse disciplines. Rachel Karchin and doctoral student Hannah Carter drew on their skills as members of the university's Institute for Computational Medicine, which uses powerful information management and computing technologies to address important health problems, and collaborated with leading Johns Hopkins cancer and biostatistics experts from the university's School of Medicine, its Bloomberg School of Public Health and the Johns Hopkins Kimmel Cancer Center.
The team first narrowed the field of about 600 potential brain cancer culprits using a computational method that would sort these mutations into "drivers" and "passengers". Driver mutations are those that initiate or promote the growth of tumours. Passenger mutations are those that are present when a tumour forms but appear to play no role in its formation or growth. In other words, the passenger mutations are only along for the ride.
To prepare for the sorting, the researchers used a machine-learning technique in which about 50 characteristics or properties associated with cancer-causing mutations were given numerical values and programmed into the system. Rachel Karchin and Hannah Carter then employed a mathematical technique called a Random Forest classifier to help separate and rank the drivers and the passengers. In this step, 500 computational "decision trees" considered each mutation to decide whether it possessed the key characteristics associated with promoting cancer. Finally, each "tree" cast a vote: Was the mutation a driver or a passenger?
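The voting idea can be illustrated with a small Python sketch. This is not the CHASM code and not a full Random Forest: it is a simplified stand-in in which each "tree" asks a single yes-or-no question about one randomly chosen feature, after being trained on a random resample of the examples. The feature values are synthetic, and every name in it (make_mutation, train_stump, cast_vote, count_driver_votes) is hypothetical. Only the counts of 50 features and 500 trees come from the description above.

```python
import random

N_FEATURES = 50   # "about 50 characteristics" in the text
N_TREES = 500     # "500 computational decision trees" in the text

def make_mutation(rng, is_driver):
    """Synthetic feature vector: drivers score high, passengers low (made-up data)."""
    low, high = (0.5, 1.0) if is_driver else (0.0, 0.5)
    return [rng.uniform(low, high) for _ in range(N_FEATURES)]

def train_stump(rows, labels, rng):
    """Train one single-question 'tree' on a bootstrap sample and one random feature."""
    sample = [rng.randrange(len(rows)) for _ in rows]  # rows drawn with replacement
    feature = rng.randrange(N_FEATURES)                # the one feature this tree asks about
    best = None
    for i in sample:
        threshold = rows[i][feature]
        for sign in (1, -1):
            # Count sampled rows classified correctly by this question.
            correct = sum(
                int(sign * (rows[j][feature] - threshold) >= 0) == labels[j]
                for j in sample
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, sign)
    return feature, best[1], best[2]

def cast_vote(stump, row):
    """Return 1 for a 'driver' vote, 0 for a 'passenger' vote."""
    feature, threshold, sign = stump
    return int(sign * (row[feature] - threshold) >= 0)

def count_driver_votes(forest, row):
    """Total number of trees that voted 'driver' for this mutation."""
    return sum(cast_vote(stump, row) for stump in forest)

rng = random.Random(0)                     # fixed seed so the sketch is repeatable
labels = [1] * 20 + [0] * 20               # 1 = driver, 0 = passenger (synthetic)
rows = [make_mutation(rng, bool(label)) for label in labels]
forest = [train_stump(rows, labels, rng) for _ in range(N_TREES)]

likely_driver = [0.95] * N_FEATURES        # hypothetical unseen mutations
likely_passenger = [0.05] * N_FEATURES
print(count_driver_votes(forest, likely_driver), "of", N_TREES, "driver votes")
print(count_driver_votes(forest, likely_passenger), "of", N_TREES, "driver votes")
```

A real Random Forest grows deeper trees that ask many questions in sequence, but the principle of many slightly different trees each casting one vote is the same.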
"It's a little like the children's game of 'Guess Who', where you ask a series of yes or no questions to eliminate certain people until you narrow it down to a few remaining suspects", stated Hannah Carter, who earned her undergraduate and master's degrees at the University of Louisville and served as lead author of the Cancer Research paper. "In this case, the decision trees asked questions to figure out which mutations were most likely to be implicated in cancer."
The election results - such as how many driver votes a mutation received - were used to produce a ranking. The genetic errors that collected the most driver votes wound up at the top of the list. The ones with the most passenger votes were placed near the bottom. With a list like this in hand, the software developers said, cancer researchers can direct more of their time and energy to the mutations at the top of the rankings.
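The ranking step can be sketched in a few lines of Python. The article does not spell out exactly how votes were turned into a ranking, so this sketch assumes the simplest reading: sort the mutations by their raw count of driver votes. The mutation names and vote counts are invented for illustration, and rank_by_driver_votes is a hypothetical helper.

```python
N_TREES = 500  # number of voting trees described in the text

# Hypothetical tallies: mutation name -> number of "driver" votes received.
driver_votes = {
    "mutation_A": 120,
    "mutation_B": 487,
    "mutation_C": 15,
    "mutation_D": 333,
}

def rank_by_driver_votes(votes, n_trees):
    """Sort mutations from most to fewest driver votes, with the vote share."""
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    return [(name, count, count / n_trees) for name, count in ranked]

ranking = rank_by_driver_votes(driver_votes, N_TREES)
for name, count, fraction in ranking:
    print(f"{name}: {count} driver votes ({fraction:.0%} of trees)")
```

With these made-up tallies, mutation_B lands at the top of the list and mutation_C at the bottom, which is the order in which a researcher would work through them.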
Rachel Karchin and Hannah Carter plan to post their system on the web and will allow researchers worldwide to use it freely to prioritize their studies. Because different genetic characteristics are associated with different types of cancers, they said the method can easily be adapted to rank the mutations that may be linked to different forms of the disease, such as breast cancer or lung cancer.
In addition to Rachel Karchin and Hannah Carter, the Johns Hopkins co-authors of the Cancer Research paper were Sining Chen, Leyla Isik, Svitlana Tyekucheva, Victor E. Velculescu, Kenneth W. Kinzler and Bert Vogelstein. Funding for the research was provided by the National Cancer Institute, the Susan G. Komen Foundation, the Virginia and D.K. Ludwig Fund for Cancer Research and the National Institutes of Health.