In search of genes with AlmereGrid

Almere 27 September 2006 During the official launch event of the Dutch AlmereGrid project, Danielle Posthuma of the Free University of Amsterdam gave a keynote lecture on her genetic marker research. Her work studies twins to determine how differences in genetic structure underlie differences in height, eye colour, sports participation, smoking behaviour, and other physical and social traits among individuals. This type of research is supported intensively by the computations performed within the AlmereGrid infrastructure.


Danielle Posthuma told the audience that physical and behavioural differences between humans arise from two sources: environmental and genetic differences. By environmental factors she means the influence of friends and acquaintances, socio-economic class, traumatic experiences, diseases and accidents, diet, and related aspects. Her research focuses on the genetic side: the human genome, organised in 23 pairs of chromosomes, which constitutes the human building blocks. The Human Genome Project, completed in 2003, mapped its roughly 3 billion (3 × 10^9) base pairs.

A closer look at the human genome shows that a gene encodes a protein with a function in, for instance, the bones, the retina, the muscles or the brain. Such a protein influences a person's height, eye colour, smoking behaviour or sports participation. Different variants of a gene produce different proteins, and this causes the differences in those traits, the speaker explained.

Many human traits are hereditary. Differences in height are 90 percent attributable to differences in genes. For eye colour the figure is 99 percent, for sports participation 62 percent, and for smoking behaviour 70 percent.

How can the genes be tracked down? Danielle Posthuma noted that the human genome is divided into so-called genetic markers. Most people differ from one another at these markers. When people who are similar at a specific marker also resemble each other in a specific trait, that marker lies in the neighbourhood of the gene responsible for the trait.

Danielle Posthuma stated that there are approximately 3,500 genetic markers, so running the algorithms takes a great deal of computing time. In addition, each analysis has to be repeated some 1,000 times, each time on a slightly different dataset, in order to rule out chance hits.
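
The lecture report does not say how the dataset is varied between runs. One common way to rule out chance hits is a permutation test, in which the trait values are shuffled over the individuals many times and the real association is compared with the shuffled ones. The sketch below illustrates that idea only; the function name, the toy association measure and the 0/1 coding of the marker are assumptions made for the example, not the method used on AlmereGrid.

```python
import random


def permutation_p_value(marker, trait, n_runs=1000, seed=0):
    """Estimate how often a shuffled dataset gives an association at least
    as strong as the observed one (hypothetical helper, toy statistic)."""
    rng = random.Random(seed)

    def association(m, t):
        # Toy measure: absolute difference in mean trait value between
        # individuals with marker value 1 and those with marker value 0.
        with_variant = [x for x, k in zip(t, m) if k == 1]
        without_variant = [x for x, k in zip(t, m) if k == 0]
        return abs(sum(with_variant) / len(with_variant)
                   - sum(without_variant) / len(without_variant))

    observed = association(marker, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_runs):
        # Each run uses a slightly different dataset: same values,
        # reassigned at random to the individuals.
        rng.shuffle(shuffled)
        if association(marker, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_runs + 1)


# Invented data: 10 individuals without and 10 with the marker variant.
marker = [0] * 10 + [1] * 10
trait = list(range(1, 11)) + list(range(11, 21))
p_value = permutation_p_value(marker, trait)
print(f"empirical p-value: {p_value:.4f}")
```

A small value means that shuffled datasets rarely reproduce the observed association, so the hit is unlikely to be accidental. Repeating this for every marker is what multiplies the computing time by the number of runs.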

On a single PC, the analysis of one marker takes approximately 10 seconds. For 3,500 markers this comes to 35,000 seconds, or 9.7 hours. If the analysis is run 1,000 times, the total is 9,700 hours, which comes down to 404 days or more than one year, as Danielle Posthuma worked out before the audience.

This is where Grid computing comes in. If 1,000 analyses of 3,500 markers take more than a year on a single PC, spreading the work over 100 PCs cuts this to about 4 days, and over 1,000 PCs to only 9.7 hours. Grid computing substantially reduces the computing time, so breakthroughs in genetic research can be made much faster with the help of AlmereGrid.
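
The arithmetic from the lecture can be reproduced in a few lines. This is a minimal sketch that uses the figures quoted above and assumes, as the talk implicitly does, that the work divides perfectly over the PCs with no scheduling or data-transfer overhead; the constant and function names are chosen for the example.

```python
SECONDS_PER_MARKER = 10   # analysis time for one marker on one PC
N_MARKERS = 3500          # approximate number of genetic markers
N_RUNS = 1000             # repetitions on slightly different datasets


def wall_clock_hours(n_pcs):
    """Hours needed when the work is split evenly over n_pcs machines
    (assumes ideal scaling, which a real grid will not fully reach)."""
    total_seconds = SECONDS_PER_MARKER * N_MARKERS * N_RUNS
    return total_seconds / n_pcs / 3600


for n_pcs in (1, 100, 1000):
    hours = wall_clock_hours(n_pcs)
    print(f"{n_pcs:>5} PCs: {hours:8.1f} hours ({hours / 24:6.1f} days)")
```

Without intermediate rounding the single-PC total is about 9,722 hours, or roughly 405 days; the 9,700 hours and 404 days quoted in the lecture follow from rounding one run to 9.7 hours first.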

Danielle Posthuma saw a bright future for further genetic analyses in areas such as intelligence, smoking, sports participation, cardiovascular risk factors, body weight, anxiety and depression, and related fields. Without the use of Grid computing, this type of analysis is very difficult to perform, she stated.

Leslie Versweyveld
