The first goal of the project is to develop a specialized search engine that finds antibody mentions in ordinary texts, not only scientific ones. Antibodies play an important role in modern pharmaceutical research, for example when labeling genes and proteins for optical particle tracking. Gene and protein names do not follow a standardized nomenclature; they developed rather organically over the years in which they were discovered. Each of the approximately 500,000 to 1,000,000 human proteins has an average of five synonyms and often hundreds of spelling variants. Furthermore, some proteins share a common name, so the context needs to be analysed in order to classify them properly. Transinsight will work closely with antibody vendor Antikörper-Online on this area of the project to develop intelligent matching algorithms and new semantic advertisement technologies.
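To make the synonym and spelling-variant problem concrete, the following is a minimal sketch of dictionary-based name matching. It is not the consortium's algorithm. The synonym table, the function names and the normalisation rule are assumptions chosen for illustration only.

```python
import re

# Hypothetical synonym table: canonical name -> known synonyms (illustrative only).
SYNONYMS = {
    "TP53": ["p53", "tumor protein 53", "tumour protein p53"],
    "ERBB2": ["HER2", "HER-2/neu"],
}


def normalize(name):
    """Collapse spelling variants: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def build_index(synonyms):
    """Map every normalised name or synonym to its canonical name."""
    index = {}
    for canonical, names in synonyms.items():
        for name in [canonical, *names]:
            index[normalize(name)] = canonical
    return index


def find_mentions(text, index, max_words=3):
    """Greedily match the longest run of up to max_words tokens against the index."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9/\-]*", text)
    mentions = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            canonical = index.get(normalize(candidate))
            if canonical:
                mentions.append((candidate, canonical))
                i += n
                break
        else:
            # No candidate starting at this token matched; move on.
            i += 1
    return mentions


index = build_index(SYNONYMS)
print(find_mentions("Antibodies against HER-2/neu and P-53 were used.", index))
```

A plain lookup like this only handles variants that differ in case, hyphens or spacing. Names shared by several proteins would still need the context analysis described above.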
The second goal is the development of a semantic platform to elucidate gene interaction networks. Transinsight, RESprotect, and TU Dresden will work together to develop methods for elucidating the activity of BVDU, a highly promising drug against pancreatic cancer that is currently in late-stage clinical trials. The end goal is to integrate all textual information with already known data to better understand the activity of the drug and optimize subsequent compounds.
"This is a great opportunity to contribute to the fight against pancreatic cancer, which is one of the most aggressive forms of cancer and, until now, very poorly understood. We will work to help provide a deeper understanding of the disease and drug mechanisms through our exclusive semantic search technologies. We hope this will ultimately result in improvements to current treatments and the development of new drugs", stated Professor Dr. Michael Schroeder from TU Dresden. Dr. Schroeder is leading the research on the identification of gene and protein names and their interactions. He explained: "Today, the precision of known methods is as low as 30 percent. Our goal is to push this number to 90 percent and offer a more practical and accurate system."
Transinsight will be spearheading the consortium for the next two years. Michael R. Alvers, CEO of Transinsight, stated: "We are proud to work with such great partners in developing technologies to advance scientific work in areas as important as antibody and pancreatic cancer research. We are confident we will set important semantic landmarks and bring the competitiveness and leading-edge position of German start-ups to international attention. The work we did under Theseus One provides an ideal base to build on and will allow us to move quickly. We expect we will soon be able to show achievements that go even beyond life sciences."
Transinsight develops knowledge-based semantic solutions in the Life Sciences. Its flagship products www.Go3R.org and www.GoPubMed.com, renowned biomedical search engines, are the first knowledge-based search systems of the next generation for the Life Sciences on the internet. In acknowledgement of the technologies developed by the company, Transinsight has repeatedly been honoured with international awards. The firm works in close collaboration with the Dresden University of Technology. Selected customers are Unilever, BASF, BfR, StatoilHydro, Wintershall, Abcam and EMBL. More company news is available in the VMW January 2009 article "At European Venture Summit Transinsight launches new version of GoPubMed and wins as one of the top 5 European biotech start-ups".
Theseus is a German research programme initiated by the Federal Ministry of Economics and Technology (BMWi). It is aimed at the development of a new internet-based knowledge infrastructure for improving the utilisation and exploitation of knowledge via the internet. The focus of the research programme is on semantic technologies that do not identify content - words, images, sounds - with the assistance of conventional processes (e.g. letter combinations), but instead recognise the meaning of the content and classify it. With this technology, computer programmes can understand in what context data should be stored. Furthermore, by applying rules and ordering principles, computers can draw logical conclusions about content and independently recognise and create correlations between various pieces of information from several sources.
The improved use and exploitation of digital knowledge - that is the aim of the Theseus Project. In the future, semantic technologies will be able to recognise the meaning of information content. "The society of the future will be even more knowledge-based than the present one. For that reason it is not only necessary to create the appropriate infrastructure, but also to ensure that existing knowledge is suitably prepared and made recognisable", explained Professor Hans-Joachim Grallert, Head of the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute HHI. In the joint project, entitled Theseus, 31 German companies, universities and research institutes have banded together with the aim of enabling the improved use and exploitation of digital knowledge. Nine Fraunhofer Institutes are taking part in this project.
The HHI is co-ordinating the development of the base technologies for Theseus. The focus here is on semantic technologies for the next generation of the internet, which will recognise the meaning of information content and be able to classify it - irrespective of whether it is words, photos, sounds, or 2D and 3D image data. With these technologies, computer programmes will be able to intelligently understand in what context data should be stored, as well as draw logical conclusions and establish correlations.
The wealth of knowledge of our society lies slumbering in libraries, broadcasting institutions, archives, museums and databases. But how can it be better utilised and made accessible to a wide audience? Researchers from the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS are working on digitising media types such as text, images and sound recordings and on connecting the data semantically - that is to say in accordance with its contextual meaning - into an innovative knowledge network. This will enable people looking for information to undertake searches more easily.
To achieve the best possible digitalisation results, HHI researchers are developing algorithms for the restoration of text and video data. These algorithms automatically and reliably detect and rectify faulty data. They can, for example, optimise yellowed pages for text recognition or clean historical film recordings of dust and scratches. In the future, new quality analysis processes will be able to automatically detect defects such as dust or scratches in images and videos and to identify quality characteristics such as sharpness and contrast. This information serves as the basis for the digital restoration of the material or for searching for high-quality content.
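As an illustration of what quality characteristics such as sharpness and contrast can mean in practice, the sketch below computes two common textbook measures on a small greyscale image. These are not the HHI algorithms. The choice of RMS contrast and Laplacian variance, the function names and the tiny test images are all assumptions made for this example.

```python
from statistics import pstdev, pvariance


def contrast(image):
    """RMS contrast: population standard deviation of all pixel intensities."""
    return pstdev([pixel for row in image for pixel in row])


def sharpness(image):
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    Strong, abrupt edges give large Laplacian responses, so a higher
    variance suggests a sharper image.
    """
    responses = [
        image[y - 1][x] + image[y + 1][x] + image[y][x - 1] + image[y][x + 1]
        - 4 * image[y][x]
        for y in range(1, len(image) - 1)
        for x in range(1, len(image[0]) - 1)
    ]
    return pvariance(responses)


# Hypothetical 4x4 greyscale images (0 = black, 255 = white).
sharp_edge = [[0, 0, 255, 255] for _ in range(4)]      # abrupt step edge
soft_ramp = [[0, 85, 170, 255] for _ in range(4)]      # same range, gradual ramp

print(contrast(sharp_edge), sharpness(sharp_edge))
print(contrast(soft_ramp), sharpness(soft_ramp))
```

Both images span the full intensity range, but only the step edge produces a non-zero Laplacian response, which is why the sharpness measure separates them.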
Up to now, searching video and photo archives has been particularly time-consuming and laborious. In the future, metadata - a type of specification of content - for multimedia data will be generated automatically and thereby simplify the search. Take images, for example: researchers are developing image recognition systems that utilise colours or structures in the image - textures - to make inferences regarding the content. This would enable a computer to identify, for example, the sun, a blue ocean or geometric forms such as beach chairs, and to store the image content in metadata.
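The idea of inferring content from colours and storing it as metadata can be sketched very roughly as follows. This is a toy example, not the recognition system under development. The reference colours, labels, region names and function names are hypothetical, and real systems would also use textures and shapes.

```python
# Hypothetical reference colours (RGB) for a few content labels, illustrative only.
REFERENCE_COLOURS = {
    "sun": (250, 220, 60),
    "blue ocean": (20, 90, 180),
    "sand": (220, 200, 150),
}


def mean_colour(pixels):
    """Average RGB colour of a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def nearest_label(colour):
    """Label whose reference colour has the smallest squared RGB distance."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(colour, REFERENCE_COLOURS[label]))
    return min(REFERENCE_COLOURS, key=distance)


def describe_image(regions):
    """Turn image regions (name -> pixels) into a small metadata record."""
    labels = {nearest_label(mean_colour(pixels)) for pixels in regions.values()}
    return {"keywords": sorted(labels)}


# Hypothetical image split into two regions.
regions = {
    "top": [(255, 225, 50), (245, 215, 70)],
    "bottom": [(10, 80, 170), (30, 100, 190)],
}
print(describe_image(regions))
```

Once such keywords are stored alongside the image, a search for "blue ocean" becomes a simple metadata lookup instead of a manual review of the archive.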
The intelligent searching of images offers new possibilities in medicine. In a sub-project called Medico, experts from the Fraunhofer Institute for Computer Graphics Research (IGD), FIRST and IAIS are developing tools for the automatic statistical evaluation of medical image data such as computed tomography images. In the future, this will enable image characteristics to be matched closely to the symptoms of disease. The procedure will allow images of one patient to be compared at lightning speed with the images of thousands of other patients. This makes it easier for physicians to make a diagnosis.
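Comparing one patient's images with thousands of others can be pictured as a nearest-neighbour search over numeric image characteristics. The sketch below assumes that each image has already been reduced to a short feature vector. The feature values, patient identifiers and function name are hypothetical, and this is not the Medico implementation.

```python
import heapq
import math


def most_similar(query, archive, k=3):
    """Return the ids of the k archive entries closest to the query vector.

    archive is a list of (patient_id, feature_vector) pairs; closeness is
    measured by Euclidean distance between feature vectors.
    """
    nearest = heapq.nsmallest(k, archive, key=lambda item: math.dist(query, item[1]))
    return [patient_id for patient_id, _ in nearest]


# Hypothetical archive of image characteristics, illustrative values only.
archive = [
    ("patient-A", (0.10, 0.90)),
    ("patient-B", (0.80, 0.20)),
    ("patient-C", (0.15, 0.85)),
]
print(most_similar((0.12, 0.88), archive, k=2))
```

A linear scan like this is enough for a few thousand entries. Larger archives would typically use an index structure, which is outside the scope of this sketch.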
The IGD researchers are working on the functional graphical interface of Theseus. "The knowledge infrastructure of the future will enable the user to intuitively and quickly find and assess all the services required for a particular subject", explained Nadeem Bhatti, research assistant at the IGD. Using the graphical interface it is possible, for example, to identify all services on topics such as "Calculation of the eco-balance of products" or "medical diagnoses". The underlying data will be retrieved and processed with the assistance of Theseus technologies.