In oncology, as in other medical specialities, experts see a pressing need for universal standards for the classification and description of tumour diseases. A common data dictionary would make it much easier for doctors to apply these standards consistently in a variety of contexts. According to the scientific team from Gießen, two common standards for oncological documentation currently exist in Germany. The first was defined for the Association of German Cancer Centres and is called "Basisdokumentation für Tumorkranke". The second is an organ-specific data set, referred to as "Organspezifische Tumordokumentation".
The latter is based on official international coding systems: the International Classification of Diseases for Oncology (ICD-O) for topography and morphology, the TNM Classification of Malignant Tumours, and classifications for acute and chronic side effects. A dictionary of this type requires the integration of existing documents, which are first decomposed hierarchically so that they can be presented to the user in a uniform way, in alignment with real patient data models. Full-length definitions and descriptions are vitally important in this speciality, where coding and classification are often a matter of teamwork and comparison.
Because of these specific problems, the team from the Institute of Medical Informatics was attracted by the considerable potential for information structuring offered by the new Extensible Markup Language (XML). XML is already being used to transmit hierarchically structured patient records via the Internet. Yet a few difficult hurdles remain. The team is confronted with a large number of regional variants in the terminology, while the official acceptance of new terms lags behind because of the slow reviewing and publication process. A somewhat different problem is the large amount of descriptive and definitional text, which in turn refers to other codes and items.
The multiple references make it almost necessary to introduce additional elements, such as ones that store formatting data. In addition, there are to date no standards for streamlining XML-based patient records, for instance with regard to the specific semantics of a medical domain. This also means that different user systems have to be able to retrieve and interpret data from multiple sources. In response, the team has designed a three-tier framework consisting of a reference model of tumour disease documentation, a dictionary of data items, and a library of documents and document components.
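The three tiers, and the cross-references between dictionary entries, can be pictured with a minimal Python sketch. It is an illustration only, not the team's implementation: the class names, the field names, the `see_also` links and the item codes such as `tumour.topography` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InfoObject:
    """Tier 1: an information object of the reference model (e.g. tumour)."""
    name: str
    item_codes: List[str] = field(default_factory=list)


@dataclass
class DataItem:
    """Tier 2: one dictionary entry. All fields are hypothetical."""
    code: str
    label: str
    definition: str = ""
    see_also: List[str] = field(default_factory=list)  # codes of related entries


@dataclass
class DocumentComponent:
    """Tier 3: a form or text fragment assembled from information objects."""
    title: str
    object_names: List[str] = field(default_factory=list)


def resolve_references(dictionary: Dict[str, DataItem], code: str) -> List[str]:
    """Return every code reachable from `code` via see_also links.

    Codes are returned in visit order; cycles and unknown codes are skipped.
    """
    seen: List[str] = []
    stack = [code]
    while stack:
        current = stack.pop()
        if current in seen or current not in dictionary:
            continue
        seen.append(current)
        # Reversed so that the first listed reference is visited first.
        stack.extend(reversed(dictionary[current].see_also))
    return seen


# Hypothetical entries whose references form a cycle.
dictionary = {
    "tumour.topography": DataItem(
        "tumour.topography", "Tumour site", see_also=["tumour.morphology"]),
    "tumour.morphology": DataItem(
        "tumour.morphology", "Histology", see_also=["tumour.stage"]),
    "tumour.stage": DataItem(
        "tumour.stage", "Stage", see_also=["tumour.topography"]),
}

tumour = InfoObject("tumour", item_codes=list(dictionary))
entry_form = DocumentComponent("Initial assessment", object_names=[tumour.name])
related = resolve_references(dictionary, "tumour.topography")
```

Tracking visited codes matters here because definitions that refer to each other can easily form loops, which a naive recursive lookup would follow forever.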
The reference model is in effect a collection of information objects, such as patient, history, tumour, therapy, ... The dictionary entries comprise items, codes, relationships, definitions, and descriptive text fragments. The library of documents and document components contains entry forms or texts with standards, which are similar to data items but on a larger scale. The team decided to apply XML to this three-tier framework because of the flexible way in which it serialises information. To allow validity checks, the researchers selected a mixed type of document type definition (DTD), which is expressive with respect to the reference model but much more flexible as far as the items related to its objects are concerned.
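The idea of a schema that is firm about the reference model but loose about individual items can be sketched in Python. This is a sketch under stated assumptions rather than the researchers' DTD: the element names `record` and `item`, the `code` attribute and the sample values are hypothetical. Python's standard library does not validate against a DTD, so the hand-written `check_record` function stands in for that step.

```python
import xml.etree.ElementTree as ET

# Object names taken from the examples in the article; treating them as a
# closed set is an assumption made for this sketch.
REFERENCE_OBJECTS = {"patient", "history", "tumour", "therapy"}


def build_record(objects: dict) -> ET.Element:
    """Serialise {object name: {item code: value}} into an XML tree."""
    record = ET.Element("record")
    for object_name, items in objects.items():
        obj = ET.SubElement(record, object_name)
        for code, value in items.items():
            # Every item uses the same generic element; only the code varies.
            item = ET.SubElement(obj, "item", code=code)
            item.text = value
    return record


def check_record(record: ET.Element) -> list:
    """Strict on object names, permissive on item codes."""
    problems = []
    for obj in record:
        if obj.tag not in REFERENCE_OBJECTS:
            problems.append(f"unknown object: {obj.tag}")
        for child in obj:
            if child.tag != "item" or "code" not in child.attrib:
                problems.append(f"malformed item in {obj.tag}")
    return problems


# Illustrative values only, written in the style of ICD-O codes.
record = build_record({
    "patient": {"id": "P-0001"},
    "tumour": {"topography": "C50.9", "morphology": "8500/3"},
})
xml_text = ET.tostring(record, encoding="unicode")
problems = check_record(record)
```

Because items are generic `item` elements distinguished only by their `code` attribute, new dictionary entries can be added without changing the document structure, while an unexpected top-level object is still reported.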
Although not yet fully implemented, the XML-based framework has already served as an implicit domain information model for the definition of a communication standard for oncology. This standard is based on a tagged data exchange format used by general practitioners in Germany, called "BehandlungsDatenTräger" (BDT), which is similar to XML. In addition, a prototype for one organ from the "Organspezifische Tumordokumentation" has been designed. The patient's clinical starting position in a medical trial can thus be documented by means of an XML-based browser page, and the patient data can be consulted in XML afterwards.