EnterTheGrid - Primeur Live!

EnterTheGrid - Primeur is the premier Grid and Supercomputing information source in the world. With Primeur Live! it brings you live reports from Europe's main Supercomputing and Grid events.

Issue 27 June 2003
Complexity of data in the passenger services systems of the DB AG
Heidelberg, 27 June 2003. Lutz Philipp, DB Systems, explained different processes of the Passenger Services of Deutsche Bahn AG in theory and practice. Presenting on behalf of Karl-Heinz Holzwarth, who could not attend the conference, he traced the historical development of DB AG's data processing systems, described the growing complexity of the data and what characterises it, outlined the problems this complexity causes, presented practical examples and looked towards future developments.

The data processing procedures were originally built only for particular and relatively narrow areas of application. These applications were commissioned, developed and operated completely autonomously by the different divisions of Deutsche Bahn. Production, for example, first developed procedures for planning purposes; the procedures to control and monitor production were added much later. In a second phase, DB successively coupled the data processing functions, with the goal of designing the internal processes as generally as possible to enhance their efficiency. The legacy data had to be standardised so that all procedures could read and understand it.

A homogeneous data syntax was often impossible, and uniformity was abandoned for lack of budget and time. This certainly did not simplify the data. For a system as large and networked as railway operation in Germany, complexity has therefore been considerable right from the start.

When train schedules are planned, the relevant data has to be prepared many years in advance. Production then needs information about the necessary capacities: which and how many cars and locomotives each train will need. In parallel, the length and availability of the platforms at the stations where the trains stop have to be considered. The staff need information, as does the train engineer. The train schedule does not begin at the first departure station but already at the switching station. For the train attendants, as for the passengers, the train usually begins at the platform. However, set-up time before the departure of the train must also be considered and planned for the train attendants.

The staff employment timetables of DB AG are planned over relatively long periods (usually per half-year). Because most of the staff work in shifts, the resulting staff employment plans are complex. Lutz Philipp then discussed the detailed characteristics of data complexity: the very high volume of data, multiple relations of the data to one another, a static data model, dynamic linking, many and varying perspectives on the same data, various life cycles of the applications, the frequency of data alterations, static data, data with cyclic alterations, data with a short life cycle and different periods of data validity.

The fully coupled network (Railway) with:

7 destinations => 21 relations (half-matrix)

6,000 destinations => 17,997,000 relations (half-matrix)

Number of relations = n * (n-1) /2

The Spoke-and-hub network (Airline):

7 destinations => 6 relations

6,000 destinations => 5,999 relations

Number of relations = n-1
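
The two formulas above can be checked with a few lines of Python. This is only a small sketch of the arithmetic quoted in the talk; the function names are illustrative and not taken from any DB system.

```python
def fully_coupled_relations(n: int) -> int:
    """Relations in a fully coupled network (half-matrix): n * (n - 1) / 2."""
    return n * (n - 1) // 2


def spoke_and_hub_relations(n: int) -> int:
    """Relations in a spoke-and-hub network with n destinations: n - 1."""
    return n - 1


# Reproduce the figures for 7 and 6,000 destinations.
for n in (7, 6000):
    print(n, fully_coupled_relations(n), spoke_and_hub_relations(n))
```

The fully coupled count grows quadratically with the number of destinations, while the spoke-and-hub count grows only linearly, which is why the railway figure is so much larger.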

The six thousand German train stops of DB AG result in about 18 million relations (half-matrix) within a fully coupled network. Each relation must be priced and shown individually. A spoke-and-hub network of an airline with 6,000 destinations, in comparison, needs only 5,999 relations. The Passenger Services of Deutsche Bahn sell approximately 750,000 tickets every day. The data of each of these tickets needs to be captured and processed. For this purpose, a so-called sales data record is produced for each ticket.

All of this is handled according to GOB (principles of proper computerised bookkeeping). The sales data records alone create a data volume of up to 500 Gigabytes every year.
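
As a rough plausibility check, the two figures quoted (about 750 thousand tickets per day and up to 500 Gigabytes per year) imply an upper bound on the average size of a sales data record. The sketch below assumes 365 selling days per year and decimal Gigabytes (10^9 bytes); neither assumption comes from the talk, and the constant names are illustrative.

```python
TICKETS_PER_DAY = 750_000            # approximate figure quoted in the talk
ANNUAL_VOLUME_BYTES = 500 * 10**9    # "up to 500 Gigabytes", assumed decimal
DAYS_PER_YEAR = 365                  # assumption: tickets are sold every day


def records_per_year(tickets_per_day: int, days: int = DAYS_PER_YEAR) -> int:
    """One sales data record is produced per ticket sold."""
    return tickets_per_day * days


def max_bytes_per_record(volume_bytes: int, records: int) -> float:
    """Upper bound on the average record size implied by the annual volume."""
    return volume_bytes / records


records = records_per_year(TICKETS_PER_DAY)
print(records, round(max_bytes_per_record(ANNUAL_VOLUME_BYTES, records)))
```

Under these assumptions the average sales data record would be at most roughly 1.8 kilobytes.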

Which problems must now be solved in relation to data complexity? Lutz Philipp presented several, for example: the availability of huge storage capacity for production data and archived data, ensuring data consistency and reliability, time-consuming data access due to large data volumes, different data versions, searching for required data within archives and reducing the number of perspectives on the same data.

He then discussed the distribution system KURS'90. The current distribution system for Passenger Services, known within Deutsche Bahn as the customer-friendly travel, information and sales system (abbreviated KURS'90), is the most extensive data processing landscape at Deutsche Bahn AG and has proved to be the most reliable in daily use. The new price system introduced last year required numerous new functions in KURS'90: some were newly developed, and existing ones were extensively modified. In spite of its high complexity, the distribution system KURS'90 was successfully adapted to the new requirements and went into production without failure.

In order to inform customers even better about train operations, the RIS procedure (short for "Reisenden-Informations-System", meaning traveler information system) is currently being set up. It aims to give customers the most up-to-date information possible about the status of their connections and about possible delays. DB compares target data from the train schedule with the actual data of current operations, and the result is shown to the customer. Production and train operation also profit from RIS, deriving from it the measures to be taken for current production control and for traveler information.
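
The comparison of target and actual data can be pictured with a minimal sketch. It is not the RIS implementation: the record layout, the names `StopEvent` and `delays`, and the sample train and station are hypothetical and serve only to illustrate the principle described in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class StopEvent:
    """Hypothetical record: a train's departure time at one station."""
    train: str
    station: str
    time: datetime


def delays(target, actual):
    """Compare scheduled (target) with observed (actual) events.

    Returns the delay per (train, station) for every actual event
    that has a matching entry in the schedule.
    """
    scheduled = {(e.train, e.station): e.time for e in target}
    result = {}
    for event in actual:
        key = (event.train, event.station)
        if key in scheduled:
            result[key] = event.time - scheduled[key]
    return result


# Made-up example data: one train running seven minutes late.
target = [StopEvent("TRAIN-1", "Station A", datetime(2003, 6, 27, 9, 0))]
actual = [StopEvent("TRAIN-1", "Station A", datetime(2003, 6, 27, 9, 7))]
print(delays(target, actual))
```

Keying both data sets by train and station keeps the lookup simple; a real system would also have to deal with cancelled stops, rerouted trains and missing reports, which this sketch ignores.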

This is certainly not the end of the story. Development will continue for some time. In many fields, integrating data is the next stage to be designed. In addition, further procedures will be developed that need additional data and will increase data complexity even more.
Uwe Harms

EnterTheGrid - Primeur

James Stewartstraat 248

1325 JN Almere

The Netherlands

http://EnterTheGrid.com

mailto:primeur@hoise.com

© EnterTheGrid - Primeur Live!