EnterTheGrid - Primeur Live!

EnterTheGrid - Primeur is the premier Grid and Supercomputing information source in the world. With Primeur Live! it brings you Live reports from Europe's main Supercomputing and Grid events

Issue 27 June 2003
Taming huge data volumes
Heidelberg, 21 June 2003. Sverre Jarp, CERN, spoke on "Storing and Processing of Huge Experimental Data at CERN". He described the requirements of High Energy Physics, its computing characteristics, and the plans for storing, distributing and processing the huge data volumes that the Large Hadron Collider will deliver once it is in operation.

Computing in High Energy Physics has several characteristics: the events (collisions) are independent, so parallel processing is easy (read: trivial); the bulk of the data is read-only, with new versions rather than updates; meta-data is held in databases that link to files; and compute power is measured in SPECint, not SPECfp. The aggregate requirements for computation, data and input/output are very large. The research environment produces a chaotic workload, the physics is extracted by iterative analysis, groups of physicists collaborate, and demand is unpredictable and unlimited.
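
Because each collision event is independent, the work can be spread over a pool of workers that never talk to each other. The following is a minimal sketch in Python; the event records and the `reconstruct` function are hypothetical stand-ins for illustration, not CERN software.

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    """Hypothetical per-event work: sum the hit energies of one collision."""
    return sum(event["hits"])

def process_events(events, workers=4):
    # Events are independent, so a plain map over a worker pool is enough;
    # no worker needs a result from any other worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))

# Made-up sample data, for illustration only.
events = [{"id": i, "hits": [i, i + 1, i + 2]} for i in range(5)]
totals = process_events(events)
```

A thread pool keeps the sketch short. For CPU-bound work in Python a process pool, or separate batch jobs on many machines, would be the more realistic choice, and the structure of the code would stay the same.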

The goal of the Large Hadron Collider (LHC), in Sverre Jarp's words: "Find new physics, such as the Higgs particle, and get the Nobel prize!"

The Data Acquisition is characterised by:

  • Multi-level trigger
  • Filters out background
  • Reduces data volume
  • Records data 24 hours a day, 7 days a week
  • Equivalent to writing a CD every 2 seconds
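
The "CD every 2 seconds" figure can be turned into a sustained data rate with a little arithmetic. The sketch below assumes a 650 MB CD and decimal units (1 TB = 1,000,000 MB); neither assumption is stated in the talk.

```python
CD_CAPACITY_MB = 650   # assumption: a standard 650 MB CD-ROM
SECONDS_PER_CD = 2     # from the talk: one CD every 2 seconds
SECONDS_PER_DAY = 24 * 60 * 60

# Sustained recording rate in MB/s.
rate_mb_per_s = CD_CAPACITY_MB / SECONDS_PER_CD

# Volume recorded in one day of continuous running, in TB (decimal units).
tb_per_day = rate_mb_per_s * SECONDS_PER_DAY / 1_000_000
```

Under these assumptions the rate is 325 MB/s, or roughly 28 TB for each day of continuous recording.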

CERN's responsibilities as a large data centre:

  • Manage home directories
  • With worldwide access
  • Transfer data from experimental areas
  • Record to tape
  • Re-export "raw" data to other physics labs
  • Manage the physics data at every level in the analysis chain
  • Data accessed locally
  • Data accessed via the "GRID"

CASTOR (CERN Advanced STORage Manager)

CASTOR is the hierarchical storage manager used to store user and physics files; it manages the secondary and tertiary storage. It currently holds more than 9 million files and 2000 TB of data. Development started in 1999, based on SHIFT, CERN's tape and disk management system since the beginning of the 1990s. The SHIFT architecture (Scalable Heterogeneous Integrated Facility) connects tape servers, disk servers, batch servers, interactive servers, and batch and disk SMPs via a network based on Ethernet and AFS (Andrew File System).
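
The two CASTOR totals give a rough idea of the typical file size. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and treats "more than 9 million" as exactly 9 million, so the result is only an approximate upper estimate.

```python
TOTAL_DATA_TB = 2000        # from the talk
TOTAL_FILES = 9_000_000     # from the talk ("more than 9 million")
MB_PER_TB = 1_000_000       # assumption: decimal units

# Average file size in MB across the whole store.
average_file_mb = TOTAL_DATA_TB * MB_PER_TB / TOTAL_FILES
```

This works out to a little over 220 MB per file on average.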

The main characteristics of CASTOR are:

  • Modularity

The components in CASTOR have well-defined roles and interfaces, so it is possible to change a component without affecting the whole system.

  • Highly Distributed System

CERN uses a highly distributed configuration with many disk servers and tape servers, but CASTOR can also run in a more limited environment.

  • Scalability

The number of disk servers, tape servers and name servers is unlimited. The use of an RDBMS (Oracle, MySQL) improves the scalability of some critical components.

  • Central servers

These consist of the name server, the volume manager, and the volume and drive queue manager.

  • Disk subsystem
  • Tape Subsystem

He then mentioned openlab, a technology-focused industrial collaboration with Enterasys, HP, IBM and Intel as partners. The technology is aimed at the LHC era: a 10 Gigabit network switch, rack-mounted HP servers, Itanium-2 processors, and a StorageTank storage system. He expects the cluster to evolve from 32 systems (64 processors) in 2002 to 64 systems ("Madison" processors) in 2003 and possibly 128 systems ("Montecito" processors) in 2004/05. One opencluster result is the "10 Gbit/s network" challenge, which groups together three openlab partners and CERN. The current result is a single-stream TCP/IP connection running at 755 MB/s over a 10 km fibre, back-to-back, memory-to-memory.
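
To compare the reported single-stream result with the 10 Gbit/s link, the MB/s figure has to be converted to bits. The sketch below assumes 1 MB = 1,000,000 bytes, which the talk does not specify.

```python
THROUGHPUT_MB_PER_S = 755   # from the talk: single-stream TCP/IP result
LINK_GBIT_PER_S = 10        # from the talk: nominal link speed
BITS_PER_BYTE = 8

# Convert MB/s to Gbit/s (assumption: decimal megabytes).
throughput_gbit_per_s = THROUGHPUT_MB_PER_S * BITS_PER_BYTE / 1000

# Fraction of the nominal link speed that a single stream achieved.
link_utilisation = throughput_gbit_per_s / LINK_GBIT_PER_S
```

Under that assumption the stream runs at 6.04 Gbit/s, about 60% of the nominal link speed.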

Grid = Virtual Computing Center

This seems to be the solution for the LHC: the user sees the image of a single cluster and does not need to know where the data is, where the processing capacity is, how things are interconnected, or the details of the different hardware, and is not concerned with the local policies of the equipment owners and managers. The vision of Grid Data Management is distributed shared data storage, ubiquitous data access, transparent data transfer and migration, and consistency and robustness as well as optimisation.
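
The idea that the user does not need to know where the data is can be illustrated with a lookup from a logical file name to the physical copies. This is a toy sketch only: the catalogue contents, site names, cost figures and the `locate` function are all hypothetical and do not describe any actual Grid middleware.

```python
# Hypothetical replica catalogue: logical file name -> sites holding a copy.
REPLICAS = {
    "lhc/run001/raw.dat": ["cern", "lab-a"],
    "lhc/run001/summary.dat": ["lab-b"],
}

# Hypothetical relative access cost of each site, as seen by one user.
SITE_COST = {"cern": 5, "lab-a": 1, "lab-b": 3}

def locate(logical_name):
    """Return the cheapest site holding the file; the caller never picks a site."""
    sites = REPLICAS.get(logical_name)
    if not sites:
        raise FileNotFoundError(logical_name)
    return min(sites, key=SITE_COST.__getitem__)

best = locate("lhc/run001/raw.dat")
```

The user asks only for the logical name. Which copy is read, and from where, is decided behind the interface.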

CERN is busily preparing for the first arrival of LHC data in 2007. New and exciting technologies are needed to manage the data seamlessly around the globe. Together with its partners (industry, other physics labs, other sciences), CERN expects to come up with interesting proofs-of-concept and technological spin-offs! Petabyte Data Centers are here to stay!
Uwe Harms

EnterTheGrid - Primeur

James Stewartstraat 248

1325 JN Almere

The Netherlands

http://EnterTheGrid.com

mailto:primeur@hoise.com

© EnterTheGrid - Primeur Live!