Batch Systems: LSF, OpenPBS, PBSpro and GridEngine
Munich, 12 December 2002. Because batch processing is essential, the organisers scheduled talks covering the spectrum of available batch systems.
Platform LSF
LSF 5 scales to more than 500 000 jobs, more than 100 clusters and more than 200 000 CPUs. It is now modularised and offers plug-ins and a user scheduler. The basic product is Clusterware, which is dedicated to small (fewer than 64 CPUs) 32-bit Linux clusters. There are open source and education programmes. Clusterware Pro offers additional features.
OpenPBS
Research Center Karlsruhe is testing OpenPBS in view of its forthcoming DataGrid. It currently has a 348-processor grid. OpenPBS 5.2, from summer 2002, has problems under heavy load.
PBSpro
Altair offers PBSpro, which resulted from a NASA project (1993-1997) to move away from NQS. PBSpro and Globus became interoperable in 1999, and PBSpro has been a commercial product since 2000. It consists of a Server daemon, with one server per cluster, and a Scheduler daemon, with one scheduler per cluster, although an external scheduler can be included. The MOM (Machine Oriented Mini-Server) daemon runs on each execution node and is responsible for accounting and job processing. There are Web interfaces to CAE computations. Future versions of PBSpro will offer fault tolerance and reliability, suspend/resume, checkpoint/restart, SMP cluster support, floating licenses, and support for system-specific features (e.g. PSSP on the IBM SP). It integrates Unix and Linux systems, and there is a grid front-end to Globus.
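The division of labour between the three daemons can be illustrated with a small toy model. This is only a sketch in Python and not PBSpro code: the class names, the attributes (cpus, free_cpus) and the first-fit placement policy are assumptions made for the example, and nothing here reflects the real PBSpro interfaces.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: int
    cpus: int  # CPUs requested (hypothetical attribute)


@dataclass
class Mom:
    """Stand-in for the per-node MOM daemon: runs jobs, keeps accounting."""
    node: str
    free_cpus: int
    accounting: list = field(default_factory=list)

    def run(self, job):
        self.free_cpus -= job.cpus
        self.accounting.append((job.job_id, job.cpus))


class Scheduler:
    """Stand-in for the scheduler daemon: decides where a job runs."""

    def pick(self, job, moms):
        # First-fit policy, chosen only for illustration.
        for mom in moms:
            if mom.free_cpus >= job.cpus:
                return mom
        return None


class Server:
    """Stand-in for the server daemon: one per cluster, owns the queue."""

    def __init__(self, moms, scheduler):
        self.moms = moms
        self.scheduler = scheduler  # an external scheduler could be passed in
        self.queue = deque()

    def submit(self, job):
        self.queue.append(job)

    def dispatch(self):
        """Try to place each queued job once; return {job_id: node}."""
        placed = {}
        for _ in range(len(self.queue)):
            job = self.queue.popleft()
            mom = self.scheduler.pick(job, self.moms)
            if mom is None:
                self.queue.append(job)  # no node fits; job stays queued
            else:
                mom.run(job)
                placed[job.job_id] = mom.node
        return placed
```

Because the server only calls the scheduler's pick method, any object offering that method could be substituted, which mirrors the idea of plugging in an external scheduler.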
Sun GridEngine
Andrea Lorenz, Computer Center RWTH Aachen, discussed GridEngine from the user's perspective. Its roots go back to 1992 and Codine from Genias GmbH. The Sun GridEngine became open source in July 2001. The results are good; only some minor features should be added. With the Sun GridEngine Enterprise Edition (SGEEE), sites on different continents can collaborate in one project. Globus can be integrated.
http://www.platform.com
http://www.pbspro.com
http://www.openpbs.org
http://wwws.sun.com/software/gridware
Uwe Harms