
EnterTheGrid - PrimeurWeekly

EnterTheGrid - PrimeurMagazine is the largest Grid and Supercomputer information source in the world. PrimeurWeekly delivers the news each week in your e-mail box.

PrimeurWeekly 18 June 2007
Evergrid launches global resource management solution for next generation data centres
Fremont, 12 June 2007 - Evergrid has launched the Evergrid Cluster Availability Management Suite (CAMS), a new continuous availability and resource management software solution for High Productivity Computing Grid environments and the Utility Enterprise data centre. CAMS manages server clusters from power-on through operating system provisioning and application scheduling to load management. CAMS is integrated with Evergrid's Availability Management Service (AvS) to provide checkpoint/resume capabilities for applications, including massively parallel distributed applications. With CAMS, batch applications run at near 100-percent reliability.

Evergrid provides transparent fault tolerance using an operating system (OS) abstraction layer that loads between the OS and the application. Without modifying either the application or the operating system, CAMS/AvS periodically captures the collective state of the application across the entire infrastructure while the application continues processing. By recording the state of the application together with the associated OS and system state, Evergrid can checkpoint and resume after failures or interruptions rapidly and with minimal overhead. Even the failure of multiple servers or of software systems does not prevent an application from resuming processing from a checkpoint.
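
To illustrate the general checkpoint/resume idea described above, here is a minimal sketch in Python. It is not Evergrid's implementation: CAMS/AvS is described as working transparently below the application, whereas this toy job saves its own state explicitly. The function names (`save_checkpoint`, `load_checkpoint`, `run_job`), the pickle file format and the `fail_at` parameter are all assumptions made for the example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        pickle.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    # os.replace is atomic, so a crash never leaves a half-written checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run_job(path, total_steps, checkpoint_every, fail_at=None):
    """Sum the integers 0..total_steps-1, checkpointing periodically.

    fail_at (hypothetical, for demonstration) simulates a node failure
    just before that step is processed.
    """
    # Resume from the last checkpoint if there is one, else start fresh.
    state = load_checkpoint(path) or {"step": 0, "total": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

If a run fails partway through, calling `run_job` again with the same path continues from the most recent checkpoint rather than from step 0, so only the work done since that checkpoint is repeated.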

Evergrid provides recovery especially for long-running, multi-server batch jobs whose runtime is limited by the inherent reliability characteristics of software and hardware. The patented checkpoint/resume technology also allows transparent stateful job preemption and application migration of batch workloads on multiple servers. Moreover, recovery and preemptive scheduling for applications can be done globally, scaling across geographically dispersed data centres.
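
The way reliability limits long multi-server jobs can be made concrete with a small calculation. The sketch below assumes independent node failures with exponentially distributed times between failures, a common textbook simplification rather than anything stated by Evergrid. The function names and the figures used (1,000 nodes, a 24-hour job, a 50,000-hour mean time between failures per node) are illustrative assumptions only.

```python
import math


def system_mtbf_hours(nodes, node_mtbf_hours):
    """Mean time between failures of the whole cluster, assuming
    independent nodes with exponentially distributed failures."""
    return node_mtbf_hours / nodes


def job_survival_probability(nodes, runtime_hours, node_mtbf_hours):
    """Probability that no node fails during the job, under the same
    independence assumption."""
    return math.exp(-nodes * runtime_hours / node_mtbf_hours)


# Illustrative numbers, not taken from the article.
print(system_mtbf_hours(1000, 50_000))             # 50.0 hours
print(job_survival_probability(1000, 24, 50_000))  # about 0.62
```

Under these assumed figures, roughly four in ten such jobs would be interrupted by a failure before finishing, which is the kind of workload where resuming from a checkpoint instead of restarting matters most.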

"What differentiates Evergrid from other solutions that attempt to solve the checkpoint problem is our ability to scale up to thousands of nodes, with less than five percent performance overhead and without OS or application changes", stated Dave Anderson, CEO of Evergrid. "You can't get this capability anywhere else."

Evergrid Cluster Availability Management Suite (CAMS) comprises two products, Evergrid Availability Services (AvS-Batch) and Evergrid Resource Manager (RM-Batch). Evergrid AvS-Batch captures the collective state of single or multiple nodes running distributed applications and prevents downtime by performing checkpoint, migration and recovery of the application, thus providing automatic failover across multiple nodes and tiers. Evergrid RM-Batch allows efficient allocation of resources and stateful preemptive scheduling of jobs. CAMS ensures that no compute cycle is lost by recovering, migrating or preempting jobs. This translates to greater flexibility, reliability and utilization of computing resources.
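
Stateful preemptive scheduling, as described above, can be sketched with a toy single-slot scheduler. This is a simplified illustration and not how RM-Batch works internally: the `Job` class, the `simulate` function and the one-work-unit-per-tick model are all assumptions. The point it shows is that a preempted job keeps its progress and later continues from where it stopped.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    priority: int                                       # lower number = more urgent
    name: str = field(compare=False)
    work_units: int = field(compare=False)
    done_units: int = field(default=0, compare=False)   # kept across preemption


def simulate(arrivals):
    """Run jobs on one slot, one work unit per tick.

    arrivals maps tick -> list of Job objects submitted at that tick.
    Returns the name of the job executed at each busy tick.
    """
    queue, running, timeline, tick = [], None, [], 0
    last_arrival = max(arrivals) if arrivals else -1
    while running is not None or queue or tick <= last_arrival:
        for job in arrivals.get(tick, []):
            heapq.heappush(queue, job)
        # Preempt: requeue the running job with its progress intact.
        if running is not None and queue and queue[0].priority < running.priority:
            heapq.heappush(queue, running)
            running = None
        if running is None and queue:
            running = heapq.heappop(queue)
        if running is not None:
            running.done_units += 1
            timeline.append(running.name)
            if running.done_units == running.work_units:
                running = None
        tick += 1
    return timeline


timeline = simulate({
    0: [Job(priority=5, name="batch", work_units=4)],
    2: [Job(priority=1, name="urgent", work_units=2)],
})
print(timeline)
# ['batch', 'batch', 'urgent', 'urgent', 'batch', 'batch']
```

The timeline has six entries for six units of work in total: the preempted batch job resumes at its third unit instead of starting over, so no compute cycle is spent redoing finished work.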

"Software solutions that minimize downtime for compute-intensive applications, improve job execution, and minimize job pre-emption while maximizing utilization of servers will fundamentally change how we serve our user community", stated Henry Neeman, director of the OU Supercomputing Center for Education & Research (OSCER) at the University of Oklahoma.

Evergrid's software is designed for demanding, computing-intensive sectors such as manufacturing, financial services, and pharmaceutical and petrochemical research. Currently, Evergrid solutions target High Performance Technical Computing (HPTC) applications that are computationally intensive and use high speed interconnects. In the near future, Evergrid will also provide solutions for the High Performance Enterprise Computing (HPEC) and on-line transaction processing (OLTP) database and enterprise application markets.

Evergrid licenses its Cluster Availability Management Suite software on a per-socket, annual subscription basis, with substantial discounts for large deployments. Evergrid's Availability Management Service can be licensed separately for integration with other resource managers. Currently, CAMS and AvS are implemented on multiple versions of Linux. Both the Cluster Availability Management Suite and the Availability Management Service are available immediately from Evergrid.
Source: Evergrid

EnterTheGrid - Primeur

James Stewartstraat 248

1325 JN Almere

The Netherlands

http://hoise.com/primeur

mailto:primeur@hoise.com

© EnterTheGrid - PrimeurWeekly