The Esprit project PHASE aims to develop and implement a Distributed Pharmaceutical Applications Server that delivers the power of high-performance computing to the desks of small and medium-sized industrial companies working in drug target identification and modelling. For users, the main advantages of leased high-performance processing are easy access and lower costs. PHASE provides the server architecture through which users specify job requirements and submit jobs to an automated load-balancing scheme that distributes the parallel tasks to the best-suited hardware platform. It also offers a unified workflow for the bio-informatics applications GeneQuiz, MaxHom, DRAGON and MSAP, all four of which contribute to the analysis of protein sequence and structure.
Drug design is a long and expensive journey, from initial drug targeting to years of clinical testing. At present, researchers face an overwhelming amount of data generated by the various genome projects. Consequently, supercomputing resources reached via the Internet for compute-intensive production runs, together with in-house workstation clusters for the more sensitive data, are more than welcome as a way to improve the speed and scope of the drug discovery process. Interactive simulation jobs suddenly become realistic, with repercussions for the fierce industrial competition in patent acquisition. It is now possible to approach a problem from different angles in order to detect similarities that lead to the specification of new drug targets.
An end-user enters the Distributed Pharmaceutical Application Server through a Graphical User Interface based on HTML and Java. At the UK-based European Bio-Informatics Institute, the Central PHASE Access Point collects the incoming requests, checks them and registers them in the Accounting Database. A list of currently accessible resources is kept in a Configuration Database. A Request Analyser scans the incoming tasks to determine their hardware and software requirements, after which a Dynamic Job Distributor forwards them to the best-suited platform. Distributed Load Monitor Daemons report the current system and network performance, while cluster management software serves as the access point to the distributed resources. The accounting and configuration databases are situated on the Computing Centre Software of Paderborn University, the coordinator of the PHASE project.
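The selection step performed by the job distributor can be pictured with a minimal Python sketch. It is not PHASE code: the names Platform, Request, pick_platform, PLATFORMS and LOAD are hypothetical, the sample platforms and load figures are made up for illustration, and "best suited" is assumed here to mean "meets the requirements and currently reports the lowest load".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """One entry of a hypothetical configuration database."""
    name: str
    cpus: int
    software: frozenset


@dataclass(frozen=True)
class Request:
    """Requirements a request analyser might extract from a job."""
    application: str
    min_cpus: int


def pick_platform(request, platforms, load):
    """Return the eligible platform with the lowest reported load, or None.

    `load` maps a platform name to a load figure between 0 and 1 and stands
    in for what load monitor daemons would report. Platforms without a
    reported figure are treated as fully loaded (an assumption).
    """
    eligible = [
        p for p in platforms
        if request.application in p.software and p.cpus >= request.min_cpus
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda p: load.get(p.name, 1.0))


# Made-up sample data, for illustration only.
PLATFORMS = [
    Platform("cluster-a", 16, frozenset({"GeneQuiz", "MaxHom"})),
    Platform("super-b", 128, frozenset({"DRAGON", "MSAP", "MaxHom"})),
    Platform("cluster-c", 32, frozenset({"MaxHom"})),
]
LOAD = {"cluster-a": 0.2, "super-b": 0.7, "cluster-c": 0.4}

if __name__ == "__main__":
    job = Request(application="MaxHom", min_cpus=24)
    chosen = pick_platform(job, PLATFORMS, LOAD)
    print(chosen.name if chosen else "no suitable platform")
```

In this sketch a MaxHom job needing 24 processors skips cluster-a, which is too small, and goes to whichever of the two remaining platforms is less busy. A real distributor would presumably also weigh network performance, which the sketch leaves out.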
The four bio-informatics applications GeneQuiz, MaxHom, DRAGON and MSAP serve as assisting tools at different levels in identifying the function of a gene sequence, a necessary step before it can become a potential drug target. These four musketeers also contribute to the generation of meaningful 3-D models of proteins. In short, GeneQuiz assigns a potential function to large numbers of genes by means of homology or clear similarity to a sequence of known function. MaxHom is used for cases where the homology is relatively low. DRAGON generates the 3-D models, and MSAP compares 3-D structures of proteins in order to identify similarity with a known protein structure. The four applications are driven by either a command-line or a graphical user interface.
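The division of labour among the four applications can be summarised in a short Python sketch. This is an illustration only, not part of PHASE: the names Tool, ROLE and choose_sequence_tool are hypothetical, and so is the idea of a numeric similarity score with a cutoff, since the text says only that MaxHom handles cases of relatively low homology.

```python
from enum import Enum


class Tool(Enum):
    GENEQUIZ = "GeneQuiz"
    MAXHOM = "MaxHom"
    DRAGON = "DRAGON"
    MSAP = "MSAP"


# Role of each application, paraphrased from the description above.
ROLE = {
    Tool.GENEQUIZ: "assign a potential function by homology to a known sequence",
    Tool.MAXHOM: "handle cases where the homology is relatively low",
    Tool.DRAGON: "generate 3-D models of proteins",
    Tool.MSAP: "compare 3-D structures with known protein structures",
}


def choose_sequence_tool(similarity, low_homology_cutoff):
    """Pick the sequence-analysis tool for a given similarity score.

    Both arguments are fractions between 0 and 1. The cutoff is supplied by
    the caller because no value is given in the text (assumption).
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie between 0 and 1")
    if similarity < low_homology_cutoff:
        return Tool.MAXHOM
    return Tool.GENEQUIZ


if __name__ == "__main__":
    # 0.3 is an arbitrary illustrative cutoff, not a value from PHASE.
    for score in (0.8, 0.1):
        tool = choose_sequence_tool(score, low_homology_cutoff=0.3)
        print(score, tool.value, "-", ROLE[tool])
```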
A two-level security system protects the PHASE Internet version by means of both standard Kerberos/Netscape mechanisms and Unix firewalls guarding the e-mail access to trusted servers. PHASE has initially focused on the pharmaceutical sector, but in the future the server software will be extended to other application areas. Please consult the PHASE site for more news on the project's evolution.