A Tool for Deploying and Managing Large Scientific Workflows on Distributed and Heterogeneous Grid Resources

I envision my Ph.D. thesis as a set of protocols and algorithms for deploying and managing large scientific workflows on distributed, heterogeneous Grid resources. Along with designing the protocols and algorithms, I would also create a tool, or set of tools, that could be deployed on a large Grid, such as the TeraGrid, to demonstrate the feasibility and usefulness of my ideas on a real-world project.

One of the main problems with large, distributed, heterogeneous Grids is the difficulty of deploying and managing large-scale applications that require a wide variety of resources at numerous sites. This problem is magnified if the application requires extensive configuration and setup at each site. The application, or portions of it, must be distributed to the remote sites where they will be executed. Once delivered, the application must be compiled (possibly with different optimizations) on the different architectures. Different portions of the application may need to be configured differently and integrated with the underlying Grid software (e.g., the Globus Toolkit or a scheduler such as GridWay).
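
To make the sequence of steps concrete, here is a minimal sketch of how a per-site deployment plan might be represented. It is only an illustration of the idea, not a design: the names Site, OPT_FLAGS, and deployment_plan, the architecture labels, and the optimization flags are all hypothetical, and the steps are plain descriptions rather than real Globus or GridWay commands.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A remote Grid site (hypothetical model, for illustration only)."""
    name: str
    arch: str        # architecture label, e.g. "x86_64" or "ia64"
    middleware: str  # Grid software at the site, e.g. "globus" or "gridway"


# Assumed per-architecture optimization flags; real values would be site-specific.
OPT_FLAGS = {"x86_64": "-O3", "ia64": "-O2"}


def deployment_plan(component: str, site: Site) -> list[str]:
    """Return the ordered steps needed to deploy one component at one site."""
    # Fall back to no optimization when the architecture is unknown.
    flags = OPT_FLAGS.get(site.arch, "-O0")
    return [
        f"transfer {component} -> {site.name}",
        f"compile {component} [{site.arch}, {flags}]",
        f"configure {component} for {site.name}",
        f"integrate {component} with {site.middleware}",
    ]


if __name__ == "__main__":
    for step in deployment_plan("solver", Site("siteA", "x86_64", "globus")):
        print(step)
```

The point of the sketch is that the same four steps recur at every site and only the parameters change, which is what makes the process a candidate for automation.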

Beyond the issues above, there is also the problem of deciding where to send the different portions of the application: which resources are best suited to handle large amounts of computation, to process the input, to write the output, and so on? The scheduling algorithms that make these decisions could be integrated into the tool, or the tool could be used in conjunction with a separate scheduling system that determines where to perform the various pieces of the workflow.
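
As a rough illustration of the placement decision, the sketch below assigns each workflow stage to the resource with the highest weighted score. This is a simple greedy heuristic chosen for brevity, not the scheduling algorithm I intend to develop; the Resource and Stage classes, the cpu and io ratings, and the weights are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A Grid resource with assumed relative capability ratings."""
    name: str
    cpu: float  # relative compute capacity
    io: float   # relative I/O bandwidth


@dataclass(frozen=True)
class Stage:
    """A workflow stage with assumed weights for what it needs most."""
    name: str
    cpu_weight: float
    io_weight: float


def score(stage: Stage, res: Resource) -> float:
    """Weighted suitability of a resource for a stage (higher is better)."""
    return stage.cpu_weight * res.cpu + stage.io_weight * res.io


def assign(stages, resources):
    """Greedily map each stage name to the name of its best-scoring resource."""
    return {
        s.name: max(resources, key=lambda r: score(s, r)).name
        for s in stages
    }


if __name__ == "__main__":
    resources = [
        Resource("compute-site", cpu=10, io=2),
        Resource("storage-site", cpu=2, io=10),
    ]
    stages = [
        Stage("simulate", cpu_weight=1.0, io_weight=0.1),
        Stage("write-output", cpu_weight=0.1, io_weight=1.0),
    ]
    print(assign(stages, resources))
```

A real scheduler would also have to account for load, data locality, and queue wait times, which this heuristic ignores; an external scheduling system could replace the assign function entirely.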

I propose a system that addresses these problems in a simple and straightforward manner. For the user, all of the technical details of distributing, compiling, configuring, and executing a large and complex scientific application on a variety of architectures at a number of sites would be hidden.

A primary goal of my thesis would be not only to design such a system, but also to build a real tool that integrates with a large Grid and to demonstrate its effectiveness on a real project. I believe that new problems can arise when a theoretical design is actually placed into a production environment and tested with users.

Last modified 10 December 2007 at 7:10 pm by paul.marshall