I attended a talk on SPRUCE and urgent computing on the TeraGrid at NCAR, sponsored by Henry Tufo. Though it wasn’t specifically a department colloquium, it was open to both NCAR and CU. The presenter was Pete Beckman from Argonne National Laboratory.

The general idea behind urgent computing is to provide immediate access to resources for applications that must run right away. Examples include simulating tornadoes or hurricanes, or modeling the movement of air after a bio-attack or chemical leak. These applications require large amounts of processing power, yet they can’t afford to wait in a supercomputer's queue for days until the job finally runs.

Argonne is developing SPRUCE to solve this problem. SPRUCE allows users with clearance to run jobs on supercomputers as needed, and in some cases allows them to bypass the entire queue. Pete mentioned a number of critical issues that must be dealt with. In particular, jobs that are currently running (or waiting in the queue) must be preempted by the urgent job. A related issue is crediting the users whose jobs were lost because of the preemption. Pete suggested methods for handling such situations (and described how Argonne handles them), but he also pointed out that these policies need to be decided on a site-by-site basis.
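
To make the preemption and crediting idea concrete for myself, here is a toy sketch in Python. It is my own illustration and not SPRUCE's actual interface or anything shown in the talk: the `Job` and `ToyScheduler` names, the node counts, and the policy (preempt the most recently started normal jobs, requeue them, and record one credit per lost job) are all assumptions. A real site would choose its own policy.

```python
import heapq
from dataclasses import dataclass
from itertools import count

URGENT, NORMAL = 0, 1  # hypothetical priority levels; lower sorts first


@dataclass
class Job:
    owner: str
    name: str
    nodes: int
    urgent: bool = False


class ToyScheduler:
    """Toy model of urgent-job preemption (assumed policy, not SPRUCE's)."""

    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self.running = []    # jobs currently holding nodes
        self.queue = []      # heap of (priority, sequence, job)
        self.credits = {}    # owner -> number of preempted jobs owed credit
        self._seq = count()  # tie-breaker so the heap never compares Job objects

    def submit(self, job):
        if job.urgent:
            self._preempt_for(job)
        # Normal jobs start only if nothing is waiting ahead of them.
        if job.nodes <= self.free_nodes and (job.urgent or not self.queue):
            self._start(job)
        else:
            priority = URGENT if job.urgent else NORMAL
            heapq.heappush(self.queue, (priority, next(self._seq), job))

    def _start(self, job):
        self.free_nodes -= job.nodes
        self.running.append(job)

    def _preempt_for(self, job):
        normal = [j for j in self.running if not j.urgent]
        reclaimable = self.free_nodes + sum(j.nodes for j in normal)
        if reclaimable < job.nodes:
            return  # preempting everything would still not be enough
        # Assumed policy: evict the most recently started normal jobs first.
        for victim in reversed(normal):
            if self.free_nodes >= job.nodes:
                break
            self.running.remove(victim)
            self.free_nodes += victim.nodes
            # Record that this owner lost a job and should be credited.
            self.credits[victim.owner] = self.credits.get(victim.owner, 0) + 1
            # Requeue the victim behind the normal jobs already waiting.
            heapq.heappush(self.queue, (NORMAL, next(self._seq), victim))


sched = ToyScheduler(total_nodes=8)
sched.submit(Job("alice", "climate", nodes=6))
sched.submit(Job("bob", "cfd", nodes=4))  # does not fit, so it waits
sched.submit(Job("carol", "tornado", nodes=8, urgent=True))
print([j.name for j in sched.running], sched.credits)
```

In this run the urgent job evicts the running job, takes the whole machine, and the evicted job's owner ends up with one credit. Whether a preempted job is requeued, killed outright, or refunded in allocation hours is exactly the kind of decision that would differ from site to site.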

Last modified 10 December 2007 at 7:13 pm by paul.marshall