Globus Turns 10: Time for Celebration and Reflection
San Diego, CA (09-12-2006)
From GRID today
By Ian Foster
The GlobusWORLD conference being held (jointly with GridWorld and the Open Grid Forum) this week in Washington, D.C., is a significant milestone for those involved in the development and use of the Globus open source Grid software. The reason is that it was 10 years ago (to be precise, on Aug. 21, 1996) that Carl Kesselman and I received our first funding for work on Globus, from DARPA. Gary Minden and Mike St. Johns were our enlightened program managers, followed by Gary Koob. I must also recognize the support of Bob Aiken, Tom Kitchens and, especially, Mary Anne Scott, then all at DoE.
Given this milestone, I will spend some time here recapping history and reflecting on where we have come and what we have learned.
A Little History
10 years is a long time: What on earth have we been doing over that period? Let's revisit some of the highlights.
The emergence of high-speed networks in the 1990s led to an awareness that the Internet could allow for more interesting applications than e-mail and file transfer. (Len Kleinrock had envisioned this possibility back in 1969, but it took a while to get there!) Efforts like the U.S. Gigabit testbed project, led by Bob Kahn, and the Supercomputing'95 I-WAY effort, led by Tom DeFanti and Rick Stevens, helped build awareness of these opportunities. This era also saw pioneering efforts such as the NSF Metacenter, led by Charlie Catlett and Larry Smarr, and Legion, led by Andrew Grimshaw. However, for the most part, every application was constructed from scratch.
We (in particular, myself, Carl and Steve Tuecke) studied this situation and saw a need for standards and software (middleware) to bridge the gap between applications and the complexities of a distributed resource environment. Thus, we started a research project aimed at defining this middleware. Believing strongly that we did not necessarily know the real problems, we started an iterative process of examining the requirements of collaborative communities, prototyping solutions to their problems and feeding back the resulting experiences into a next cycle of research and development. We called this project Globus because it built on earlier technology called "Nexus" and had global goals.
Back in 1996, our ambitions and the needs of our users were far greater than our resources -- a situation that persists today! -- and so it was challenging to develop software that was sufficiently stable and functional to allow for meaningful experiments. Fortunately, we found wonderful application partners -- people like Ed Seidel, Paul Messina and their colleagues, and later members of the high energy physics community -- who were prepared to work with often imperfect software and provide invaluable feedback.
Along the way, we achieved milestones that helped persuade ourselves and others that we had something useful. For example, 1998 saw Sharon Brunett, Karl Czajkowski and others achieve a record-setting military simulation involving 100,298 vehicles distributed over 13 supercomputers at nine sites. Gregor von Laszewski and others demonstrated real-time analysis of data from the Advanced Photon Source. At the SC'98 conference, we demonstrated the "Globus Ubiquitous Supercomputing Testbed Organization" (GUSTO) that spanned some 50 sites worldwide. NASA launched its Information Power Grid project, under the leadership of Bill Johnston.
By 2001, the year in which the TeraGrid was founded, we had software we felt was ready to operate in production environments, if only we could find friendly sites prepared to perform the needed integration, and application scientists ready to develop the necessary application software. In practice, we weren't as ready as we thought we were, but nevertheless we entered a stage -- of learning via experience about the mechanisms and policies required for operational use -- that to some extent continues today. We also received some nice recognition at this time: Globus Toolkit version 2 (GT2) played a key role in a Gordon Bell prize awarded at SC'01 to an astrophysics application that used Cactus, MPICH-G2 and Globus. The following year, R&D Magazine recognized GT2 with an R&D 100 award and named it the "most promising new technology" of the year.
In late 2001, IBM followed up its dramatic open source Linux strategy announcement with a similar announcement about the importance of Grid technologies. We were thrilled when IBM elected to work with us to develop the OGSI Web Services specification and the corresponding Globus implementation, which was released in 2003 as GT3. While this first Web services release provided only modest quality, it spurred much innovative work, such as the video distribution system developed by the Belfast eScience Center for the BBC (to give an idea of the scale of effort underway by this time, BeSC applications alone totaled 1.5 million lines of GT3 code, later adapted for GT4).
2005 saw the release of Globus Toolkit version 4 (GT4), which, thanks to the efforts of talented developers and the able leadership of Lisa Childers, exceeded all previous releases in terms of quality and rigor of both software and documentation. GT4 supports the construction of stateful and secure Web services in Java, C and Python; provides job submission, file transfer, credential management, registry and database access services; incorporates a powerful integrated security system; and provides many other features besides. 2005 and 2006 also saw significant new funding in support of the Globus science community, from the U.S. National Science Foundation's NSF Middleware Initiative (under Kevin Thompson), UK eScience program (for work on OGSA-DAI) and, most recently, from the U.S. Department of Energy's SciDAC program.
Where We Are Today
Someone once dismissed Grid as a "funding concept" -- a witty but irritating turn of phrase. I have not heard that expression lately: Grid is mainstream in both science and industry, and so many people are using Grid technology to solve real problems that it is hard to argue that it is not successful and useful. Indeed, we can make a strong case that Grid has had a significant impact on how people conceptualize and solve problems in many domains.
It is particularly pleasing to see the diversity of Globus application communities, which span, for example, astronomy (e.g., the LIGO gravitational wave observatory, the Caltech Montage service), bioinformatics (e.g., Natalia Maltsev's PUMA system), cancer biology (e.g., the National Institutes of Health's caBIG cancer bioinformatics Grid), data mining (e.g., work by Domenico Talia) and environmental science (e.g., C3grid in Germany and Earth System Grid in the United States). And that is just the first five letters of the alphabet.
I am also delighted with the geographical diversity of Globus deployments. We see substantial Globus deployments and applications on every continent except Antarctica, and just about every day I get e-mail from someone somewhere describing a new deployment of which I was not previously aware. Again, we can walk through the alphabet: Australia, Belgium, China (and Canada and Chile), Denmark, England, France, Germany, Hungary, Ireland, Japan, Korea, Luxembourg, Mexico, the Netherlands, ....
Another area in which we continue to see wonderful progress is in the range of "solutions" that leverage Globus software. Globus middleware does not address end-user requirements directly, but a wide range of Globus-based tools now exists for building portals (e.g., OGCE, GridPort, Jason Novotny and Michael Russell's GridSphere); executing workflows (e.g., Ewa Deelman and Mike Wilde's VDS, David Abramson's Nimrod, Miron Livny's Condor, BPEL); running parallel programs (e.g., Nick Karonis' MPICH-G2); delivering data (e.g., Ann Chervenak's DRS, Reagan Moore's SRB); operating instruments (e.g., Rick McMullen's Common Instrument Middleware Architecture project, GridCC in Europe); invoking remote services (e.g., Ninf in Japan); and so on. Lee Liming has done a nice job documenting these and other "solutions."
It is also pleasing to see the progress being made in industry. Steve Tuecke left Argonne in 2004 to form Univa Corp., which provides commercial support for Globus software and is building new products using Globus (disclaimer: I am also a Univa founder and advisor). They are discovering that the concerns of industry are increasingly similar to those of science, as the need to accelerate innovation processes leads to a need for dynamic resource sharing between organizational units.
I should also mention the progress made with standards. Globus contributors, notably Von Welch, played major roles in the Grid Security Infrastructure standard, which has been widely adopted. The same is true for GridFTP, under the leadership of Bill Allcock. The Job Submission Description Language (JSDL) and Basic Execution Service (BES) specifications, which seem likely to see wide adoption, build heavily on GRAM. Globus project members, notably Frank Siebenlist, have also contributed heavily to the increasingly important WS-Security, SAML2 and XACML specifications.
It is a nice coincidence, given our anniversary, that August saw the release of the WS-ResourceTransfer specification by HP, IBM, Intel and Microsoft -- perhaps signaling the end of a standards odyssey that began in 2001 when Steve Tuecke and others defined the Open Grid Services Infrastructure (OGSI). The goal was to codify Web services mechanisms for representing and accessing state, a requirement that appeared in many different contexts. Like Ulysses, we did not know we were embarking on an Odyssey when we began. However, the release of WS-ResourceTransfer -- remarkably similar to OGSI! -- suggests that we may soon reach this journey's end.
Also worthy of celebration is the tremendous growth in the size of the Globus developer community. In the beginning, there were just three of us, plus a few partners such as Craig Lee at the Aerospace Corp. The team grew over time, as talented researchers and developers joined us at Argonne, the University of Chicago and USC Information Sciences Institute, and then other organizations partnered with us, notably the National Center for Supercomputing Applications (Jim Basney, Von Welch and others), the University of Edinburgh (Malcolm Atkinson, Neil Chue Hong, Mark Parsons and others) and PDC in Sweden (Olle Mulmo and others). Most recently, the new dev.globus development process (modeled after that of Apache Jakarta) has partitioned Globus into dozens of independent projects, each with its own developers, and opened the way for new projects to join. The response has been enthusiastic: under the leadership of Jennifer Schopf, our new incubator process already has 11 incubator projects up and running.
We have learned a tremendous amount in the past 10 years. It is hard to know where to start in terms of summarizing lessons learned, but here are a few thoughts.
We were clearly correct in identifying large-scale collaboration as an important problem, and in choosing science as a good place to start identifying requirements and experimenting with solutions. We have seen the need to federate data and computing, orchestrate the allocation of resources to different purposes and manage the policies that govern these activities become increasingly important, first across science and now in industry too. Indeed, these questions are arguably now central to the critical question of how innovation occurs within and across organizations.