
-----------------------
INRIA at the heart of Grid Computing research
September 2003
-----------------------

Ensuring efficient communication between software components

Using computing grids, more and more scientists and engineers will be able to perform complex numerical simulations that couple together specific codes for several distinct phenomena. Consider simulating the behavior of a satellite in space, which must simultaneously take into account kinematic, thermal, mechanical and optical aspects. Such simulations will have to rely on parallel technology to meet performance requirements and on distributed technology to meet the need for computing resources. This assumes that the parallel and distributed aspects are sufficiently compatible.
The PARIS team (Programming parallel and distributed systems for large scale numerical simulation) is working in this direction and is striving to develop a high-performance software component model adapted to grids. Starting from the CORBA component model, the team is extending it into a software component model that integrates parallel codes. This new model is called GridCCM (Grid CORBA Component Model). It must in particular guarantee efficient communication between parallel software components, which previously existing models did not. PARIS has therefore designed a communication management framework called PadicoTM. This framework lets parallel and distributed services work together without conflicts and without any modification to the applications. PadicoTM was on display at Supercomputing 2002. It is intended for code coupling applications based on the concept of parallel CORBA objects. Even though CORBA is generally regarded as rather slow, PadicoTM achieves throughput of up to 240 MB/s and latencies of 20 µs. This level of performance is comparable to that of MPI (Message-Passing Interface, a message-passing standard used in parallel programming). Still in the context of code coupling, PARIS is also studying the problem of data globalization, that is, how to store the data needed or produced by the simulation in a distributed fashion over the grid.
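To give a feel for what the quoted figures mean in practice, here is a minimal sketch of the usual first-order cost model for a message, time = latency + size / bandwidth, filled in with the 20 µs and 240 MB/s quoted above. The model itself, the function names and the choice of 1 MB = 10^6 bytes are assumptions made for illustration; this is not PadicoTM code.

```python
# First-order communication cost model (illustrative assumption, not PadicoTM code).
LATENCY_S = 20e-6        # 20 microseconds, figure quoted in the text
BANDWIDTH_BPS = 240e6    # 240 MB/s, assuming 1 MB = 10**6 bytes


def transfer_time(message_bytes, latency_s=LATENCY_S, bandwidth_bps=BANDWIDTH_BPS):
    """Estimated time in seconds to deliver one message of the given size."""
    return latency_s + message_bytes / bandwidth_bps


def effective_bandwidth(message_bytes):
    """Payload bytes per second achieved once latency is taken into account."""
    return message_bytes / transfer_time(message_bytes)


if __name__ == "__main__":
    # Hypothetical message sizes: small messages are dominated by latency,
    # large ones approach the peak bandwidth.
    for size in (1_000, 100_000, 10_000_000):
        t = transfer_time(size)
        print(f"{size:>10} bytes: {t * 1e6:10.1f} us, "
              f"{effective_bandwidth(size) / 1e6:6.1f} MB/s effective")
```

Under this model a 4800-byte message already spends as long in latency as in transmission, which is why both figures matter when coupling codes that exchange many small messages.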

Guaranteeing rapid access to data, coupling computing codes

In a grid, one of the bottlenecks can be the speed of access to data files, which will certainly be huge. Part of the work of the APACHE team (Parallel algorithmics, programming and load sharing) addresses this aspect. APACHE has been interested in parallel architecture programming for several years. The team is now developing methods for parallelizing, and thus accelerating, data access through PC clusters. It is also designing methods to couple together computing codes executed on different nodes of the grid in such a way that the various tasks carried out by these codes follow one another optimally. APACHE has developed its programming techniques using a cluster of 200 i-Vectra PCs supplied by Hewlett-Packard. This cluster ranked 385th in the TOP500 list of the most powerful machines worldwide in June 2001. On these subjects, APACHE collaborates closely with companies such as Bull, HP, Mandrake and CS. The team is validating its techniques on concrete applications such as modeling how proteins cross cellular membranes, in collaboration with chemists, biologists and mathematicians.

Optimally distributing tasks and data for computation

One of the problems tackled by computer scientists is the design of appropriate algorithms for computing on a grid. The various tasks in a given computation must be ordered and the data must be placed in such a way that the computation executes optimally given the hardware configuration and the current state of the grid. This question is among the concerns of the ReMaP team (Regularity and Massive Parallelism). The team is looking for heuristic methods for task scheduling and data placement, which are validated via simulation. The simulation is done using the SIMGRID simulator designed and developed by the University of California at San Diego, with the participation of ReMaP. Project ReMaP is also developing software layers that let a client of the grid export certain parts of a computation to other grid nodes or servers. Such software layers must for example choose the most appropriate server at any given instant, i.e., the least loaded, the fastest, or the one best suited to the task in question. They must also choose the data that must be supplied to the servers. ReMaP relies on CORBA (Common Object Request Broker Architecture, a software system that is now a standard for linking applications implemented on heterogeneous platforms) to develop its toolbox called DIET (Distributed Interactive Engineering Toolbox). DIET will be demonstrated at Supercomputing 2002 on an Ethernet network connecting 6 laptops. The toolbox has been developed with support from the RNTL and is being tested on various applications (digital terrain models, simulation of electronic circuits) in the framework of project ASP of the GRID concerted initiative.
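The server-selection step described above can be sketched as a simple heuristic: among the servers that offer the requested service, pick the one with the shortest estimated completion time given its speed and current load. The `Server` fields, the service names and the cost estimate are all hypothetical and chosen for illustration; this is not the DIET API or ReMaP's actual heuristic.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    speed: float          # work units per second when idle (hypothetical unit)
    load: float           # fraction of capacity currently busy, 0.0 <= load < 1.0
    services: frozenset   # names of the problems this server can solve


def estimated_time(server, work):
    """Estimated seconds to finish `work` units on the server's spare capacity."""
    return work / (server.speed * (1.0 - server.load))


def choose_server(servers, service, work):
    """Return the suitable server with the smallest estimated time, or None."""
    candidates = [s for s in servers if service in s.services and s.load < 1.0]
    if not candidates:
        return None
    return min(candidates, key=lambda s: estimated_time(s, work))


if __name__ == "__main__":
    # Hypothetical grid state: A is fast but busy, B is slower but nearly idle,
    # C is the fastest but does not offer the requested service.
    grid = [
        Server("A", speed=10.0, load=0.8, services=frozenset({"terrain_model"})),
        Server("B", speed=4.0, load=0.1, services=frozenset({"terrain_model"})),
        Server("C", speed=20.0, load=0.0, services=frozenset({"circuit_sim"})),
    ]
    best = choose_server(grid, "terrain_model", work=100.0)
    print(best.name if best else "no suitable server")
```

Here the slower but lightly loaded server wins, which illustrates why "fastest" and "least loaded" have to be weighed together rather than applied separately.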

A Java program library for distributed parallel computing

Part of the activities of project OASIS (Active Objects, Semantics, Internet and Security) has similar objectives and is concerned with programming tools for distributed applications, whether on a local intranet, on a workstation cluster or on Internet grids. In particular, the team is developing a program library written entirely in Java for distributed parallel computing, in the framework of the ObjectWeb consortium founded by France Télécom R&D, Bull and INRIA. This library, called ProActive, can be used to perform mobile computations, that is to say computations initiated on one machine of the grid and continued on another machine. It also includes security tools such as encryption of data exchanges and user authentication. ProActive has many more attractive features: dynamic and transparent code loading, online documentation, ease of installation and use, visualization and graphical control of program execution. ProActive will be demonstrated at Supercomputing 2002. OASIS is validating its ProActive development on electromagnetism computations (aircraft radar image computations). The ProActive library is available on the Internet under the LGPL license. It has already been downloaded by numerous academic and industry users.
The OASIS team very recently succeeded in executing an application that solves the 3D Maxwell equations of electromagnetism on a 64-processor cluster, with practically optimal speedup. The application was developed in collaboration with project CAIMAN and written entirely in Java using ProActive. Executing the same application in peer-to-peer mode over the INRIA intranet, on desktop machines and the standard production network, made it possible to solve a 150 × 150 × 150 mesh, that is to say over 100 million facets, on 252 processors.

Adapting communication protocols to heterogeneous infrastructures and very high speed

Network connections, a central element of future grids, can today deliver considerable throughput. The American network TeraGrid, which links the National Science Foundation computing centers, reaches 40 gigabits per second, and the European network GEANT, which connects the main European capitals, reaches 10 gigabits per second. It is however crucial that such capabilities be used to the full. This entails designing communication protocols and performance measurement and prediction tools that are adapted to very high speeds and heterogeneous infrastructures. This is one of the main tasks of the RESO team (Protocols and Software Optimized for Heterogeneous High Speed Networks). The IP and TCP protocols used on the Internet are already old and are adapted neither to very high speeds nor to the grid concept. RESO is studying how these protocols might evolve. One development proposed by the team is to introduce service differentiation, that is to say to vary the priority given to packets, together with optimized transport protocols. The idea is to allow the very heavy flows generated by a grid to take advantage of the periods when regular Internet traffic is low, in order to fully exploit the available capacity.
RESO is carrying out this work in collaboration with national projects (the VTHD network of the RNRT, the e-Toile platform of the RNTL) and the international projects DataGrid and DataTAG. DataGrid is a European project that aims to design a platform and software for a data grid serving particle physics, Earth monitoring and biology. It relies on European networks such as GEANT and national networks such as RENATER. The objective of the DataTAG project (Data TransAtlantic Grid) is to interconnect European and American grids via very high speed links.
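The service-differentiation idea described above, in which heavy grid flows use whatever capacity regular traffic leaves idle, can be illustrated with a small strict-priority simulation. The slot-based link model, the function name and the traffic figures are assumptions made for illustration; this is not RESO's protocol.

```python
def schedule_slots(regular_arrivals, bulk_backlog, capacity):
    """Simulate a link that serves regular traffic first in each time slot.

    regular_arrivals: regular packets arriving in each slot (list of ints)
    bulk_backlog:     low-priority grid packets waiting at the start
    capacity:         packets the link can send per slot
    Returns a list of (regular_sent, bulk_sent) tuples, one per slot.
    """
    regular_queue = 0
    sent = []
    for arrivals in regular_arrivals:
        regular_queue += arrivals
        # High priority: regular packets take as much of the slot as they need.
        regular_sent = min(regular_queue, capacity)
        regular_queue -= regular_sent
        # Low priority: bulk grid packets only fill the capacity left over.
        bulk_sent = min(bulk_backlog, capacity - regular_sent)
        bulk_backlog -= bulk_sent
        sent.append((regular_sent, bulk_sent))
    return sent


if __name__ == "__main__":
    # Hypothetical traffic: busy, quiet, idle, then overloaded.
    for slot, (reg, bulk) in enumerate(schedule_slots([10, 3, 0, 12], 15, 10)):
        print(f"slot {slot}: regular={reg} bulk={bulk}")
```

In this toy run the bulk transfer makes progress only in the two quiet slots, and regular traffic is never delayed by it.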

I-Cluster 2: a shared experimental platform

INRIA teams carry out their research activities using experimental platforms. I-Cluster 2, currently being installed in the INRIA Rhône-Alpes research unit, provides the institute with its most powerful supercomputer yet. Its architecture is based on dual-processor Itanium 2 nodes communicating through a Myrinet network. A total of 104 dual-processor nodes at 900 MHz, with 312 GB of RAM, are arranged as 10 racks of 10 nodes and 1 rack of 4 nodes with additional disk storage. I-Cluster 2 is connected to the VTHD network and runs Linux (RedHat Advanced Server). The first Linpack experiments at INRIA (August 2003) reached a performance of 560 GFlop/s.
I-Cluster 2 is part of a scientific program financed by the French Ministry of Research and Education, the Rhône-Alpes region, INRIA, Ecole Normale Supérieure de Lyon, the Institut National Polytechnique of Grenoble and Joseph Fourier University.
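As a rough sanity check of the I-Cluster 2 figures quoted above, the sketch below derives a theoretical peak and compares it with the measured Linpack result. The value of 4 floating-point operations per cycle per processor is an assumption (two fused multiply-add units), as is reading the 312 GB as the total memory of the machine; neither is stated in the text.

```python
NODES = 104                # from the text
PROCESSORS_PER_NODE = 2    # from the text
CLOCK_HZ = 900e6           # 900 MHz, from the text
FLOPS_PER_CYCLE = 4        # ASSUMPTION: two fused multiply-add units per processor
LINPACK_GFLOPS = 560.0     # measured figure quoted in the text
TOTAL_RAM_GB = 312         # ASSUMPTION: quoted figure is the machine total


def peak_gflops(nodes=NODES, procs=PROCESSORS_PER_NODE,
                clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak performance in GFlop/s."""
    return nodes * procs * clock_hz * flops_per_cycle / 1e9


def linpack_efficiency(measured_gflops=LINPACK_GFLOPS):
    """Measured Linpack performance as a fraction of the theoretical peak."""
    return measured_gflops / peak_gflops()


if __name__ == "__main__":
    print(f"theoretical peak: {peak_gflops():.1f} GFlop/s")
    print(f"Linpack efficiency: {linpack_efficiency():.1%}")
    print(f"RAM per node: {TOTAL_RAM_GB / NODES:.1f} GB")
```

Under these assumptions the measured 560 GFlop/s corresponds to roughly three quarters of the theoretical peak.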

Other INRIA projects

Other INRIA teams are carrying out research that concerns data or computing grids in one way or another. For example, the ARES team (Architecture for Service Networks), which works on problems related to service deployment on radio network infrastructures, is developing a platform called DARTS (Deployment and Administration of Resources, Processing and Services). In the context of grids, this platform can be used for administering and instrumenting the computing resources available at different points of the grid. It also supplies services that simplify the interaction between the various administered components (asynchronous messaging, naming, dynamic application loading). The platform also offers facilities for application administration (deployment, porting).
Researchers in project SARDES (Constructing Software Infrastructures for Large Scale, Heterogeneous, Distributed Systems) are studying the architecture and design of distributed software infrastructures for global information processing environments, by systematically using reflection and component building techniques (a reflective system can be defined as a system offering an explicit, operable and causally connected representation of itself).

Project CARAVEL (Information Mediation Systems) is concerned with the problem of integrating information in networks that contain heterogeneous, autonomous information sources, as is the case for grids. The aim is to offer a uniform mode of access to a set of information sources through an integrated view, to facilitate the construction and maintenance of coherent data warehouses, and to offer modes of navigation in an information network that are adapted to different categories of users.
Finally, project ScAlApplix (High Performance Schemes and Algorithms for Complex Scientific Applications) brings together several scientific skills for a multidisciplinary study of high performance computing and its applications to complex scientific computations (simulations of chemical reactions, unsteady fluid flows, host-parasite systems and so on) that require massive computing power on the order of a teraflop and very large volumes of data on the order of a terabyte. In addition to modeling and simulation techniques and high performance algorithms, project ScAlApplix is also working on the visualization and steering of distributed numerical simulations using a virtual reality code.

 

© INRIA - updated 23/09/2003 - webmaster@inria.fr