
MPI for Exascale


Evolving MPI to Address the Challenges of Exascale Systems

Rajeev Thakur, Pavan Balaji, Marc Snir, Ewing Lusk
Mathematics and Computer Science Division
Argonne National Laboratory
Email: {thakur, balaji, snir, lusk}@mcs.anl.gov

Project Goals

The vast majority of DOE’s parallel scientific applications running on the largest HPC systems are written in a distributed-memory style using MPI as the standard interface for communication between processes. These application codes represent billions of dollars’ worth of investment. As we transition from today’s petascale systems to exascale systems by the end of this decade, it is not clear what the right programming model for the future will be. Until a viable alternative to MPI is available and large DOE application codes have been ported to it, MPI must evolve to run as efficiently as possible on future systems. This requires that both the MPI standard and MPI implementations address the challenges posed by the architectural features, limitations, and constraints expected in future post-petascale and exascale systems.

The most critical issue is likely to be interoperability with intranode programming models that use a high thread count. This requirement has implications both for the definition of the MPI standard itself (being considered now in the MPI Forum, in which we are major participants) and for MPI implementations. Other important issues, which also affect both the standard and its implementation, include scalability, performance, enhanced functionality based on application experience, and topics that become more significant on the next generation of HPC architectures: memory utilization, power consumption, and resilience.
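
As an illustration of what thread interoperability means in practice, the sketch below is a minimal hybrid MPI + Pthreads program (the choice of four threads and the per-thread ring exchange are invented purely for illustration, not taken from any project code). It requests the MPI_THREAD_MULTIPLE support level, checks what the library actually provides, and then has every thread communicate independently, using the thread id as the message tag so that messages issued by different threads of the same process are matched correctly. Supporting this level efficiently at high thread counts is one of the implementation challenges referred to above.

    #include <mpi.h>
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    /* Each thread performs its own ring exchange, tagged with its thread id.
       Concurrent calls like this are legal only when the library provides
       MPI_THREAD_MULTIPLE. */
    static void *worker(void *arg)
    {
        int tid = *(int *)arg, rank, size, sendval, recvval;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        sendval = rank;
        MPI_Sendrecv(&sendval, 1, MPI_INT, (rank + 1) % size, tid,
                     &recvval, 1, MPI_INT, (rank - 1 + size) % size, tid,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank %d, thread %d received %d\n", rank, tid, recvval);
        return NULL;
    }

    int main(int argc, char **argv)
    {
        pthread_t threads[NTHREADS];
        int ids[NTHREADS], provided;

        /* Request full multithreaded support and check what was provided. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) {
            fprintf(stderr, "This example requires MPI_THREAD_MULTIPLE\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        for (int i = 0; i < NTHREADS; i++) {
            ids[i] = i;
            pthread_create(&threads[i], NULL, worker, &ids[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);

        MPI_Finalize();
        return 0;
    }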

Our group at Argonne has been a leader in MPI from the beginning: we have participated in the MPI standardization effort; conducted research into implementing MPI efficiently, resulting in a large number of publications; and developed a high-performance, production-quality MPI implementation (MPICH) that has been adopted by leading vendors (IBM, Cray, Intel, Microsoft) and runs on most of the largest machines in the world. This project continues the ongoing MPI-related research and development work at Argonne, with the overall goal of enabling MPI to run effectively at exascale. Specific goals fall into three categories:

  1. Continued enhancement of the MPI standard through the MPI Forum by leading several of its subcommittees to ensure that the standard evolves to meet the needs of future systems and also of applications, libraries, and higher-level languages.
  2. Continued enhancement of the MPICH implementation of MPI to support the new features in future versions of the MPI standard (MPI-3 and beyond) and to address the specific challenges posed by exascale architectures, such as lower memory per core, higher thread concurrency, lower power consumption, scalability, and resilience (see the sketch after this list).
  3. Investigation of new programming approaches to be potentially included in future versions of the MPI standard, including generalized user-defined callbacks, lightweight tasking, and extensions for heterogeneous computing systems and accelerators.
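
To make item 2 concrete, the sketch below shows one MPI-3 feature that targets the lower-memory-per-core constraint: shared-memory windows. The ranks on a node share a single copy of a read-mostly table, obtained through MPI_Comm_split_type and MPI_Win_allocate_shared, instead of each rank holding its own copy. The table size and contents are invented for illustration; this is a sketch of the feature, not code from the project itself.

    #include <mpi.h>
    #include <stdio.h>

    #define TABLE_LEN 1000000          /* illustrative size */

    int main(int argc, char **argv)
    {
        MPI_Comm nodecomm;
        MPI_Win win;
        MPI_Aint winsize, qsize;
        int noderank, disp_unit;
        double *table;

        MPI_Init(&argc, &argv);

        /* Group the ranks that share a node into one communicator. */
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &nodecomm);
        MPI_Comm_rank(nodecomm, &noderank);

        /* Rank 0 on each node allocates the shared table; the others
           allocate zero bytes and query a pointer to rank 0's segment. */
        winsize = (noderank == 0) ? TABLE_LEN * sizeof(double) : 0;
        MPI_Win_allocate_shared(winsize, sizeof(double), MPI_INFO_NULL,
                                nodecomm, &table, &win);
        if (noderank != 0)
            MPI_Win_shared_query(win, 0, &qsize, &disp_unit, &table);

        /* Rank 0 fills the table once; fences separate initialization
           from use by the other node-local ranks. */
        MPI_Win_fence(0, win);
        if (noderank == 0)
            for (int i = 0; i < TABLE_LEN; i++)
                table[i] = (double)i;
        MPI_Win_fence(0, win);

        printf("node-local rank %d sees table[42] = %g\n",
               noderank, table[42]);

        MPI_Win_free(&win);
        MPI_Comm_free(&nodecomm);
        MPI_Finalize();
        return 0;
    }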

We have close ties with various DOE applications that are targeted to scale to exascale, as well as with the exascale codesign centers. We will work with these applications, particularly the mini-apps and skeleton codes from the codesign centers, to study the effectiveness of our MPI implementation and of the new features in the MPI standard. We will also continue our collaboration with vendors, particularly IBM, Cray, and Intel, to codesign MPICH so that it remains the leading implementation running on the fastest machines in the world.

MPICH Project Web Site

http://www.mpich.org

Publications

See the publications lists at http://www.mcs.anl.gov/~thakur/papers and http://www.mcs.anl.gov/~balaji/publications.php.

Highlights Slides

Highlights Slides, June 29, 2014