OS/R Program Description

From Modelado Foundation

OS/R Program Motivations

The operating system and runtime system are critical components of the software stack for extreme-scale systems, and fundamental advancements are needed within these components to address several challenges facing applications:

  • Lightweight message and thread management: In order to hide latency and support dynamic programming environments, low-level message handling and lightweight thread activation must be co-optimized. New techniques are needed to handle lightweight and resilient message layers; scalable message-driven thread activation and fine-grained active messages; global address spaces; extremely large thread counts; buffer management, collective operations, and fast parallel reductions; thread scheduling and placement; and improved quality-of-service (QoS) and prioritization.
  • Holistic power management: Extreme-scale systems will manage power and energy as a first-class resource and as a crosscutting concern across all layers of software. Novel techniques are needed for whole-system monitoring and dynamic optimization; trading of energy for resilience or time to solution; power-aware scheduling and usage forecasts; goal-based feedback and control strategies; coscheduling; and adaptive power management of storage, computing, and bandwidth.
  • Resilience: Extreme-scale OS/Rs must support scalable mechanisms to predict, detect, inform, and isolate faults at all levels in the system. Therefore, resilience is a crosscutting concern. The OS/R must be resilient and support an array of low-level services to enable resilience in other software components, from the HPC application to the storage system. Innovative concepts to support multilevel, pluggable collection and response services are needed.
  • OS/R architecture: Extreme-scale systems need agile and dynamic node OS/Rs. New designs are needed for the node OS to support heterogeneous multicore, processor-in-memory, and HPC-customized hardware; I/O forwarding; autonomic fault response mechanisms; dynamic goal-oriented performance tuning; QoS management across thread groups, I/O, and messaging; support for fine-grained work tasks; and efficient mechanisms to support coexecution of compute and in situ analysis.
  • Memory: Deep hierarchies, fixed power budgets, in situ analysis, and several levels of solid-state memory will dramatically change memory management, data movement, and caching in extreme-scale OS/Rs. Clearly needed are novel designs for lightweight structures to support allocating, managing, moving, and placing objects in memory; methods to dynamically adapt thread affinity; techniques to manage memory reliability; and mechanisms for sharing and protecting data across colocated, coscheduled processes.
  • Global OS/R: Extreme-scale platforms must be run as whole systems, managing dynamic resources with a global view. New concepts and implementations are needed to enable collective tuning of dynamic groups of interacting resources; scalable infrastructure for collecting, analyzing, and responding to whole-system data such as fault events, power consumption, and performance; reusable and scalable publish/subscribe infrastructures; distributed and resilient RAS (reliability, availability, and serviceability) subsystems; feedback loops for tuning and optimization; and dynamic power management.

OS/R Program Goals

The scale and complexity of exascale platforms require a radically new structure for operating systems and runtime software. Traditional OS/R functions and interfaces will need to be completely redesigned in order to incorporate breakthroughs in energy, power, parallelism, resilience, and management of memory and storage hierarchies. Radically new functions and interfaces will be required for system control, management of resources, communications, thread management, synchronization, power management, recovery from faults, initialization, configuration, monitoring, load balancing, and migration of work. OS/R software architecture will be very different from today’s architectures, requiring the coordination of dynamically adaptive services distributed across multiple layers of OS/R software components. Projects under the OS/R program are expected to address these concerns and to lead to platform-neutral prototypes of a complete exascale operating system and runtime system. These projects are expected to address the following capabilities:

  • Power management: In order to keep exascale machines within a tight power budget of 20 MW, we will need unprecedented techniques for global system monitoring; dynamic co-optimization of energy, resilience, and performance; and goal-based control strategies for adaptive power management of system resources.
  • Support for dynamic programming environments: Managing billions of threads executing on complex, energy-constrained, heterogeneous, and energy-aware platforms will require new dynamic and adaptive mechanisms for thread activation, scheduling, placement, and message layers.
  • Programmability and tuning support: Exascale systems require low-level mechanisms that support dynamic adaptation, debugging, and joint performance-power optimization. This will require innovation in extremely lightweight thread and task-based mechanisms that support correctness, debugging, and performance analysis tools as well as autonomic, real-time performance, power, and resilience tuning.
  • Resilience: Extremely high rates of errors and faults are expected to impact the entire exascale system stack, from the hardware to the application level. Significant responsibility for resilience resides with the OS/R layer, demanding innovative, scalable mechanisms that independently and concurrently predict, detect, contain, and recover from faults in the different components of the system (nodes, groups of nodes, domains, partitions, enclaves, etc.).
  • Heterogeneity: Handling the heterogeneous, hierarchical processor and memory systems of exascale platforms will require radically new OS/R mechanisms capable of dynamic goal-oriented management of heterogeneous resources, each with different power-performance-resilience characteristics.
  • Memory management: Achieving exascale requires the use of new, energy-efficient memory technologies. Using these new memory systems requires lightweight OS/R mechanisms that support innovative, dynamic allocation, management, movement, and placement of objects in memory; techniques to dynamically manage memory reliability; and adaptable mechanisms for sharing and protecting data across co-located, co-scheduled processes.
  • Global optimization: All execution components of an exascale platform must coordinate and execute as one whole system, managing dynamic resources with a global view. This will require new strategies to enable the management of dynamic groups of interacting resources; scalable mechanisms for collecting, analyzing, and managing data globally; and global feedback loops for dynamic power management and performance optimization.