Actions

X-ARCC

From Modelado Foundation

Revision as of 03:49, December 8, 2013 by imported>Shofmeyr
X-ARCC: Exascale Adaptive Resource Centric Computing with Tesselation
Xarcc.png
PI Steven Hofmeyr (LBNL)
Chief Scientist {{{chief-scientist}}}
Website http://tessellation.cs.berkeley.edu

We are exploring new approaches to Operating System (OS) design for exascale using Adaptive Resource-Centric Computing (ARCC). The fundamental basis of ARCC is dynamic resource allocation for adaptive assignment of resources to applications, combined with Quality-of-Service (QoS) enforcement to prevent interference between components. We have embodied ARCC in Tessellation, an OS designed for multicore nodes. In this project, our goal is to explore the potential for ARCC to address issues in exascale systems, by extending Tessellation with new features for multiple nodes. This requires addressing aspects such as multinode cell synchronization, distributed resource accounting, and topology-aware resource control. Rather than emphasizing component development for an exascale OS, we are focusing our efforts on high-risk, high-reward topics related to novel OS mechanisms and designs.

There are several aspects we are exploring in the context of a multinode Tessellation:

  • What OS support is needed for new global address space programming models and task-based parallel programming models? To explore this, we are porting UPC and Habanero to run on multinode Tessellation, using GASNet as the underlying communication layer. Our test-cases for these runtimes on Tessellation are a subset of the Co-design proxy apps, which we use as representatives of potential exascale applications.
  • How should the OS support advanced memory management, including mechanisms for user-level paging, locality-aware memory allocation and multicell shared memory?
  • How do we extend hierarchical adaptive resource allocation and control across multiple nodes, including heterogeneous nodes such as the Intel Mic?
  • How should the OS manage the trade-off between power and performance optimizations? Will the Tessellation approach of treating both power and other resources (cores, memory) as first class citizens within the adaptive loop be adequate?
  • What OS abstractions are needed for the durable QoS-guaranteed storage that is essential to resilience?


Team Members

Steven Hofmeyr (LBNL)

John Kubiatowicz (UCB)

Eric Roman (LBNL)