X-ARCC: Difference between revisions
From Modelado Foundation
[http://crd.lbl.gov/about/staff/cds/ftg/eric-roman/ Eric Roman (LBNL)]
Revision as of 03:43, December 8, 2013
X-ARCC: Exascale Adaptive Resource-Centric Computing with Tessellation | |
---|---|
Team Members | |
PI | Steven Hofmeyr (LBNL) |
Co-PIs | John Kubiatowicz (UCB) |
Website | http://tessellation.cs.berkeley.edu |
We are exploring new approaches to Operating System (OS) design for exascale using Adaptive Resource-Centric Computing (ARCC). ARCC rests on two mechanisms: dynamic resource allocation, which adaptively assigns resources to applications, and Quality-of-Service (QoS) enforcement, which prevents interference between components. We have embodied ARCC in Tessellation (http://tessellation.cs.berkeley.edu/publications/pdf/tess-paper-DAC-2013.pdf), an OS designed for multicore nodes. In this project, our goal is to explore how ARCC can address issues in exascale systems by extending Tessellation with new features for multiple nodes. This requires addressing multinode cell synchronization, distributed resource accounting, and topology-aware resource control. Rather than emphasizing component development for an exascale OS, we are focusing our efforts on high-risk, high-reward topics related to novel OS mechanisms and designs.
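To make the combination of adaptive allocation and QoS enforcement concrete, the following is a minimal illustrative sketch, not Tessellation's actual allocator. The `Cell` class, its `guaranteed` and `demand` fields, and the `allocate` function are all hypothetical names introduced here. The sketch assumes each cell holds a fixed core guarantee that is never taken away, and that any spare cores go, one at a time, to whichever cell currently has the largest unmet demand.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical resource container; not Tessellation's real data structure."""
    name: str
    guaranteed: int  # cores reserved for this cell (its QoS floor)
    demand: int      # cores the cell currently asks for


def allocate(cells, total_cores):
    """Return a {cell name: cores} map for one pass of the adaptive loop.

    Every cell first receives its guarantee, so no cell can be starved by
    another. Remaining cores are handed out one by one to the cell with
    the largest unmet demand.
    """
    alloc = {c.name: c.guaranteed for c in cells}
    spare = total_cores - sum(alloc.values())
    if spare < 0:
        raise ValueError("guarantees exceed available cores")

    while spare > 0:
        # Pick the cell whose demand exceeds its allocation by the most.
        neediest = max(cells, key=lambda c: c.demand - alloc[c.name])
        if neediest.demand - alloc[neediest.name] <= 0:
            break  # all demand is satisfied; leave the rest idle
        alloc[neediest.name] += 1
        spare -= 1
    return alloc


if __name__ == "__main__":
    cells = [Cell("sim", guaranteed=2, demand=6),
             Cell("io", guaranteed=2, demand=2)]
    print(allocate(cells, total_cores=8))

    # Demand shifts; rerunning the allocator adapts the assignment
    # while "sim" still keeps its guaranteed cores.
    cells[0].demand, cells[1].demand = 2, 10
    print(allocate(cells, total_cores=8))
```

Rerunning `allocate` whenever measured demand changes gives the adaptive behavior, while the fixed guarantees stand in for QoS enforcement. A real system would also have to account for memory, bandwidth, and the cost of moving resources between cells, which this sketch ignores.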
We are exploring several questions in the context of a multinode Tessellation:
- What OS support is needed for new global address space programming models and task-based parallel programming models? To explore this, we are porting UPC and Habanero to run on multinode Tessellation, using GASNet as the underlying communication layer. Our test cases for these runtimes on Tessellation are a subset of the Co-design proxy apps, which we use as representatives of potential exascale applications.
- How should the OS support advanced memory management, including mechanisms for user-level paging, locality-aware memory allocation, and multicell shared memory?
- How do we extend hierarchical adaptive resource allocation and control across multiple nodes, including heterogeneous nodes such as the Intel MIC?
- How should the OS manage the trade-off between power and performance optimizations? Will the Tessellation approach of treating both power and other resources (cores, memory) as first-class citizens within the adaptive loop be adequate?
- What OS abstractions are needed for the durable QoS-guaranteed storage that is essential to resilience?