Traleika Glacier


Team Members	Intel, Reservoir Labs, ETI, UDEL, UC San Diego, Rice U., UIUC, PNNL
PI	Shekhar Borkar (Intel)
Co-PIs	Wilf Pinfold (Intel), Richard Lethin (Reservoir Labs), Rishi Khan (ETI), Guang Gao (UDEL), Laura Carrington (UC San Diego), Vivek Sarkar (Rice U.), David Padua (UIUC), Josep Torrellas (UIUC), John Feo (PNNL)
Website	https://sites.google.com/site/traleikaglacierxstack

Team Members

Intel: Hardware guidance, HW/SW co-design, resiliency, technical management
Reservoir Labs: Programming system, R-Stream, tools, optimization
ET International (ETI): Simulators, execution model and runtime support
University of Delaware (UDEL): Execution model research
University of California, San Diego (UC San Diego): Applications
Rice University: Programming system, runtime system
University of Illinois at Urbana-Champaign (UIUC): Programming system, Hierarchically Tiled Arrays (HTA), architecture, system architecture evaluation
Pacific Northwest National Laboratory (PNNL): Kernels and proxy apps for evaluation
Sandia National Lab (SNL): Co-design lead, combustion proxy app

Goals and Objectives

Goal:

Research and mature software technologies that address major Exascale challenges, and have them ready for intercept by 2018-2020

Objectives:

Energy efficiency: SW components interoperate, harmonize, exploit HW features, and optimize the system for energy efficiency
Data locality: PGM system & system SW optimize to reduce data movement
Scalability: SW components scalable, portable to O(10^9) (extreme parallelism)
Programmability: New (Codelet) & legacy (MPI), with gentle slope for productivity
Execution model: Objective function based, dynamic, global system optimization
Self-awareness: Dynamically respond to changing conditions and demands
Resiliency: Asymptotically provide reliability of N-modular redundancy using HW/SW co-design; HW detection, SW correction

Publications

Intel

Romain Cledat, Sagnak Tasirlar (Rice University) and Rob Knauerhase (Intel), Programmer Obliviousness is Bliss: Ideas for Runtime-Managed Granularity. To be published at HotPar ’13, June 24, 2013, San Jose, CA - https://www.usenix.org/conference/hotpar13
Shekhar Borkar, How to stop interconnects from hindering the future of computing!, Optical Interconnects Conference, May 2013
Shekhar Borkar, Exascale Computing—a fact or a fiction?, IPDPS, May 2013
Birds-of-a-Feather session at SuperComputing12, November 14, 2012. See the OCR homepage at https://01.org/projects/open-community-runtime.

University of Delaware

Joshua Suetterlein, Stephane Zuckerman, and Guang R. Gao, An Implementation of the Codelet Model. To be published in the proceedings of the 19th International European Conference on Parallel and Distributed Computing (EuroPar 2013), August 26-30, Aachen, Germany.
Chen Chen, Yao Wu, Stephane Zuckerman, and Guang R. Gao. Towards Memory-Load Balanced Fast Fourier Transformations in Fine-Grain Execution Models. To be published in Proceedings of 2013 Workshop on Multithreaded Architectures and Applications (MTAAP 2013). 27th IEEE International Parallel & Distributed Processing Symposium, May 24, Boston, MA, USA.
Aaron Myles Landwehr, Stephane Zuckerman, Guang R. Gao. Toward a Self-Aware System for Exascale Architectures. CAPSL Technical Memo 123, June 2013.

Rice University

Integrating Asynchronous Task Parallelism with MPI. Sanjay Chatterjee, Sağnak Taşırlar, Zoran Budimlić, Vincent Cavé, Milind Chabbi, Max Grossman, Yonghong Yan and Vivek Sarkar. 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), May 2013, Boston, MA.
Compiler Optimization of an Application-specific Runtime, Kath Knobe and Zoran Budimlić, CPC 2013: 17th Workshop on Compilers for Parallel Computing, July 3-5, 2013, Lyon, France. (to appear).

University of California San Diego

Traleika Glacier X-Stack Overview, presented by Laura Carrington (UCSD) at the Fourth ExaCT All Hands Meeting, Sandia National Laboratories, May 14, 2013

Scope of the Project

Roadmap

Architecture

Straw-man System Architecture and Evaluation

Data-locality and BW Tapering, Why So Important?

Programming and Execution Models

Programming model

Separation of concerns: Domain specification & HW mapping
Express data locality with hierarchical tiling
Global, shared, non-coherent address space
Optimization and auto generation of codelets (HW specific)
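
As a rough illustration of expressing data locality with hierarchical tiling, the sketch below splits a one-dimensional index range into outer tiles and then splits each outer tile into inner tiles. It is not the HTA or R-Stream API; the function names and the reading of "outer" as a node-level tile and "inner" as a core-level tile are assumptions made for this example only.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open index range [start, end)


def tile_ranges(extent: int, tile: int) -> List[Range]:
    """Split [0, extent) into half-open ranges of at most `tile` elements."""
    return [(start, min(start + tile, extent)) for start in range(0, extent, tile)]


def hierarchical_tiles(extent: int, outer: int, inner: int) -> List[Tuple[Range, List[Range]]]:
    """Two-level tiling (hypothetical helper, not a project API).

    Each entry pairs an outer tile with the inner tiles that subdivide it,
    all expressed in global index coordinates.
    """
    result = []
    for o_start, o_end in tile_ranges(extent, outer):
        inner_ranges = [
            (o_start + s, o_start + e)
            for s, e in tile_ranges(o_end - o_start, inner)
        ]
        result.append(((o_start, o_end), inner_ranges))
    return result


# Ten elements, outer tiles of four elements, inner tiles of two elements.
tiles = hierarchical_tiles(10, outer=4, inner=2)
for outer_range, inner_ranges in tiles:
    print(outer_range, inner_ranges)
```

Keeping the tile hierarchy explicit in the data structure is what lets a separate mapping step assign outer tiles to one level of the machine and inner tiles to another, without changing the domain specification.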

Execution model

Dataflow-inspired, tiny codelets (self-contained)
Dynamic, event-driven scheduling, non-blocking
Dynamic decision to move computation to data
Observation-based adaptation (self-awareness)
Implemented in the runtime environment
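
The dataflow-inspired, event-driven behavior listed above can be sketched as follows. This is a minimal single-threaded toy, not the API of OCR, SWARM, or any other runtime named on this page; the `Codelet` and `Runtime` classes and their methods are hypothetical. A codelet becomes ready only when all of its inputs have been satisfied, then runs to completion without blocking and satisfies the inputs of its consumers.

```python
from collections import deque


class Codelet:
    """A small, self-contained unit of work with a dependence counter."""

    def __init__(self, name, fn, num_inputs):
        self.name = name
        self.fn = fn
        self.pending = num_inputs  # number of inputs not yet satisfied
        self.inputs = {}           # slot name -> value
        self.consumers = []        # (codelet, slot) pairs fed by this result


class Runtime:
    """Toy event-driven scheduler: codelets fire when their inputs arrive."""

    def __init__(self):
        self.ready = deque()
        self.results = {}

    def satisfy(self, codelet, slot, value):
        # Delivering an input is the "event"; the last one enables the codelet.
        codelet.inputs[slot] = value
        codelet.pending -= 1
        if codelet.pending == 0:
            self.ready.append(codelet)

    def run(self):
        while self.ready:
            codelet = self.ready.popleft()
            value = codelet.fn(**codelet.inputs)  # runs to completion
            self.results[codelet.name] = value
            for consumer, slot in codelet.consumers:
                self.satisfy(consumer, slot, value)


rt = Runtime()
a = Codelet("a", lambda x: x + 1, 1)
b = Codelet("b", lambda x: x * 2, 1)
c = Codelet("c", lambda left, right: left + right, 2)
a.consumers.append((c, "left"))
b.consumers.append((c, "right"))

rt.satisfy(a, "x", 3)
rt.satisfy(b, "x", 5)
rt.run()
print(rt.results)
```

Here `c` is never scheduled explicitly: it enters the ready queue only after both `a` and `b` have delivered their results, which is the dependence-driven firing rule the list describes.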

Separation of concerns

User application, control, and resource management

Programming System Components

Runtime

Different runtimes target different aspects
- IRR: targeted for Intel Straw-man architecture
- SWARM: runtime for a wide range of parallel machines
- DAR3TS: explore codelet PXM using portable C++
- Habanero-C: interfaces IRR, tie-in to CnC

All explore related aspects of the codelet Program Exec Model (PXM)

Goal: Converge towards Open Community Runtime (OCR)
- Enabling technology development for codelet execution
- Model systems, foster novel runtime systems research

Greater visibility through SW stack -> efficient computing
- Break OS/Runtime information firewall

Some Promising Results:

Runtime Research Agenda

Locality aware scheduling—heuristics for locality/E-efficiency
- Extensions to standard Habanero-C runtime

Adaptive boosting and idling of hardware
- Avoid energy-expensive unsuccessful steals that perform no work
- Turbo mode for a core executing serial code
- Fine-grained resource (including energy) management

Dynamic data-block movement
- Co-locate codelets and data
- Move codelets to data

Introspection and dynamic optimization
- Performance counters, sensors provide real-time information
- Optimization of the system for user-defined objective
- (Go beyond energy proportional computing)
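
One way to picture the "move codelets to data" item in the agenda above is a placement heuristic that runs a codelet on the worker that would have to fetch the fewest bytes. The sketch below is an assumption-laden illustration, not the Habanero-C extension or any project scheduler: the function names, the worker identifiers, and the use of bytes moved as the only cost are all hypothetical. A real objective could also weigh energy or load.

```python
from typing import Dict, List


def bytes_to_move(worker: str, inputs: List[str],
                  location: Dict[str, str], size: Dict[str, int]) -> int:
    """Bytes that must be fetched if the codelet runs on `worker`."""
    return sum(size[block] for block in inputs if location[block] != worker)


def place_codelet(workers: List[str], inputs: List[str],
                  location: Dict[str, str], size: Dict[str, int]) -> str:
    """Pick the worker that minimizes data movement (ties go to the first)."""
    return min(workers, key=lambda w: bytes_to_move(w, inputs, location, size))


# Hypothetical data blocks: where each one lives and how large it is.
location = {"A": "w0", "B": "w1", "C": "w1"}
size = {"A": 4096, "B": 1024, "C": 1024}

chosen = place_codelet(["w0", "w1"], ["A", "B", "C"], location, size)
print(chosen, bytes_to_move(chosen, ["A", "B", "C"], location, size))
```

In this example most blocks live on `w1`, but the largest block lives on `w0`, so counting bytes rather than blocks changes the decision.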

Simulators and Tools

Simulators—what to expect and not

Evaluation of architecture features for PGM and EXE models
Relative comparison of performance, energy
Data movement patterns to memory and interconnect
Relative evaluation of resource management techniques

Results Using Simulators

From Modelado Foundation

Revision as of 14:38, October 3, 2013 by imported>Jsstone1 (→Publications)
