Open Community Runtime

From Modelado Foundation

Revision as of 07:38, November 20, 2014 by imported>VivekSarkar

Goal

The goal of the Open Community Runtime (OCR) project is to propose, implement, and evaluate a runtime framework and API that:

  • Is representative of future execution models
  • Can express large amounts of parallelism in a task-based model
  • Can explicitly capture logical dependences and data movements
  • Can be targeted by multiple high-level programming systems
  • Can be mapped efficiently onto future extreme-scale platforms
  • Is available as an open-source testbed

Audience

In its current state, the project aims to release a prototype implementation of a reference API by September 2015, while keeping work in progress visible at the GitHub location listed below. The project is therefore geared mostly towards early adopters: application developers who would like to provide feedback on the runtime model and API, implementers of higher-level languages and models who want to determine whether their model can map to OCR, hardware developers who would like to experiment with OCR as a proxy for an exascale runtime, and runtime developers who are interested in using or contributing to OCR.

Value of the OCR project

The OCR project is creating an application building framework that explores new methods of high-core-count programming with an initial focus on HPC applications. The project aims to explore, among other things:

  • Expressiveness: how can application programmers express their applications in a hardware-agnostic manner? That is, how can they express the intrinsic parallelism and locality of an application rather than its mapping onto a particular kind of hardware?
  • Scheduling and data placement: what tuning hints and adaptive heuristics can enable locality-aware scheduling and data-placement on an exascale system?
  • Introspection: how can a runtime system improve its execution characteristics by monitoring itself?
  • Resiliency: how can a runtime system deal with failures in an exascale system?

The OCR project aims to propose a low-level API to address the challenges of exascale programming and is meant to be targeted by higher-level abstractions. Early implementations of higher-level abstractions that target OCR include the CnC programming model, the Habanero-C library, and the Habanero-UPC++ library.

Timeline

The OCR software is available under the BSD open source license.

  • Initial unveiling at SC 2012
  • v0.8 was introduced at SC 2013 with a significant rewrite to increase modularity and enable more community participation
  • v0.9 will be presented at SC 2014 with several updates (including support for execution on distributed-memory clusters)

Links

  • The source code is available on GitHub
  • Mailing lists are available here

Install Instructions for Distributed-OCR

The 'sc14' GitHub branch contains the most recent publicly available OCR implementation. The steps below will be streamlined over time.

Checking out the code

git clone https://github.com/01org/ocr.git
cd ocr
git branch sc14 --track origin/sc14
git checkout sc14

Environment setup

From the checked-out ocr folder, set up the following environment variables.

export OCR_INSTALL_ROOT=$PWD/install
export OCR_SRC=$PWD
export OCR_BUILD_ROOT=$PWD/build
export APPS_ROOT=$PWD/ocr-apps
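
Before building anything, it can help to confirm that all four variables are actually set in the current shell. The following is a minimal sketch, not part of OCR: the function name check_ocr_env is hypothetical, and it only checks that each variable is non-empty, not that the paths exist.

```shell
# Hypothetical helper (not shipped with OCR): report which of the four
# variables from the setup step are unset or empty.
check_ocr_env() {
    missing=0
    for name in OCR_INSTALL_ROOT OCR_SRC OCR_BUILD_ROOT APPS_ROOT; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    # Returns 0 when every variable is set, 1 otherwise.
    return "$missing"
}
```

Calling check_ocr_env right after the exports prints one line per variable and returns a non-zero status if any of them is missing.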

Compiling and Running Applications

The ocr-apps folder contains various applications. Those applications rely on pre-defined Makefiles to ease compiling and linking against the OCR interface.

The general pattern to get an OCR application to run is as follows:

  • Select a target platform
  • Invoke the corresponding Makefile to build the application
  • Select an OCR configuration file
  • Invoke the corresponding Makefile to run the application

For instance, the npbCG application defines Makefiles for the following targets:

  • x86-pthread-x86: for shared-memory OCR
  • x86-pthread-mpi: for distributed-memory OCR relying on MPI for communications.
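
To see which targets a given application supports, one can list its Makefiles. The sketch below assumes the Makefile.<target> naming used by the npbCG build command in this page; the function name list_targets is hypothetical and other applications may be organized differently.

```shell
# Hypothetical helper: print the target names for which an application
# directory provides a Makefile, assuming files are named Makefile.<target>.
list_targets() {
    app_dir=$1
    for f in "$app_dir"/Makefile.*; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        # Strip everything up to and including "Makefile.".
        echo "${f##*/Makefile.}"
    done
}
```

For example, list_targets /path/to/app prints one target name per line, and prints nothing if the directory has no such Makefiles.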

Hence, to build npbCG for distributed-memory OCR, execute the following command:

make -f Makefile.x86-pthread-mpi prerun install

Note that the Makefile also checks whether the runtime itself needs to be built.

Once the application is built for distributed-memory OCR, a configuration file must be provided. OCR relies on text-based configuration files to set up runtime instances. Some sample configuration files are available under machine-configs/.

export OCR_CONFIG=/path/to/configfile

Additionally, one can point to a file containing the hostnames to use for the distributed run:

export OCR_NODEFILE=/path/to/nodefile
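
A mistyped path in either variable is easy to miss until the run fails. The following sketch, which is not part of OCR, checks that the files are readable before launching. The function name check_run_inputs is hypothetical, and treating OCR_NODEFILE as optional is an assumption based on the wording above.

```shell
# Hypothetical helper: verify that the files named by OCR_CONFIG and
# (if set) OCR_NODEFILE are readable. It does not validate their contents.
check_run_inputs() {
    if [ -z "${OCR_CONFIG:-}" ] || [ ! -r "$OCR_CONFIG" ]; then
        echo "OCR_CONFIG is unset or not readable: ${OCR_CONFIG:-}" >&2
        return 1
    fi
    # Assumption: the node file is optional, so check it only when set.
    if [ -n "${OCR_NODEFILE:-}" ] && [ ! -r "$OCR_NODEFILE" ]; then
        echo "OCR_NODEFILE is not readable: $OCR_NODEFILE" >&2
        return 1
    fi
    echo "run inputs look usable"
}
```

Running check_run_inputs after the exports returns a non-zero status and prints a message on standard error if either path cannot be read.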

Running the application can be done through the Makefile system as well:

make -f Makefile.x86-pthread-mpi run WORKLOAD_ARGS="-t B -b 1000"
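
The build and run steps can be combined in a small wrapper. This is a sketch under stated assumptions rather than an official script: the function name build_and_run and the MAKE override are hypothetical, and it assumes the function is called from the application's directory, where the Makefile.<target> files live.

```shell
# Hypothetical wrapper: build, then run, an OCR application for one target.
# Setting MAKE to another command (e.g. MAKE=echo) previews the invocations.
build_and_run() {
    target=$1          # e.g. x86-pthread-mpi
    workload_args=$2   # e.g. "-t B -b 1000"
    # Build step; stop here if it fails.
    "${MAKE:-make}" -f "Makefile.$target" prerun install || return 1
    # Run step; the workload arguments are passed as one make variable.
    "${MAKE:-make}" -f "Makefile.$target" run WORKLOAD_ARGS="$workload_args"
}
```

With OCR_CONFIG (and, for a distributed run, OCR_NODEFILE) already exported, a call such as build_and_run x86-pthread-mpi "-t B -b 1000" issues the same two make commands shown in this section.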