Revision as of 16:31, March 15, 2017

The purpose of this page is to gather user applications that serve as poster children for HHAT.

Please use this approach:

Create a new subsection for each application, with two equal signs and a space around the title of each app
Include the content in the template below

CORAL apps

Collaboration of Oak Ridge, Argonne and Livermore

APEX apps

Alliance for Application Performance at Extreme Scale

ECP apps

Exascale Computing Project

PASC apps

Platform for Advanced Scientific Computing, Switzerland

More information about this initiative will follow shortly. In general, Switzerland is funding development projects for libraries and applications with strong performance-portable code in areas such as weather and climate, materials science, and molecular dynamics. Many of the computational kernels are based on linear algebra; others are based on finite differences.

GridTools

One of the projects is producing a set of C++ header-based libraries for finite differences in weather and climate applications. The main idea is to let the numeric operators be expressed in a grid-agnostic way where possible, while the grid, whether it represents a local region or the whole globe, is plugged in as a second step. The libraries provide a means of composing the different operators, which increases the computational intensity of otherwise memory-bound stencils. They also allow boundary conditions to be specified in a very flexible way, and they perform nearest-neighbor communication and domain decomposition. The central component of the set of libraries is the composition of different operators. All of the libraries, however, have backends that execute the requested tasks on specific architectures. x86-based multicores and NVIDIA GPUs are currently supported; Xeon Phi support is at an early stage of implementation. One of the goals we are pursuing is to orchestrate the different activities (stencil execution, boundary conditions, communications) using some form of dynamic scheduling. Employing a more dynamic execution policy within each computational phase is not currently considered urgent, since the schedule of operations is essentially known at compile time. Future directions may include adopting a more dynamic approach at both the high and low levels if such an integration benefits performance.
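The effect of operator composition on a memory-bound stencil can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the GridTools API: the names `laplacian_at`, `apply_unfused`, and `apply_fused` are hypothetical, the grid is reduced to a 1D array, and boundary points are simply left at zero. The operator is written against an abstract accessor, so it can read either from a stored field or from another operator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; none of these names come from GridTools.

// A stencil operator written against an "accessor": any callable returning
// the input value at index i. The operator never touches grid storage, so
// its input can be a stored field or the result of another operator.
template <typename Accessor>
double laplacian_at(const Accessor& in, std::ptrdiff_t i) {
    return in(i - 1) - 2.0 * in(i) + in(i + 1);
}

// Unfused: two sweeps over memory with a temporary field in between.
// Points too close to the boundary are left at 0.
std::vector<double> apply_unfused(const std::vector<double>& u) {
    const auto n = static_cast<std::ptrdiff_t>(u.size());
    assert(n >= 5 && "need a point with two neighbors on each side");
    std::vector<double> tmp(u.size(), 0.0), out(u.size(), 0.0);
    auto read_u = [&u](std::ptrdiff_t i) { return u[static_cast<std::size_t>(i)]; };
    auto read_tmp = [&tmp](std::ptrdiff_t i) { return tmp[static_cast<std::size_t>(i)]; };
    for (std::ptrdiff_t i = 1; i < n - 1; ++i)
        tmp[static_cast<std::size_t>(i)] = laplacian_at(read_u, i);
    for (std::ptrdiff_t i = 2; i < n - 2; ++i)
        out[static_cast<std::size_t>(i)] = laplacian_at(read_tmp, i);
    return out;
}

// Fused: the outer operator reads from the inner operator instead of from
// memory. One sweep, no temporary field, more arithmetic per byte loaded.
std::vector<double> apply_fused(const std::vector<double>& u) {
    const auto n = static_cast<std::ptrdiff_t>(u.size());
    assert(n >= 5 && "need a point with two neighbors on each side");
    std::vector<double> out(u.size(), 0.0);
    auto read_u = [&u](std::ptrdiff_t i) { return u[static_cast<std::size_t>(i)]; };
    auto lap_u = [&read_u](std::ptrdiff_t i) { return laplacian_at(read_u, i); };
    for (std::ptrdiff_t i = 2; i < n - 2; ++i)
        out[static_cast<std::size_t>(i)] = laplacian_at(lap_u, i);
    return out;
}
```

Both functions compute the same values at interior points. The fused version recomputes some inner results but avoids writing and re-reading the temporary field, which is the trade that raises computational intensity.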

ISV apps

Sandia's Task-DAG R&D 2014-2016

Sandia's Task-DAG LDRD report

Sandia conducted a three-year laboratory directed research and development (LDRD) effort to explore an on-node, performance-portable parallel pattern based on a directed acyclic graph (DAG) of tasks, covering usage algorithms, the application programmer interface, scheduling algorithms, and implementations. Significantly, this LDRD used C++ meta-programming to achieve performance portability across CPU and NVIDIA GPU (CUDA) architectures. The above document is the final report for this R&D.

The prototype developed through this LDRD is currently (2017) being matured (overhauled) to address performance issues and elevate it to production quality. This effort is scheduled for delivery within Kokkos by September 2017.
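For readers unfamiliar with the DAG-of-tasks pattern mentioned above, the following is a minimal serial sketch. It is not the Kokkos interface or the LDRD prototype: the class `TaskGraph` and its methods `add_task` and `run` are hypothetical names chosen for illustration. A task becomes ready once all of its predecessors have finished; a real runtime would hand ready tasks to worker threads or a GPU rather than run them inline.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical sketch; not the Kokkos or Sandia LDRD interface.
class TaskGraph {
public:
    using TaskId = std::size_t;

    // Registers a task that may run only after every task in `deps` has
    // finished. Dependencies must already exist, so no cycle can be built.
    TaskId add_task(std::function<void()> work,
                    const std::vector<TaskId>& deps = {}) {
        const TaskId id = tasks_.size();
        for (TaskId d : deps) {
            if (d >= id) throw std::invalid_argument("unknown dependency");
        }
        tasks_.push_back({std::move(work), deps.size(), {}});
        for (TaskId d : deps) tasks_[d].successors.push_back(id);
        return id;
    }

    // Runs every task once in an order that respects the edges and returns
    // that order. Ready tasks are executed inline, one at a time.
    std::vector<TaskId> run() {
        std::vector<std::size_t> pending(tasks_.size());
        std::queue<TaskId> ready;
        for (TaskId id = 0; id < tasks_.size(); ++id) {
            pending[id] = tasks_[id].num_deps;
            if (pending[id] == 0) ready.push(id);
        }
        std::vector<TaskId> order;
        while (!ready.empty()) {
            const TaskId id = ready.front();
            ready.pop();
            tasks_[id].work();
            order.push_back(id);
            // Finishing a task may release its successors.
            for (TaskId s : tasks_[id].successors) {
                if (--pending[s] == 0) ready.push(s);
            }
        }
        assert(order.size() == tasks_.size());
        return order;
    }

private:
    struct Task {
        std::function<void()> work;
        std::size_t num_deps;
        std::vector<TaskId> successors;
    };
    std::vector<Task> tasks_;
};
```

The scheduling policy here is a plain first-in, first-out ready queue. Choosing which ready task to run next, and where, is the part that scheduling research of this kind explores.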

TRALEIKA GLACIER X-STACK Project

Final Technical Report

The XStack Traleika Glacier (XSTG) project was a three-year research award for exploring a revolutionary exascale-class machine software framework. The XSTG program, which included Intel, UC San Diego, Pacific Northwest National Lab, UIUC, Rice University, Reservoir Labs, ET International, and U. Delaware, produced major accomplishments, insights, and products over this three-year effort.

Its technical artifacts were primarily 1) a novel hardware architecture (Traleika Glacier) and a simulator for this architecture, 2) a specification of a DAG parallel, asynchronous tasking, low-level runtime called the Open Community Runtime (OCR), 3) several implementations of OCR including a reference implementation and the PNNL optimized OCR implementation (P-OCR), 4) the layering of several higher level programming models on top of OCR including CnC, HClib, and HTA, and 5) implementation of several DoE mini-apps and other applications on top of OCR or the higher level programming models.

Apps included:

Smith-Waterman
Cholesky decomposition
Two NWChem kernels (Self-Consistent Field and Coupled Cluster methods)
CoMD
HPGMG

Habanero Tasking Micro-Benchmark Suite

GitHub page

This micro-benchmarking suite is a work in progress intended to compare low-level overheads across low-level tasking runtimes (e.g. Realm, OCR). The above GitHub page includes a high-level description of each micro-benchmark, as well as source code for each micro-benchmark across a variety of low-level runtimes. These micro-benchmarks were curated from the performance regression suites of a variety of tasking runtimes, and so are intended to enable one-to-one comparison of runtime efficiencies (as far as possible).
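As a rough idea of what comparing low-level tasking overheads involves, the sketch below times the spawn-and-wait cost of empty tasks through a small adapter. It is not taken from the Habanero suite: `OverheadResult`, `measure_spawn_overhead`, and the adapter convention are assumptions made for illustration. Each runtime under test would supply its own adapter, which launches the given task body and blocks until it has finished.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical harness; not part of the Habanero micro-benchmark suite.
struct OverheadResult {
    std::size_t tasks_run;  // number of task bodies that actually executed
    double ns_per_task;     // mean wall-clock time per spawn-and-wait
};

// `spawn_and_wait` adapts one runtime: it must launch `body` as a task and
// return only after that task has completed.
template <typename SpawnAndWait>
OverheadResult measure_spawn_overhead(SpawnAndWait&& spawn_and_wait,
                                      std::size_t num_tasks) {
    std::size_t executed = 0;
    // Near-empty task body, so the measured time is dominated by the runtime.
    auto body = [&executed] { ++executed; };

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < num_tasks; ++i) spawn_and_wait(body);
    const auto stop = std::chrono::steady_clock::now();

    // Adapter contract: every task has finished by the time it returns.
    assert(executed == num_tasks);

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    const double mean_ns =
        num_tasks ? total_ns / static_cast<double>(num_tasks) : 0.0;
    return {executed, mean_ns};
}
```

An adapter that simply calls `body()` inline gives a baseline for the loop itself. An adapter for the C++ standard library could call `std::async(std::launch::async, body).get()`, and subtracting the baseline gives an estimate of that runtime's per-task overhead.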

Categories of Hierarchical Algorithms

David Keyes volunteered to offer several such categories.

Add more

Template

Application 1

Brief description of app and its business importance
Brief description of app domain
Qualitative or quantitative analysis of where and how it would benefit from HHAT
Expected timetable for delivery of a solution (e.g. readiness for the arrival of a new supercomputer at a USG lab), and resources available to implement it with HHAT
Purpose: identify apps that could be lead vehicles driving the development of an open source project and serve as poster children that build confidence for others to follow


Applications: Difference between revisions

From Modelado Foundation