Actions

SLEEC: Difference between revisions

From Modelado Foundation

imported>Milind
No edit summary
 
(5 intermediate revisions by 2 users not shown)
Line 4: Line 4:
| imagecaption =  
| imagecaption =  
| team-members = [http://www.purdue.edu/ Purdue U.], [http://www.sandia.gov/ SNL]
| team-members = [http://www.purdue.edu/ Purdue U.], [http://www.sandia.gov/ SNL]
| pi = Milind Kulkarni (Purdue)
| pi = [[Milind Kulkarni]]
| co-pi = Arun Prakash (Purdue), Michael Parks (SNL)
| co-pi = Arun Prakash (Purdue), Michael Parks (SNL)
| website = https://engineering.purdue.edu/SLEEC
| website = https://engineering.purdue.edu/SLEEC
Line 77: Line 77:
* Needed: ''a generic infrastructure for incorporating domain knowledge''
* Needed: ''a generic infrastructure for incorporating domain knowledge''


== Project Impact ==
* [https://xstackwiki.modelado.org/images/b/b9/SLEEC-Highlight_summary.pdf SLEEC Project Impact]


== Principles ==
== Principles ==
Line 214: Line 217:
=== Software Releases ===
=== Software Releases ===


* '''SemCache''' Runtime system for managing communication between CPUs and GPUs. Integration with Trilinos/Kokkos in progress.
* '''SemCache''' Will provide annotated, concrete, domain-specific libraries that use SLEEC technology to automatically manage communication between CPU and GPU for codes using Trilinos/Kokkos. Integration in progress.


= Presentations/Papers =
=== Presentations/Papers ===


* "Exploiting Domain Knowledge to Optimize Parallel Computational Mechanics Codes." ICS 2013 [https://engineering.purdue.edu/~milind/docs/ics13b.pdf (PDF)]
* "Exploiting Domain Knowledge to Optimize Parallel Computational Mechanics Codes." ICS 2013 [https://engineering.purdue.edu/~milind/docs/ics13b.pdf (PDF)]
Line 222: Line 225:
* SLEEC 2014 Progress Presentation [https://engineering.purdue.edu/SLEEC/SLEEC-2014.pdf (PDF)]
* SLEEC 2014 Progress Presentation [https://engineering.purdue.edu/SLEEC/SLEEC-2014.pdf (PDF)]
* "SemCache++: Semantics-aware Caching for Efficient Multi-GPU Offloading." ICS 2015. [https://engineering.purdue.edu/~milind/docs/ics15.pdf (PDF)]
* "SemCache++: Semantics-aware Caching for Efficient Multi-GPU Offloading." ICS 2015. [https://engineering.purdue.edu/~milind/docs/ics15.pdf (PDF)]
* "Exploiting Domain Knowledge to Optimize Mesh Partitioning for Multiscale Methods." Supercomputing 2015 (Poster). [https://engineering.purdue.edu/~milind/docs/sc15.pdf (PDF)]
* "Optimizing the LULESH Stencil Code using Concurrent Collections." WolfHPC 2015. [https://engineering.purdue.edu/~milind/docs/wolfhpc15.pdf (PDF)]

Latest revision as of 05:03, July 10, 2023

SLEEC
SLEEC-Logos.png
Team Members Purdue U., SNL
PI Milind Kulkarni
Co-PIs Arun Prakash (Purdue), Michael Parks (SNL)
Website https://engineering.purdue.edu/SLEEC
Download {{{download}}}

Semantics-rich Libraries for Effective Exascale Computation or SLEEC

Team Members


Motivations

  • Modern computational science applications composed of many different libraries
  • Computational libraries, communication libraries, data structure libraries, etc.
  • Peridigm, developed by co-PI Mike Parks, builds on 10 different Trilinos libraries
  • Each library has its own idioms and expected usage
  • Determining right way to compose and use libraries to solve a problem is difficult

Compositional complexity

  • Consider loosely-coupled multi-scale computational mechanics problem (developed by co-PI Arun Prakash)
  • Must determine right way to decompose problem, couple separate solutions, etc.

SLEEC-Compositional-Complexity.png


  • Simple case: fixed number of subdomains, only consider how to couple them together
  • Vast space of configurations: 8 subdomains → 135K possible schedules
  • Large variation in performance of different orders
  • Exploration of different variants requires knowledge of domain semantics, cost estimates

SLEEC-Runtime-Graph.png


Difficult interaction between libraries

  • Peridigm: computational peridynamics code
    • Allows modeling of materials under stress without explicit accounting for discontinuities (fractures, etc.)
  • Built on Trilinos components
    • Set of computation and communication libraries
  • Requires careful coordination of data movement operations to manage shadow data, etc. needed by solvers
    • But data movement requirements can be directly inferred from which equations are being solved

Prior Results

  • Exploiting library semantics to improve lock placement

SLEEC-Prior-Results-1.png

  • Exploiting library semantics to improve parallelism and locality

SLEEC-Prior-Results-2.png

Why not compilers

  • Compilers do not understand library calls as abstractions
    • Option one: see them as black boxes which give no information → no opportunity for optimization
    • Option two: break abstraction boundaries and try to optimize → many transformation opportunities are only possible by understanding semantics of abstractions
  • Needed: a way for compilers to understand abstractions
    • Broadway project attempted this, but focused on analyzing across abstractions, not semantics-driven transformations

Why not domain-specific languages?

  • DSLs are a great fit for this
    • Bake abstractions into the language
    • Optimize code at high level of abstraction based on semantic properties
    • Shown to be effective in various domains
      • SPL/Spiral for digital signal processing, Tensor contraction engine, etc.
  • But they are not generalizable
    • New domain? New DSL!
    • What about applications that span domains? (e.g., multiphysics codes)
  • Needed: a generic infrastructure for incorporating domain knowledge

Project Impact

Principles

  • Abstractions carried by domain libraries
    • Domain experts encode semantics, not compiler writers
    • Need effective annotation language for capturing semantics
  • Compiler should be domain agnostic
    • Same infrastructure used for optimization and transformation regardless of domain
    • Need common IR for capturing abstractions
  • Compiler should be able to optimize for various objectives
    • Do not want to focus solely on performance
    • Need generic optimization ability and cost models


Components

SLEEC-Overview.png


  • Annotation language for capturing semantic properties of domain libraries
  • High-level intermediate representation to represent programs that use annotated domain libraries
  • Transformation strategies that leverage annotations to perform semantics-driven code transformations
  • Optimization heuristics that use domain-specific cost models to find more efficient program variants
  • Iterative refinement techniques that let the compiler work with incomplete information and infer missing information when possible

Example

  • Consider annotated linear algebra library that supports two methods
    • Matrix multiply
    • Equation solve
  • Operations have mathematical properties that establish equivalence
    • Can solve ABx = b in two ways:
      • C = AB followed by Cx = b
      • Az = b followed by Bx = z
    • Latter may be more effective if A & B have special properties (e.g., triangular)

Program Code SLEEC-Program-Code.png


Abstract

  • Abstract into high level representation
  • Expression tree to capture flow of data
  • Library methods represented as high level operations
  • Operands can be subtrees, too, to support composition

SLEEC-Abstract-Transform.png

Transform

  • Transformations expressed as rewrite rules on expression trees
  • Rewrites match operation types (domain specific) but compiler applies them without understanding domain semantics

Concretize

  • Re-materialize back to source code, or transform to other, lower-level IR

SLEEC-Concretize.png

Annotation Language

  • Domain libraries annotated by domain experts to interface with compiler infrastructure
  • Questions
    • How to abstract libraries into IR
    • What kinds of transformations are legal
      • Represent as rewrite rules
      • How to verify? Can we synthesize?
    • How to concretize
      • Can this be inferred?

Cost models

  • Most annotations deal with library interface
    • Semantic properties are associated with library specification, not implementation
  • Can also provide cost estimates for library methods
    • Implementation and architecture specific
  • Can express other properties of implementation
    • Energy estimates
    • Accuracy information

Compiler Infrastructure

  • Compiler does not explicitly understand domains
    • But is extensible, allowing IR to be extended as new domains are added
  • Transformations are just pattern-matched rewrite rules
    • Can use domain-specific information such as domain-specific equivalences, domain-specific properties
    • Can also substitute equivalent implementations of same method
  • Generic compiler + annotated domain library = domain-specific compiler

Cost-drive optimization

  • Applying transformations to program generates semantically equivalent program variants
    • No “best” variant: different implementations will work better in different situations or optimize for different metrics
  • Compilation as optimization problem
    • Minimize objective function
      • FLOPs, energy efficiency, etc.
    • Subject to constraints
      • Semantically equivalent to original program, meets accuracy constraints, etc.
  • Same infrastructure can be used to optimize for a variety of metrics

Iterative Refinement

  • Typical problem with domain-specific languages or annotation approaches: what if program is incompletely annotated?
  • Want compiler to still produce useful results
  • Key property: compilation process is about optimization, not correctness
    • Lack of information does not raise correctness issues
    • As more annotations are provided, compilation results improve

Inference

  • Can we infer missing information?
  • Transformation annotations
    • Can we use synthesis techniques to infer legal transformations?
  • Cost models
    • Can we use machine learning techniques to build cost models automatically?


Potential Impacts

  • Programmability: Programmers can focus on developing methods, using high level libraries, without worrying about careful optimization
  • Performance portability: Ability to select between library variants automatically eases transition to new architectures
  • Scalability: Cost models can incorporate parallelism, locality, communication to enhance scalability
  • Energy efficiency: Parameterized compilation can optimize for energy use instead of performance without rewriting infrastructure
  • Resilience: Cost models can incorporate resilience information (e.g., algorithmic fault tolerance information), compilation can choose variants based on resilience properties


Implementation Plan

  • Work driven by “challenge” applications and domains
    • Computational mechanics and multiscale techniques (lead: Arun Prakash)
    • Peridynamics and Trilinos libraries (lead: Michael Parks)
  • Build compiler infrastructure in ROSE
    • Compiler infrastructure and optimization strategies (leads: Milind Kulkarni and Sam Midkiff)
    • Annotation language and IR (leads: Milind Kulkarni and Sam Midkiff)
    • Cost models and performance modeling (lead: Vijay Pai)


Concrete Deliverables

  • Annotation language
  • Common IR
  • Generic compiler infrastructure
  • “Showcase” annotated libraries

Products

Software Releases

  • SemCache Will provide annotated, concrete, domain-specific libraries that use SLEEC technology to automatically manage communication between CPU and GPU for codes using Trilinos/Kokkos. Integration in progress.

Presentations/Papers

  • "Exploiting Domain Knowledge to Optimize Parallel Computational Mechanics Codes." ICS 2013 (PDF)
  • "SemCache: Semantics-aware Caching for Efficient GPU Offloading." ICS 2013. (PDF)
  • SLEEC 2014 Progress Presentation (PDF)
  • "SemCache++: Semantics-aware Caching for Efficient Multi-GPU Offloading." ICS 2015. (PDF)
  • "Exploiting Domain Knowledge to Optimize Mesh Partitioning for Multiscale Methods." Supercomputing 2015 (Poster). (PDF)
  • "Optimizing the LULESH Stencil Code using Concurrent Collections." WolfHPC 2015. (PDF)