Scientific Libraries
From Modelado Foundation

Sonia requested that Milind Kulkarni initiate this page. For comments, please contact Milind.

Project | PI
XPRESS | Ron Brightwell
TG | Shekhar Borkar
DEGAS | Katherine Yelick
D-TEC | Daniel Quinlan
DynAX | Guang Gao
X-TUNE | Mary Hall
GVR | Andrew Chien
CORVETTE | Koushik Sen
SLEEC | Milind Kulkarni
PIPER | Martin Schulz

Questions:
- Describe how you expect to target (optimize/analyze) applications written using existing computational libraries
- Many computational libraries (e.g., Kokkos in Trilinos) provide support for managing data distribution and communication. Describe how your project targets applications that use such libraries.
- If your project aims to develop new programming models, describe any plans to integrate existing computational libraries into the model, or how you will transition applications written using such libraries to your model.
- What sorts of properties (semantics of computation, information about data usage, etc.) would you find useful to your project if captured by computational libraries?

Describe how you expect to target (optimize/analyze) applications written using existing computational libraries

XPRESS | Libraries written in C with MPI will run on XPRESS systems using the UH libraries combined with the ParalleX XPI/HPX interoperability interfaces. It is expected that future or important libraries will be developed employing the new execution methods/interfaces.
TG | The OCR scheduler will optimize execution of code generated by R-Stream.
DEGAS |
D-TEC | Where appropriate, library abstractions will be provided with compiler support (typically for finer-granularity abstractions at the expression or statement level). Source-to-source transformations will rewrite the code to leverage abstraction semantics, with program analysis used to identify the restricted contexts that allow generation of the most efficient code. Fundamentally, a library cannot see how its abstractions are used within an application, whereas the compiler can do so readily and can use that information to generate tailored code.
DynAX | We are focusing on ways to identify scalable and resilient data access and movement patterns and to express them efficiently in task-based runtimes. For computational libraries that do not already provide such semantics, alternative means must be found. (For instance, a LAPACK SVD call can be replaced with a distributed, more scalable equivalent.)
X-TUNE | Work on autotuning to select among code variants could be applied to libraries that provide multiple implementations of the same computation. The key idea is to build a model for variant selection based on features of the input data and to use it to make run-time selection decisions (see the sketch following this table).
GVR |
CORVETTE |
SLEEC |
PIPER | Support optimization efforts with tools that can capture some of the internal semantics of a given library (e.g., levels of a multigrid V-cycle or patches in an AMR library).
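
The following is a minimal sketch of the run-time variant selection idea described in the X-TUNE answer above; it is not project code. The two matrix-vector variants, the size-based feature, and the 512 threshold are hypothetical placeholders, and a real autotuner would derive the selection rule from measured benchmark data rather than hard-coding it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Two interchangeable implementations ("variants") of y = A*x for an
 * n x n row-major matrix.  A library might ship several such variants
 * tuned for different input regimes. */
static void matvec_rowwise(const double *A, const double *x, double *y, int n)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++)
            s += A[(size_t)i * n + j] * x[j];
        y[i] = s;
    }
}

static void matvec_colwise(const double *A, const double *x, double *y, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = 0.0;
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            y[i] += A[(size_t)i * n + j] * x[j];
}

/* Run-time variant selection driven by a feature of the input data
 * (here just the problem size n).  The threshold is a placeholder for a
 * model fitted from benchmarking runs. */
typedef void (*matvec_fn)(const double *, const double *, double *, int);

static matvec_fn select_variant(int n)
{
    return (n < 512) ? matvec_rowwise : matvec_colwise;
}

int main(void)
{
    int n = 1024;
    double *A = calloc((size_t)n * n, sizeof *A);
    double *x = calloc((size_t)n, sizeof *x);
    double *y = calloc((size_t)n, sizeof *y);
    if (!A || !x || !y) return 1;

    for (int i = 0; i < n; i++) { A[(size_t)i * n + i] = 2.0; x[i] = 1.0; }

    select_variant(n)(A, x, y, n);   /* dispatch to the chosen variant */
    printf("y[0] = %g\n", y[0]);

    free(A); free(x); free(y);
    return 0;
}
```

Keeping the selection logic separate from the kernels means the decision rule (or a learned model) can be replaced without touching the library's variants.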

Many computational libraries (e.g., Kokkos in Trilinos) provide support for managing data distribution and communication. Describe how your project targets applications that use such libraries.

XPRESS | This issue is unresolved.
TG | The OCR tuning-hints framework can be used for user-directed management of data and communication.
DEGAS |
D-TEC | We expect to leverage existing libraries and runtime systems (most commonly implemented as libraries) as needed. The X10 runtime system will be used, for example, to abstract communication between distributed-memory processors. Other communication libraries (e.g., MPI) are being used both to simplify the generation of code by the compiler and to leverage specific semantics that, with program analysis, can be used to rewrite application code to make it more efficient and/or exploit specific exascale hardware features (a generic sketch of such a communication/computation overlap follows this table).
DynAX | Such libraries often have their own system/runtime requirements. If those requirements line up with the requirements of the application, no further adaptation is necessary. Otherwise, such a library could be used through some form of adaptation layer, or the algorithm could simply be ported to run directly on the necessary software stack. This demonstrates a need for interoperability, which we feel is an area that needs further exploration.
X-TUNE | There is an opportunity to apply autotuning to such decisions.
GVR |
CORVETTE |
SLEEC | N/A
PIPER | PIPER will provide stack-wide instrumentation to facilitate optimization; access to internal information known only to the library should be exported to tools through appropriate APIs (preferably through similar and interoperable APIs).
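
The D-TEC answer above mentions rewriting MPI application code to better overlap communication and computation. The self-contained example below is a generic illustration of that pattern (a nonblocking halo exchange overlapped with local work), not output of the D-TEC tools; the 1-D ring layout and the interior/boundary kernels are invented for the example.

```c
/* Build: mpicc overlap.c -o overlap ; run: mpirun -np 2 ./overlap */
#include <mpi.h>
#include <stdio.h>

#define N 1024

/* Work that does not depend on the neighbor's data. */
static void compute_interior(double *u)
{
    for (int i = 1; i < N - 1; i++)
        u[i] = 0.5 * (u[i - 1] + u[i + 1]);
}

/* Work that needs the halo value received from the neighbor. */
static void compute_boundary(double *u, double halo)
{
    u[0] = 0.5 * (halo + u[1]);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double u[N];
    for (int i = 0; i < N; i++)
        u[i] = (double)rank;

    int left  = (rank - 1 + size) % size;   /* ring neighbors */
    int right = (rank + 1) % size;

    /* Post nonblocking transfers, overlap them with interior work,
     * then wait and finish the boundary work that needs remote data.
     * (A blocking version would exchange first and only then compute,
     * serializing communication and computation.) */
    double halo_recv;
    MPI_Request reqs[2];
    MPI_Irecv(&halo_recv, 1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&u[N - 1],  1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    compute_interior(u);                    /* overlapped with communication */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    compute_boundary(u, halo_recv);

    printf("rank %d: u[0] = %g\n", rank, u[0]);
    MPI_Finalize();
    return 0;
}
```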

If your project aims to develop new programming models, describe any plans to integrate existing computational libraries into the model, or how you will transition applications written using such libraries to your model.

XPRESS | Low-level, system-oriented libraries such as STDIO will be employed by the LXK and HPX systems, among others. No scientific libraries per se will be built into the systems as intrinsics below the compiler level. Over time, many libraries will be ported to the ParalleX model for dramatic improvements in efficiency and scalability.
TG | The R-Stream compiler.
DEGAS |
D-TEC | Our research focuses more on how to build compiler support for programming models than on any specific DSL or programming model. However, specific work relative to MPI leverages MPI semantics to rewrite application code (via compiler source-to-source transformations) to better overlap communication and computation. This serves as one of many building blocks from which to construct DSLs that would implement numerous programming models. Other work on leveraging semantics in existing HPC code targets rewriting that code for both single and multiple GPUs per node; this work leverages several OpenMP runtime libraries.
DynAX | The HTA programming model may result in a new generation of computational libraries.
X-TUNE | N/A
GVR |
CORVETTE |
SLEEC | N/A
PIPER | N/A

What sorts of properties (semantics of computation, information about data usage, etc.) would you find useful to your project if captured by computational libraries?

XPRESS | Libraries crafted in a form that eliminates global barriers, works on globally addressed objects, and exploits message-driven computation would greatly facilitate porting conventional rigid models to future dynamic, adaptive, and scalable models such as the ParalleX-based methods.
TG | Affinities, priorities, accuracy expectations, and critical/non-critical tasks and data.
DEGAS |
D-TEC | Libraries should present simple user-level APIs with clear semantics. A relatively coarse granularity of semantics is required to keep library use from contributing to abstraction penalties. Appropriate properties for libraries include data handling specific to adaptive mesh refinement, data management associated with many-core optimizations, etc. Actual use or fine-grained access of data abstractions via libraries can be a problem for general-purpose compilers to optimize.
DynAX | Wide availability and compatibility with multiple runtimes would help reduce effort. The ability to tune performance not only for a single library call but across the application as a whole would also be beneficial. Algorithms vary widely in their data access patterns, which means that, for a particular algorithm, some data distributions are much more suitable than others. An application developer may have full control of the data's distribution before the application calls the computational library, but has no idea what data access pattern the library uses internally; as a result, performance is lost by rearranging data unnecessarily. Some feedback from the library would help prevent that kind of performance loss (see the sketch following this table).
X-TUNE | Affinity information would be helpful.
GVR |
CORVETTE |
SLEEC |
PIPER | Attribution information about internal data structures (e.g., data distributions, patch information for AMR) as well as phase/time-slice information.
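
As an illustration of the kind of library feedback asked for in the DynAX answer above, here is a minimal sketch of a hypothetical query interface through which a library could report its preferred data distribution before the caller hands data over. The distlib_preferred_layout function, the dist_hint descriptor, and the tile sizes are invented for this example and do not correspond to any existing library API.

```c
#include <stdio.h>

/* Hypothetical descriptor for how a library wants its input laid out
 * across ranks/nodes.  Nothing here corresponds to a real library API;
 * it only illustrates what "distribution feedback" could look like. */
typedef enum { DIST_BLOCK_ROW, DIST_BLOCK_COL, DIST_2D_BLOCK_CYCLIC } dist_kind;

typedef struct {
    dist_kind kind;
    int       block_rows;   /* preferred tile height */
    int       block_cols;   /* preferred tile width  */
} dist_hint;

/* A library routine could export its preferred distribution for a given
 * problem size, letting the caller lay out data once instead of having
 * the library silently redistribute it on entry. */
static dist_hint distlib_preferred_layout(long nrows, long ncols, int nranks)
{
    dist_hint h = { DIST_2D_BLOCK_CYCLIC, 64, 64 };
    if (ncols == 1 || nranks == 1)   /* vectors or a single rank: keep it simple */
        h.kind = DIST_BLOCK_ROW;
    (void)nrows;                     /* unused in this toy heuristic */
    return h;
}

int main(void)
{
    dist_hint h = distlib_preferred_layout(100000, 100000, 256);
    printf("preferred layout: kind=%d, tile=%dx%d\n",
           (int)h.kind, h.block_rows, h.block_cols);
    return 0;
}
```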