DSL's: Difference between revisions

Revision as of 16:26, April 29, 2014

Sonia requested that Saman Amarasinghe and Dan Quinlan initiate this page. For comments, please contact them. This page is still in development.

X-Stack Project	Name of the DSL	URL	Target domain	Miniapps supported	Front-end technology used	Internal representation used	Key optimizations performed	Code generation technology used	Processors/computing models targeted	Current status	Summary of the best results
D-TEC	Halide	http://halide-lang.org	Image processing algorithms	CloverLeaf, miniGMG, BoxLib	Uses C++	Custom IR	Stencil optimizations (fusion, blocking, parallelization, vectorization). Schedules can express all levels of locality, parallelism, and redundant computation. OpenTuner is used for automatic schedule generation.	LLVM	x86 multicores, ARM, and GPUs	Working system. Used by Google and Adobe.	Local Laplacian filter: a top Adobe engineer took 3 months and 1,500 lines of code to get a 10x speedup over the original. The Halide version took 1 day and 60 lines and is 20x faster. In addition, GPU code that is 90x faster was produced the same day (Adobe did not even try GPUs). Also, all pictures taken by Google Glass are processed using a Halide pipeline.
D-TEC	Shared Memory DSL	http://rosecompiler.org	MPI HPC applications on many core nodes	Internal LLNL App	Uses C (maybe C++ and Fortran in future)	ROSE IR	Shared memory optimization for MPI processes on many core architectures permits sharing large data structures between processes to reduce memory requirements per core.	ROSE	Many core architectures with local shared memory	Implementation released (4/28/2014)	Being evaluated for use
DSL 3
DSL 4
DSL 5
DSL 6
DSL 7
DSL 8
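
The Halide row above says that schedules can trade off locality, parallelism, and redundant computation. The trade-off can be sketched in plain Python, assuming a two-stage, 3-point sum stencil in one dimension. This is not Halide code and does not use the Halide API; the function names (`stage`, `two_stage_unfused`, `two_stage_tiled`) and the tile size are hypothetical and chosen only for illustration.

```python
def stage(src):
    """3-point sum stencil: out[i] = src[i] + src[i+1] + src[i+2]."""
    return [src[i] + src[i + 1] + src[i + 2] for i in range(len(src) - 2)]


def two_stage_unfused(a):
    """Schedule 1: compute all of stage 1 into a full buffer, then stage 2."""
    b = stage(a)      # full intermediate buffer (poor locality for large inputs)
    return stage(b)


def two_stage_tiled(a, tile=4):
    """Schedule 2: fuse the stages per output tile (hypothetical tile size).

    Each tile recomputes the stage-1 values it needs, so the intermediate
    buffer stays small, at the cost of 2 redundant stage-1 values per tile
    boundary. Tiles are independent and could run in parallel.
    """
    n_out = len(a) - 4
    out = []
    for start in range(0, n_out, tile):
        end = min(start + tile, n_out)
        # Outputs [start, end) need stage-1 values [start, end + 2),
        # which in turn need inputs [start, end + 4).
        b_tile = stage(a[start:end + 4])
        out.extend(stage(b_tile))
    return out
```

Both functions compute the same result; only the order of evaluation and the size of the intermediate storage differ, which is the kind of choice a schedule captures separately from the algorithm.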

@@ Line 35: / Line 35: @@
 |Uses C (maybe C++ and Fortran in future)
 |ROSE IR
-|Share memory optimization for MPI processes on many core architectures permits sharing large data structures between processes to reduce memory requirements per core.
+|Shared memory optimization for MPI processes on many core architectures permits sharing large data structures between processes to reduce memory requirements per core.
 |ROSE
 |Many core architectures with local shared memory