X-Stack Project Publications: Difference between revisions

From Modelado Foundation

Revision as of 20:42, June 18, 2015

D-TEC: DSL Technology for Exascale Computing

List

  • Markus Schordan, Pei-Hung Lin, Dan Quinlan, and Louis-Noël Pouchet. Verification of polyhedral optimizations with constant loop bounds in finite state space computations. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications, volume 8803 of Lecture Notes in Computer Science, pages 493–508. Springer Berlin Heidelberg, 2014.

  • Chunhua Liao, Daniel J. Quinlan, Thomas Panas, and Bronis R. de Supinski. A ROSE-based OpenMP 3.0 research compiler supporting multiple runtime libraries. In Mitsuhisa Sato, Toshihiro Hanawa, Matthias S. Müller, Barbara M. Chapman, and Bronis R. de Supinski, editors, IWOMP, volume 6132 of Lecture Notes in Computer Science, pages 15–28. Springer, 2010.

  • Chunhua Liao, Yonghong Yan, Bronis R de Supinski, Daniel J Quinlan, and Barbara Chapman. Early experiences with the OpenMP accelerator model. In OpenMP in the Era of Low Power Devices and Accelerators, pages 84–98. Springer, 2013.

  • Dan Quinlan and Chunhua Liao. The ROSE source-to-source compiler infrastructure. In Cetus Users and Compiler Infrastructure Workshop, Galveston Island, TX, USA, October 2011.

  • Yonghong Yan, Pei-Hung Lin, Chunhua Liao, Bronis R. de Supinski, and Daniel J. Quinlan. Supporting multiple accelerators in high-level programming models. In Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM '15, pages 170–180, New York, NY, USA, 2015. ACM.

  • Pei-Hung Lin, Chunhua Liao, Daniel J. Quinlan, and Stephen Guzik. Experiences of using the OpenMP accelerator model to port DOE stencil applications, 2014. Poster presented at the Workshop on accelerator programming using directives, Nov. 17, 2014, New Orleans, LA.

  • Markus Schordan, Pei-Hung Lin, Dan Quinlan, and Louis-Noël Pouchet. Verification of parallel polyhedral transformations with arbitrary constant loop bounds, 2015. Under review for Euro-Par 2015.

  • Jonathan Ragan-Kelley. Decoupling Algorithms from the Organization of Computation for High Performance Image Processing. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, June 2014.

  • Jason Ansel. Autotuning Programs with Algorithmic Choice. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, February 2014.

  • Jeffrey Bosboom. StreamJIT: A commensal compiler for high-performance stream programming. S.M. thesis, Massachusetts Institute of Technology, Cambridge, MA, June 2014.

  • Eric Wong. Optimizations in stream programming for multimedia applications. M.Eng. thesis, Massachusetts Institute of Technology, Cambridge, MA, Aug 2012.

  • Phumpong Watanaprakornkul. Distributed data as a choice in PetaBricks. M.Eng. thesis, Massachusetts Institute of Technology, Cambridge, MA, Jun 2012.

  • Charith Mendis, Jeffrey Bosboom, Kevin Wu, Shoaib Kamil, Jonathan Ragan-Kelley, Sylvain Paris, Qin Zhao, and Saman Amarasinghe. Helium: Lifting high-performance stencil kernels from stripped x86 binaries to Halide DSL code. In ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015.

  • Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. OpenTuner: An extensible framework for program autotuning. In International Conference on Parallel Architectures and Compilation Techniques, Edmonton, Canada, August 2014.

  • Jeffrey Bosboom, Sumanaruban Rajadurai, Weng-Fai Wong, and Saman Amarasinghe. StreamJIT: A commensal compiler for high-performance stream programming. In ACM SIGPLAN Conference on Object-Oriented Programming Systems and Applications, Portland, OR, October 2014.

  • Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Seattle, WA, June 2013.

  • Phitchaya Mangpo Phothilimthana, Jason Ansel, Jonathan Ragan-Kelley, and Saman Amarasinghe. Portable performance on heterogeneous architectures. In The International Conference on Architectural Support for Programming Languages and Operating Systems, Houston, TX, March 2013.

[18] Maciej Pacula, Jason Ansel, Saman Amarasinghe, and Una-May O'Reilly. Hyperparameter tuning in bandit-based adaptive operator selection. In European Conference on the Applications of Evolutionary Computation, Malaga, Spain, Apr 2012.

[19] Jason Ansel, Maciej Pacula, Yee Lok Wong, Cy Chan, Marek Olszewski, Una-May O'Reilly, and Saman Amarasinghe. SiblingRivalry: Online autotuning through local competitions. In International Conference on Compilers Architecture and Synthesis for Embedded Systems, Tampere, Finland, Oct 2012.

[20] Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, and Frédo Durand. Decoupling algorithms from schedules for easy optimization of image processing pipelines. ACM Transactions on Graphics, 31(4), July 2012.

[21] Dan Alistarh, Patrick Eugster, Maurice Herlihy, Alexander Matveev, and Nir Shavit. StackTrack: An automated transactional approach to concurrent memory reclamation. In Proceedings of the Ninth European Conference on Computer Systems, EuroSys '14, pages 25:1–25:14, New York, NY, USA, 2014. ACM.

[22] Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Una-May O'Reilly, and Saman Amarasinghe. OpenTuner: An extensible framework for program autotuning. Technical Report MIT/CSAIL Technical Report MIT-CSAIL-TR-2013-026, Massachusetts Institute of Technology, Cambridge, MA, Nov 2013.

[23] Alexander Matveev and Nir Shavit. Reduced hardware NOrec: A safe and scalable hybrid transactional memory. In 20th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2015, Istanbul, Turkey, 2015. ACM.

[24] Sasa Misailovic, Michael Carbin, Sara Achour, Zichao Qi, and Martin C. Rinard. Chisel: Reliability- and accuracy-aware optimization of approximate computational kernels. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA '14, pages 309–328, New York, NY, USA, 2014. ACM.

[25] Zhilei Xu, Shoaib Kamil, and Armando Solar-Lezama. MSL: A synthesis enabled language for distributed implementations. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '14, pages 311–322, Piscataway, NJ, USA, 2014. IEEE Press.

[26] F. Augustin and Y. M. Marzouk. Uncertainty quantification in high performance computing (invited position paper). SIGPLAN Workshop on Probabilistic and Approximate Computing (APPROX), 2014.

[27] David Grove, Josh Milthorpe, and Olivier Tardieu. Supporting array programming in X10. In Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, ARRAY'14, pages 38:38–38:43, New York, NY, USA, 2014. ACM.

[28] Wei Zhang, Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, and Mikio Takeuchi. GLB: Lifeline-based global load balancing library in X10. In Proceedings of the First Workshop on Parallel Programming for Analytics Applications, PPAA '14, pages 31–40, New York, NY, USA, 2014. ACM.

[29] Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, Mikio Takeuchi, and Wei Zhang. X10 for Productivity and Performance at Scale: A Submission to the 2013 HPC Class II Challenge, October 2013.

[30] Craig Rasmussen, Matthew Sottile, Daniel Nagle, and Soren Rasmussen. Locally-oriented programming: A simple programming model for stencil-based computations on multi-level distributed memory architectures. In Proceedings of Euro-Par 2015 Parallel Processing, Lecture Notes in Computer Science. Springer International Publishing, 2015. Submitted, February 2015.

[31] Thomas Steel Henretty. Performance Optimization of Stencil Computations on Modern SIMD Architectures. PhD thesis, The Ohio State University, 2014.

[32] Justin Andrew Holewinski. Automatic Code Generation for Stencil Computations on GPU Architectures. PhD thesis, The Ohio State University, 2012.

[33] Mahesh Ravishankar. Automatic parallelization of loops with data dependent control flow and array access patterns. PhD thesis, The Ohio State University, 2014.

[34] Kevin Alan Stock. Vectorization and Register Reuse in High Performance Computing. PhD thesis, The Ohio State University, 2014.

[35] Tom Henretty, Richard Veras, Franz Franchetti, Louis-Noël Pouchet, J. Ramanujam, and P. Sadayappan. A stencil compiler for short-vector SIMD architectures. In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, ICS '13, pages 13–24, New York, NY, USA, 2013. ACM.

[36] Justin Holewinski, Louis-Noël Pouchet, and P. Sadayappan. High-performance code generation for stencil computations on GPU architectures. In Proceedings of the 26th ACM International Conference on Supercomputing, ICS '12, pages 311–320, New York, NY, USA, 2012. ACM.

[37] Louis-Noël Pouchet, Peng Zhang, P. Sadayappan, and Jason Cong. Polyhedral-based data reuse optimization for configurable computing. In Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA '13, pages 29–38, New York, NY, USA, 2013. ACM.

[38] S. Rajbhandari, A. Nikam, Pai-Wei Lai, K. Stock, S. Krishnamoorthy, and P. Sadayappan. CAST: Contraction algorithm for symmetric tensors. In Parallel Processing (ICPP), 2014 43rd International Conference on, pages 261–272, Sept 2014.

[39] Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, Sriram Krishnamoorthy, and P. Sadayappan. A communication-optimal framework for contracting distributed tensors. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '14, pages 375–386, Piscataway, NJ, USA, 2014. IEEE Press.

[40] Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. Code generation for parallel execution of a class of irregular loops on distributed memory systems. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pages 72:1–72:11, Los Alamitos, CA, USA, 2012. IEEE Computer Society Press.

[41] Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. Automatic parallelization of a class of irregular loops for distributed memory systems. ACM Transactions on Parallel Computing, 1(1):7:1–7:37, September 2014.

[42] Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J Ramanujam, Atanas Rountev, and P Sadayappan. Distributed memory code generation for mixed irregular/regular computations. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 65–75. ACM, 2015.

[43] Kevin Stock, Martin Kong, Tobias Grosser, Louis-Noël Pouchet, Fabrice Rastello, J. Ramanujam, and P. Sadayappan. A framework for enhancing data reuse via associative reordering. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, pages 65–76, New York, NY, USA, 2014. ACM.

[44] Venmugil Elango, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, and P. Sadayappan. On characterizing the data movement complexity of computational DAGs for parallel execution. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '14, pages 296–306, New York, NY, USA, 2014. ACM.

[45] Naznin Fauzia, Venmugil Elango, Mahesh Ravishankar, J. Ramanujam, Fabrice Rastello, Atanas Rountev, Louis-Noël Pouchet, and P. Sadayappan. Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential. ACM Trans. Archit. Code Optim., 10(4):53:1–53:29, Dec. 2013.

[46] Martin Kong, Richard Veras, Kevin Stock, Franz Franchetti, Louis-Noël Pouchet, and P. Sadayappan. When polyhedral transformations meet SIMD code generation. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '13, pages 127–138, New York, NY, USA, 2013. ACM.

[47] Lai Wei and John Mellor-Crummey. Autotuning tensor transposition. In Proceedings of the 19th International Workshop on High-level Parallel Programming Models and Supportive Environments, May 2014.

[48] Jun Shirako, Louis-Noël Pouchet, and Vivek Sarkar. Oil and water can mix: An integration of polyhedral and AST-based transformations. In IEEE Conference on High Performance Computing, Networking, Storage and Analysis (SC'14). IEEE, 2014.

[49] Prasanth Chatarasi, Jun Shirako, and Vivek Sarkar. Polyhedral transformations of explicitly parallel programs. In 5th International Workshop on Polyhedral Compilation Techniques (IMPACT 2015). IEEE, 2015.

[50] Kamal Sharma. Locality Transformations of Computation and Data for Portable Performance. PhD thesis, Rice University, August 2014.

[51] Jun Shirako and Vivek Sarkar. Oil and water can mix! Experiences with integrating polyhedral and AST-based Transformations. In 17th Workshop on Compilers for Parallel Programming, July 2013.

[52] Jisheng Zhao, Michael Burke, and Vivek Sarkar. Rice ROSE Compositional Analysis and Transformation Framework (R2CAT). Technical report, LLNL Technical Report 590233, October 2012.

[53] Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant Totla, Sarah Chasins, and Rastislav Bodík. Chlorophyll: synthesis-aided compiler for low-power spatial architectures. In O'Boyle and Pingali [58], page 42.

[54] Emina Torlak and Rastislav Bodík. A lightweight symbolic virtual machine for solver-aided host languages. In O'Boyle and Pingali [58], page 54.

[55] Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013, pages 1–8. IEEE, 2013.

[56] Emina Torlak and Rastislav Bodík. Growing solver-aided languages with Rosette. In Antony L. Hosking, Patrick Th. Eugster, and Robert Hirschfeld, editors, ACM Symposium on New Ideas in Programming and Reflections on Software, Onward! 2013, part of SPLASH '13, Indianapolis, IN, USA, October 26-31, 2013, pages 135–152. ACM, 2013.

[57] Leo A. Meyerovich, Matthew E. Torok, Eric Atkinson, and Rastislav Bodík. Parallel schedule synthesis for attribute grammars. In Alex Nicolau, Xiaowei Shen, Saman P. Amarasinghe, and Richard W. Vuduc, editors, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '13, Shenzhen, China, February 23-27, 2013, pages 187–196. ACM, 2013.

[58] Michael F. P. O'Boyle and Keshav Pingali, editors. ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, Edinburgh, United Kingdom - June 09 - 11, 2014. ACM, 2014.

[59] Tan Nguyen, Pietro Cicotti, Eric Bylaska, Dan Quinlan, and Scott B. Baden. Bamboo: translating MPI applications to a latency-tolerant, data-driven form. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pages 39:1–39:11, Los Alamitos, CA, USA, 2012. IEEE Computer Society Press.

[60] Tan Nguyen and Scott B. Baden. Bamboo-preliminary scaling results on multiple hybrid nodes of Knights Corner and Sandy Bridge processors. In Proc. WOLFHPC: Workshop on Domain-Specific Languages and High-Level Frameworks for HPC, SC13, The International Conference for High Performance Computing, Networking, Storage and Analysis, Denver CO, 2013.