X-Stack Project Publications

D-TEC: DSL Technology for Exascale Computing

List

[1] Markus Schordan, Pei-Hung Lin, Dan Quinlan, and Louis-Noël Pouchet. Verification of polyhedral optimizations with constant loop bounds in finite state space computations. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications, volume 8803 of Lecture Notes in Computer Science, pages 493–508. Springer Berlin Heidelberg, 2014.

[2] Chunhua Liao, Daniel J. Quinlan, Thomas Panas, and Bronis R. de Supinski. A ROSE-based OpenMP 3.0 research compiler supporting multiple runtime libraries. In Mitsuhisa Sato, Toshihiro Hanawa, Matthias S. Müller, Barbara M. Chapman, and Bronis R. de Supinski, editors, IWOMP, volume 6132 of Lecture Notes in Computer Science, pages 15–28. Springer, 2010.

[3] Chunhua Liao, Yonghong Yan, Bronis R. de Supinski, Daniel J. Quinlan, and Barbara Chapman. Early experiences with the OpenMP accelerator model. In OpenMP in the Era of Low Power Devices and Accelerators, pages 84–98. Springer, 2013.

[4] Dan Quinlan and Chunhua Liao. The ROSE source-to-source compiler infrastructure. In Cetus Users and Compiler Infrastructure Workshop, Galveston Island, TX, USA, October 2011.

[5] Yonghong Yan, Pei-Hung Lin, Chunhua Liao, Bronis R. de Supinski, and Daniel J. Quinlan. Supporting multiple accelerators in high-level programming models. In Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM '15, pages 170–180, New York, NY, USA, 2015. ACM.

[6] Pei-Hung Lin, Chunhua Liao, Daniel J. Quinlan, and Stephen Guzik. Experiences of using the OpenMP accelerator model to port DOE stencil applications, 2014. Poster presented at the Workshop on Accelerator Programming Using Directives, Nov. 17, 2014, New Orleans, LA.

[7] Markus Schordan, Pei-Hung Lin, Dan Quinlan, and Louis-Noël Pouchet. Verification of parallel polyhedral transformations with arbitrary constant loop bounds, 2015. Under review at Euro-Par 2015.

[8] Jonathan Ragan-Kelley. Decoupling Algorithms from the Organization of Computation for High Performance Image Processing. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, June 2014.

[9] Jason Ansel. Autotuning Programs with Algorithmic Choice. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, February 2014.

[10] Jeffrey Bosboom. StreamJIT: A commensal compiler for high-performance stream programming. S.M. thesis, Massachusetts Institute of Technology, Cambridge, MA, June 2014.

[11] Eric Wong. Optimizations in stream programming for multimedia applications. M.Eng. thesis, Massachusetts Institute of Technology, Cambridge, MA, Aug 2012.

[12] Phumpong Watanaprakornkul. Distributed data as a choice in PetaBricks. M.Eng. thesis, Massachusetts Institute of Technology, Cambridge, MA, Jun 2012.

[13] Charith Mendis, Jeffrey Bosboom, Kevin Wu, Shoaib Kamil, Jonathan Ragan-Kelley, Sylvain Paris, Qin Zhao, and Saman Amarasinghe. Helium: Lifting high-performance stencil kernels from stripped x86 binaries to Halide DSL code. In ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015.

[14] Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. OpenTuner: An extensible framework for program autotuning. In International Conference on Parallel Architectures and Compilation Techniques, Edmonton, Canada, August 2014.

[15] Jeffrey Bosboom, Sumanaruban Rajadurai, Weng-Fai Wong, and Saman Amarasinghe. StreamJIT: A commensal compiler for high-performance stream programming. In ACM SIGPLAN Conference on Object-Oriented Programming Systems and Applications, Portland, OR, October 2014.

[16] Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Seattle, WA, June 2013.

[17] Phitchaya Mangpo Phothilimthana, Jason Ansel, Jonathan Ragan-Kelley, and Saman Amarasinghe. Portable performance on heterogeneous architectures. In The International Conference on Architectural Support for Programming Languages and Operating Systems, Houston, TX, March 2013.

[18] Maciej Pacula, Jason Ansel, Saman Amarasinghe, and Una-May O'Reilly. Hyperparameter tuning in bandit-based adaptive operator selection. In European Conference on the Applications of Evolutionary Computation, Malaga, Spain, Apr 2012.

[19] Jason Ansel, Maciej Pacula, Yee Lok Wong, Cy Chan, Marek Olszewski, Una-May O'Reilly, and Saman Amarasinghe. SiblingRivalry: Online autotuning through local competitions. In International Conference on Compilers Architecture and Synthesis for Embedded Systems, Tampere, Finland, Oct 2012.

[20] Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, and Frédo Durand. Decoupling algorithms from schedules for easy optimization of image processing pipelines. ACM Transactions on Graphics, 31(4), July 2012.

[21] Dan Alistarh, Patrick Eugster, Maurice Herlihy, Alexander Matveev, and Nir Shavit. StackTrack: An automated transactional approach to concurrent memory reclamation. In Proceedings of the Ninth European Conference on Computer Systems, EuroSys '14, pages 25:1–25:14, New York, NY, USA, 2014. ACM.

[22] Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Una-May O'Reilly, and Saman Amarasinghe. OpenTuner: An extensible framework for program autotuning. Technical Report MIT-CSAIL-TR-2013-026, Massachusetts Institute of Technology, Cambridge, MA, Nov 2013.

[23] Alexander Matveev and Nir Shavit. Reduced hardware NOrec: A safe and scalable hybrid transactional memory. In 20th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2015, Istanbul, Turkey, 2015. ACM.

[24] Sasa Misailovic, Michael Carbin, Sara Achour, Zichao Qi, and Martin C. Rinard. Chisel: Reliability- and accuracy-aware optimization of approximate computational kernels. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA '14, pages 309–328, New York, NY, USA, 2014. ACM.

[25] Zhilei Xu, Shoaib Kamil, and Armando Solar-Lezama. MSL: A synthesis enabled language for distributed implementations. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '14, pages 311–322, Piscataway, NJ, USA, 2014. IEEE Press.

[26] F. Augustin and Y. M. Marzouk. Uncertainty quantification in high performance computing (invited position paper). SIGPLAN Workshop on Probabilistic and Approximate Computing (APPROX), 2014.

[27] David Grove, Josh Milthorpe, and Olivier Tardieu. Supporting array programming in X10. In Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, ARRAY'14, pages 38:38–38:43, New York, NY, USA, 2014. ACM.

[28] Wei Zhang, Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, and Mikio Takeuchi. GLB: Lifeline-based global load balancing library in X10. In Proceedings of the First Workshop on Parallel Programming for Analytics Applications, PPAA '14, pages 31–40, New York, NY, USA, 2014. ACM.

[29] Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, Mikio Takeuchi, and Wei Zhang. X10 for Productivity and Performance at Scale: A Submission to the 2013 HPC Class II Challenge, October 2013.

[30] Craig Rasmussen, Matthew Sottile, Daniel Nagle, and Soren Rasmussen. Locally-oriented programming: A simple programming model for stencil-based computations on multi-level distributed memory architectures. In Proceedings of Euro-Par 2015 Parallel Processing, Lecture Notes in Computer Science. Springer International Publishing, 2015. Submitted, February 2015.

[31] Thomas Steel Henretty. Performance Optimization of Stencil Computations on Modern SIMD Architectures. PhD thesis, The Ohio State University, 2014.

[32] Justin Andrew Holewinski. Automatic Code Generation for Stencil Computations on GPU Architectures. PhD thesis, The Ohio State University, 2012.

[33] Mahesh Ravishankar. Automatic parallelization of loops with data dependent control flow and array access patterns. PhD thesis, The Ohio State University, 2014.

[34] Kevin Alan Stock. Vectorization and Register Reuse in High Performance Computing. PhD thesis, The Ohio State University, 2014.

[35] Tom Henretty, Richard Veras, Franz Franchetti, Louis-Noël Pouchet, J. Ramanujam, and P. Sadayappan. A stencil compiler for short-vector SIMD architectures. In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, ICS '13, pages 13–24, New York, NY, USA, 2013. ACM.

[36] Justin Holewinski, Louis-Noël Pouchet, and P. Sadayappan. High-performance code generation for stencil computations on GPU architectures. In Proceedings of the 26th ACM International Conference on Supercomputing, ICS '12, pages 311–320, New York, NY, USA, 2012. ACM.

[37] Louis-Noël Pouchet, Peng Zhang, P. Sadayappan, and Jason Cong. Polyhedral-based data reuse optimization for configurable computing. In Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA '13, pages 29–38, New York, NY, USA, 2013. ACM.

[38] S. Rajbhandari, A. Nikam, Pai-Wei Lai, K. Stock, S. Krishnamoorthy, and P. Sadayappan. CAST: Contraction algorithm for symmetric tensors. In Parallel Processing (ICPP), 2014 43rd International Conference on, pages 261–272, Sept 2014.

[39] Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, Sriram Krishnamoorthy, and P. Sadayappan. A communication-optimal framework for contracting distributed tensors. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '14, pages 375–386, Piscataway, NJ, USA, 2014. IEEE Press.

[40] Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. Code generation for parallel execution of a class of irregular loops on distributed memory systems. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pages 72:1–72:11, Los Alamitos, CA, USA, 2012. IEEE Computer Society Press.

[41] Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. Automatic parallelization of a class of irregular loops for distributed memory systems. ACM Transactions on Parallel Computing, 1(1):7:1–7:37, September 2014.

[42] Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. Distributed memory code generation for mixed irregular/regular computations. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 65–75. ACM, 2015.

[43] Kevin Stock, Martin Kong, Tobias Grosser, Louis-Noël Pouchet, Fabrice Rastello, J. Ramanujam, and P. Sadayappan. A framework for enhancing data reuse via associative reordering. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, pages 65–76, New York, NY, USA, 2014. ACM.

[44] Venmugil Elango, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, and P. Sadayappan. On characterizing the data movement complexity of computational DAGs for parallel execution. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '14, pages 296–306, New York, NY, USA, 2014. ACM.

[45] Naznin Fauzia, Venmugil Elango, Mahesh Ravishankar, J. Ramanujam, Fabrice Rastello, Atanas Rountev, Louis-Noël Pouchet, and P. Sadayappan. Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential. ACM Trans. Archit. Code Optim., 10(4):53:1–53:29, Dec. 2013.

[46] Martin Kong, Richard Veras, Kevin Stock, Franz Franchetti, Louis-Noël Pouchet, and P. Sadayappan. When polyhedral transformations meet SIMD code generation. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '13, pages 127–138, New York, NY, USA, 2013. ACM.

[47] Lai Wei and John Mellor-Crummey. Autotuning tensor transposition. In Proceedings of the 19th International Workshop on High-level Parallel Programming Models and Supportive Environments, May 2014.

[48] Jun Shirako, Louis-Noël Pouchet, and Vivek Sarkar. Oil and water can mix: An integration of polyhedral and AST-based transformations. In IEEE Conference on High Performance Computing, Networking, Storage and Analysis (SC'14). IEEE, 2014.

[49] Prasanth Chatarasi, Jun Shirako, and Vivek Sarkar. Polyhedral transformations of explicitly parallel programs. In 5th International Workshop on Polyhedral Compilation Techniques (IMPACT 2015). IEEE, 2015.

[50] Kamal Sharma. Locality Transformations of Computation and Data for Portable Performance. PhD thesis, Rice University, August 2014.

[51] Jun Shirako and Vivek Sarkar. Oil and water can mix! Experiences with integrating polyhedral and AST-based Transformations. In 17th Workshop on Compilers for Parallel Programming, July 2013.

[52] Jisheng Zhao, Michael Burke, and Vivek Sarkar. Rice ROSE Compositional Analysis and Transformation Framework (R2CAT). Technical report, LLNL Technical Report 590233, October 2012.

[53] Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant Totla, Sarah Chasins, and Rastislav Bodik. Chlorophyll: synthesis-aided compiler for low-power spatial architectures. In O'Boyle and Pingali [58], page 42.

[54] Emina Torlak and Rastislav Bodik. A lightweight symbolic virtual machine for solver-aided host languages. In O'Boyle and Pingali [58], page 54.

[55] Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013, pages 1–8. IEEE, 2013.

[56] Emina Torlak and Rastislav Bodik. Growing solver-aided languages with Rosette. In Antony L. Hosking, Patrick Th. Eugster, and Robert Hirschfeld, editors, ACM Symposium on New Ideas in Programming and Reflections on Software, Onward! 2013, part of SPLASH '13, Indianapolis, IN, USA, October 26-31, 2013, pages 135–152. ACM, 2013.

[57] Leo A. Meyerovich, Matthew E. Torok, Eric Atkinson, and Rastislav Bodik. Parallel schedule synthesis for attribute grammars. In Alex Nicolau, Xiaowei Shen, Saman P. Amarasinghe, and Richard W. Vuduc, editors, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '13, Shenzhen, China, February 23-27, 2013, pages 187–196. ACM, 2013.

[58] Michael F. P. O'Boyle and Keshav Pingali, editors. ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, Edinburgh, United Kingdom - June 09 - 11, 2014. ACM, 2014.

[59] Tan Nguyen, Pietro Cicotti, Eric Bylaska, Dan Quinlan, and Scott B. Baden. Bamboo: translating MPI applications to a latency-tolerant, data-driven form. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pages 39:1–39:11, Los Alamitos, CA, USA, 2012. IEEE Computer Society Press.

[60] Tan Nguyen and Scott B. Baden. Bamboo - preliminary scaling results on multiple hybrid nodes of Knights Corner and Sandy Bridge processors. In Proc. WOLFHPC: Workshop on Domain-Specific Languages and High-Level Frameworks for HPC, SC13, The International Conference for High Performance Computing, Networking, Storage and Analysis, Denver CO, 2013.

From Modelado Foundation

Revision as of 20:42, June 18, 2015 by imported>ChunhuaLiao (→D-TEC: DSL Technology for Exascale Computing)