Computational & Technology Resources
Computational Science, Engineering & Technology Series
PATTERNS FOR PARALLEL PROGRAMMING ON GPUS
Edited by: F. Magoulès
Program Sequentially, Carefully, and Benefit from Compiler Advances for Parallel Heterogeneous Computing
M. Amini¹, C. Ancourt², B. Creusillet³, F. Irigoin² and R. Keryell¹
¹SILKAN Inc., Los Altos CA, USA
M. Amini, C. Ancourt, B. Creusillet, F. Irigoin, R. Keryell, "Program Sequentially, Carefully, and Benefit from Compiler Advances for Parallel Heterogeneous Computing", in F. Magoulès, (Editor), "Patterns for Parallel Programming on GPUs", Saxe-Coburg Publications, Stirlingshire, UK, Chapter 6, pp 149-169, 2014. doi:10.4203/csets.34.6
Keywords: parallel programming, automatic parallelization, coding rules.
The current trend in microarchitecture is toward heterogeneity. This evolution is driven by the end of Moore's law and by the frequency wall that results from the power wall. Moreover, with the spread of smartphones, some constraints from the mobile world now drive the design of most new architectures. An immediate consequence is that an application has to run on a variety of targets.
Porting and maintaining multiple versions of a code base requires a range of different skills. The effort involved, together with the increased complexity of debugging and testing, is time-consuming and therefore expensive.
Some compiler-based solutions are emerging. They rely either on directives added to C, as in OpenHMPP or OpenACC, or on automatic tools such as PoCC, Pluto, PPCG, or PAR4ALL. However, compilers cannot efficiently retarget arbitrary programs written in a low-level language such as unconstrained C. Programmers should follow good practices when writing code so that compilers have more room to perform the transformations required for efficient execution on heterogeneous targets.
This chapter explores the impact of different patterns used by programmers, and defines a set of good practices allowing a compiler to generate efficient code.
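As a rough illustration of the directive-based approach mentioned above, the following C sketch annotates a simple sequential loop with an OpenACC directive. The function name saxpy, the use of restrict-qualified pointers, and the choice of data clauses are assumptions made for this example and are not taken from the chapter. The loop is written so that each iteration touches only its own array element, which is the kind of property a parallelizing compiler can typically exploit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: computes y[i] = a * x[i] + y[i] for 0 <= i < n.
 * The restrict qualifiers (C99) promise that x and y do not overlap,
 * which removes a possible aliasing dependence between iterations. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

    /* OpenACC directive: asks the compiler to offload the loop, copying x
     * to the accelerator and y in both directions. A compiler built without
     * OpenACC support ignores the pragma and runs the loop sequentially,
     * producing the same result. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The same source can be compiled with or without OpenACC enabled, so a single sequential code base can serve several targets; whether the offloaded version is actually faster depends on the compiler and the hardware.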