Multicore Programming Models and their Compilation Challenges
Vivek Sarkar, Rice University
Abstract
The computer industry is at a major inflection point in its hardware roadmap due to the end of a decades-long trend of exponentially increasing clock frequencies. It is widely agreed that spatial parallelism in the form of multiple power-efficient cores must be exploited to compensate for this lack of frequency scaling. Unlike previous generations of hardware evolution, this shift towards homogeneous and heterogeneous manycore computing will have a profound impact on software. Two complementary compiler approaches to address this problem are 1) compilation and optimization of explicitly parallel programs, and 2) automatic extraction of parallelism from sequential programs. This course addresses the first approach, whereas the second approach is addressed in the course titled "Compilation for Multicore Processors" by Prof. Scott Mahlke.
In this course, we will start with a brief overview of modern programming models for multicore processors, including Cilk, CUDA, Java threads, and OpenMP 3.0. We will examine these programming models from the compiler's viewpoint and identify a common set of primitives that are suitable for use in parallel intermediate representations (PIRs) for multicore programs. These primitives (async, finish, isolated, phasers, places) are derived from the X10 language and are directly embodied in the pedagogical Habanero-Java (HJ) language developed at Rice University.
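For readers unfamiliar with these primitives, the following is a minimal sketch of the intent behind three of them (async, finish, isolated), approximated in standard Java with java.util.concurrent. It is not HJ or X10 syntax: the class name AsyncFinishSketch, the method parallelSum, and the mapping onto a thread pool, Future.get(), and a synchronized block are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: approximates async/finish/isolated with plain Java.
public class AsyncFinishSketch {

    // Sums data using numTasks child tasks (numTasks must be >= 1).
    static long parallelSum(int[] data, int numTasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numTasks);
        final long[] total = {0};
        final Object lock = new Object();
        List<Future<?>> spawned = new ArrayList<>();
        try {
            int chunk = (data.length + numTasks - 1) / numTasks;
            for (int t = 0; t < numTasks; t++) {
                final int lo = Math.min(data.length, t * chunk);
                final int hi = Math.min(data.length, lo + chunk);
                // Roughly "async": spawn a child task that may run in parallel
                // with its parent.
                spawned.add(pool.submit(() -> {
                    long local = 0;
                    for (int i = lo; i < hi; i++) {
                        local += data[i];
                    }
                    // Roughly "isolated": a mutually exclusive update of
                    // shared state.
                    synchronized (lock) {
                        total[0] += local;
                    }
                }));
            }
            // Roughly "finish": wait until every spawned task has terminated.
            for (Future<?> f : spawned) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return total[0];
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data, 4)); // prints 500500
    }
}
```

In this approximation the programmer manages the pool, the futures, and the lock by hand. Expressing the same structure with dedicated primitives makes the task creation, termination, and mutual exclusion points explicit, which is what a parallel intermediate representation needs to expose to the compiler.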
The remainder of the course focuses on compilation challenges for parallel programs at the PIR level. The historical foundations of code optimization, including intermediate representations, data flow analyses, and optimizing transformations, are all rooted in the von Neumann model of sequential computing and must be reworked for parallelism. We summarize the state of the art in analysis and optimization of parallel programs by covering the following topics:
- Intermediate Representations for Parallel Programs
- Data Flow Analysis Frameworks for Parallel Programs
- Memory Models and their impact on Code Optimization
- Privatization and Escape Analyses
- Optimization of Task Granularity and Synchronization