4th HiPEAC Industrial Workshop on Compilers and Architectures at Robinson College, Cambridge
November 26, 2007
Organized by ARM Ltd. in Cambridge, UK
Call for Papers
Find the call for papers here.
Program and Presentations
Topics
The main focus of this workshop is advanced embedded computer architecture and compiler technology. The topics of interest for this workshop include, but are not limited to:
- Modern embedded architectures
- High-performance low-power architectures
- Ultra Low Power Circuit and Microarchitecture Design Techniques
- Reliability and Fault Tolerance
- Symmetric/Asymmetric Multicore, multithreading, superscalar, and VLIW architectures
- Reconfigurable and soft-core computing
- Compilers and programming tools for modern embedded systems
- Dynamic translation and optimization
- Parallel programming and concurrency support for Multicore/multithreaded systems
- Performance tools for embedded systems
- Non-traditional embedded computing systems topics
Workshop Program
| 08:30 | Arrival + Registration |
| 09:00 – 09:10 | Opening |
| SESSION 1: Binary, Compiler and Memory Optimization for Embedded Systems | |
| 09:10 – 09:35 | Benoît Dupont de Dinechin, STMicroelectronics |
| 09:35 – 10:00 | Dominique Chanet, Jonas Maebe and Koen De Bosschere, Ghent University |
| 10:00 – 10:25 | Peter Marwedel, Heiko Falk, Sascha Plazar, Robert Pyka and Lars Wehmeyer, University of Dortmund and Informatik Centrum Dortmund (ICD) |
| 10:25 – 10:50 | Benedict R. Gaster, Clearspeed Tech. |
| 10:50 – 11:20 | COFFEE BREAK |
| SESSION 2: Language and Tool Support for Multicore Architectures | |
| 11:20 – 11:45 | Philippe Bonnot, Sami Yehia, Arnaud Grasset, Eric Lenormand and Gilbert Edelin, Thales Research and Technology |
| 11:45 – 12:10 | Marina Biberstein, Moon S. Chang, Bilha Mendelson, Uzi Shvadron and Javier Turek, IBM Haifa |
| 12:10 – 12:35 | Alastair Donaldson, Colin Riley, Anton Lokhmotov and Andrew Cook, Codeplay Software and University of Cambridge |
| 12:35 – 13:50 | LUNCH + POSTER SESSION |
| 13:50 – 14:35 | Keynote Speech: Krisztian Flautner, Director of Research, ARM |
| SESSION 3: Dependable Computing | |
| 14:35 – 15:00 | Ricardo Fernandez-Pascual, Jose M. Garcia, Manuel E. Acacio and Jose Duato, University of Murcia and University of Valencia |
| 15:00 – 15:25 | Veerle Desmet, Yiannakis Sazeides and Costas Vrioni, Ghent University and University of Cyprus |
| 15:25 – 15:50 | COFFEE BREAK |
| SESSION 4: Modeling and Simulation | |
| 15:50 – 16:15 | Sanjay Jinturkar, Vitaly Kalashnikov, Mayan Moudgill, Gary Nacer and John Glossner, Sandbridge Technologies |
| 16:15 – 16:40 | Stefan Kraemer, Lei Gao, Rainer Leupers, Gerd Ascheid and Heinrich Meyr, RWTH-Aachen University |
| 16:40 – 17:05 | Veerle Desmet, Grigori Fursin, Sylvain Girbal and Olivier Temam, Ghent University and INRIA |
| 17:05 – 17:25 | COFFEE BREAK |
| SESSION 5: Relevant EU Projects | |
| 17:25 – 17:40 | Mike O’Boyle, University of Edinburgh |
| 17:40 – 17:55 | Georgi N. Gaydadjiev, Delft University of Technology |
| 17:55 – 18:00 | Closing |
POSTER SESSION details
Towards an Energy Efficient Branch Prediction Scheme Using Profiling
Michael Hicks, Colin Egan, Bruce Christianson and Patrick Quick, University of Hertfordshire, UK
Abstract: Dynamic branch predictors account for between 10% and 40% of a processor’s dynamic power consumption. This power cost is proportional to the number of accesses made to that dynamic predictor during a program’s execution. In this paper we propose the combined use of local delay region scheduling and profiling with an original adaptive branch bias measurement. The adaptive branch bias measurement takes note of the dynamic predictor’s accuracy for a given branch and decides whether or not to assign a static prediction for that branch. The static prediction and local delay region scheduling information is represented as two hint bits in branch instructions. We show that, with the combined use of these two methods, the number of dynamic branch predictor accesses/updates can be reduced by up to 62%. The associated average power saving is very encouraging; for the example high-performance embedded architecture an average global processor power saving of 6.22% is achieved.
The ARISE Framework: Extending Processors with Arbitrary Hardware Accelerators
Nikolaos Vassiliadis, George Theodoridis, and Spiridon Nikolaidis, Aristotle University of Thessaloniki, Greece
Abstract: ARISE introduces a systematic approach for extending a processor once so that it can thereafter be coupled with an arbitrary number of Custom Computing Units (CCUs). A CCU can be a hardwired or reconfigurable unit, and it can be used under a hybrid (tight and/or loose) model of computation. By selecting the appropriate model of computation for each part of the application, the complete application space can be considered for acceleration, resulting in significant performance improvements. To support these features we introduce a machine organization that allows the co-operation of a processor and a set of CCUs. To control the CCUs, the instruction set of the processor is extended with eight instructions. To incorporate these features efficiently into an embedded processor, a micro-architecture implementation that minimizes the control and communication overhead between the processor and the CCUs is introduced. To evaluate our proposal we extended a MIPS processor with the ARISE infrastructure, implemented it on a Xilinx FPGA, and showed that the timing model of the processor is not affected. A set of benchmarks was implemented on the ARISE evaluation machine. Performance results show that, by exploiting the hybrid model of computation, the ARISE machine achieves performance improvements of up to 68% compared to a typical approach.
Light-Weight SIMD Extension for Embedded Processors
Magnus Sjalander and Per Larsson-Edefors, Chalmers University of Technology, Sweden
Abstract: We present a light-weight SIMD extension for embedded general-purpose processors with negligible impact on delay and power dissipation. This is achieved by modifying existing functional units so that they support multiple-precision operations, and by limiting the number of added SIMD instructions. In particular, a twin-precision multiplier is used to support low-overhead SIMD multiplications. A MIPS-R2000-like processor is extended with the proposed light-weight SIMD support, and the performance estimates from placed-and-routed layouts in a 0.13-μm technology are subsequently analyzed. A SIMD-enabled version of the EEMBC’s FFT benchmark shows that, on top of a dramatically reduced memory access activity, the total execution time and total energy are reduced by 15% and 14%, respectively.
Filtering drowsy caches to improve their performance
Paolo Bennati and Roberto Giorgi, University of Siena, Italy
Abstract: Leakage power in data cache memories represents a sizable fraction of total power consumption, and many techniques have been proposed to reduce it. As a matter of fact, during a fixed period of time, only a small subset of cache lines is used. The drowsy technique, for instance, puts unused lines into a drowsy state in order to save power. Our idea is to adaptively select the most-used cache lines in order to keep the most-used data always available. We found that this can be achieved automatically by using a tiny cache acting as a filter “L0” cache. Our main contributions are: i) evaluation of a filter cache to reduce leakage; ii) improvement of an existing power-saving technique. Our experiments, with the complete MiBench suite for an ARM-based processor, show (on average) a 10% improvement in leakage saving and 17% in leakage energy-delay versus a drowsy cache.
Automatic Parallelization in GCC
Razya Ladelsky, IBM Haifa, Israel
Abstract: With the emergence of multicore architectures there is a growing need for automatic parallelization, which distributes sequential code into multi-threaded code. OpenMP defines language extensions to C, C++, and Fortran for implementing multi-threaded shared memory applications. Generation of such extensions by the compiler relieves programmers from the manual parallelization process. The OpenMP specification has been implemented in GCC and integrated into version 4.2. The OpenMP infrastructure, together with existing data dependence analyses, served as the basic infrastructure for an automatic parallelization optimization recently implemented in GCC. The initial automatic parallelization work was contributed by Sebastian Pop and Zdenek Dvorak, and supports loops whose iterations are independent of each other. We later enhanced these capabilities to support loops with reduction dependence among the iterations, thereby parallelizing additional loops. These auto-parallelization contributions are being incorporated into the upcoming version 4.3. In this talk we summarize the existing OpenMP and data dependence infrastructures in GCC, then describe the current state of automatic parallelization in GCC, demonstrated by some examples. Finally, we discuss future directions of work that may further extend the optimization's applicability.
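As a rough illustration of the two loop shapes mentioned in the abstract above (this sketch is not taken from the talk), the C code below shows one loop whose iterations are independent of each other and one loop that carries a reduction. The function names `scale_array` and `sum_array` are hypothetical and chosen only for this example. GCC documents a `-ftree-parallelize-loops=N` option for requesting automatic loop parallelization; whether a particular loop is actually parallelized depends on the compiler's own analysis and version.

```c
#include <stddef.h>

/* Independent iterations: out[i] depends only on in[i], so no
 * iteration reads a value written by another iteration.
 * `restrict` tells the compiler the two arrays do not overlap. */
void scale_array(double *restrict out, const double *restrict in,
                 size_t n, double k)
{
    for (size_t i = 0; i < n; i++)
        out[i] = k * in[i];
}

/* Reduction: every iteration updates the same accumulator `s`.
 * The iterations are not independent, but the only dependence is
 * the running sum, which can be split into per-thread partial sums
 * that are combined at the end. */
long sum_array(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

Assuming the code is saved as `loops.c` (a made-up file name), one way to try it is to compile with something like `gcc -O2 -ftree-parallelize-loops=4 -c loops.c` and compare against a build without the option.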
Workshop Registration
The registration website is open for HiPEAC members at Online Registration.
If you are not a HiPEAC member, please click here to register.
Practical Information
- Venue location: Robinson College
- The workshop takes place in the main auditorium: Auditorium Theatre
- Map of Cambridge and Robinson College (Larger version)
- Visa Information
- Travel info
How to get to Cambridge by air:
- London Stansted Airport: a direct train connects Stansted Airport (SSD) to Cambridge (CBG) every hour; use National Rail Timetables and Journey Planner
- London Heathrow Airport: trains run from Heathrow Terminals (find information about the terminals here) via London to Cambridge (CBG); use National Rail Timetables and Journey Planner
- London Gatwick Airport: trains run from Gatwick Airport (GTW) via London to Cambridge (CBG); use National Rail Timetables and Journey Planner
- London Luton Airport: trains run from Luton Airport Parkway (LTN) via London to Cambridge (CBG); use National Rail Timetables and Journey Planner
- Social Program
- Local Accommodations
Please email Emre Ozer for an invitation letter for visa applications.
