Advanced Backend Code Optimization
Publication Date: May 2014. Hardback, 384 pp.
A summary of more than a decade of research in the area of backend code optimization for high performance and embedded computing, this book contains the latest fundamental and technical research results in this field at an advanced level.
With chapters on phase ordering in optimizing compilation, register saturation in instruction-level parallelism, code size reduction for software pipelining, memory hierarchy effects in instruction-level parallelism, and rigorous statistical performance analysis, it covers material not previously addressed by books in the field. Other chapters provide the latest research results on well-known topics such as instruction scheduling and its relationship with machine scheduling theory, register need, software pipelining and periodic register allocation.
As such, Advanced Backend Code Optimization is particularly appropriate for researchers, professors and advanced Master’s students in computer science, as well as computer science engineers.
Part 1. Prolog: Optimizing Compilation
1. On the Decidability of Phase Ordering in Optimizing Compilation.
Part 2. Instruction Scheduling
2. Instruction Scheduling Problems and Overview.
3. Applications of Machine Scheduling to Instruction Scheduling.
4. Instruction Scheduling Before Register Allocation.
5. Instruction Scheduling After Register Allocation.
6. Dealing in Practice with Memory Hierarchy Effects and Instruction Level Parallelism.
Part 3. Register Optimization
7. The Register Need of a Fixed Instruction Schedule.
8. The Register Saturation.
9. Spill Code Reduction.
10. Exploiting the Register Access Delays Before Instruction Scheduling.
11. Loop Unrolling Degree Minimization for Periodic Register Allocation.
Part 4. Epilog: Performance, Open Problems
12. Statistical Performance Analysis: The Speedup-Test Protocol.
Appendix 1. Presentation of the Benchmarks Used in our Experiments
Appendix 2. Register Saturation Computation on Stand-Alone DDG
Appendix 3. Efficiency of SIRA on the Benchmarks
Appendix 4. Efficiency of Non-Positive Circuit Elimination in the SIRA Framework
Appendix 5. Loop Unroll Degree Minimization: Experimental Results
Appendix 6. Experimental Efficiency of Software Data Preloading and Prefetching for Embedded VLIW
Appendix 7. Appendix of the Speedup-Test Protocol
About the Authors
Sid Touati is currently a Professor at the University of Nice Sophia Antipolis in France. His research interests include code optimization and analysis for high performance and embedded processors, compilation and code generation, parallelism, statistics and performance optimization. His research activities are conducted at the Institut National de Recherche en Informatique et en Automatique (INRIA) as well as at the Centre National de la Recherche Scientifique (CNRS).
Benoit Dupont de Dinechin is currently Chief Technology Officer of Kalray in France. He was formerly a researcher and engineer at STMicroelectronics, working on backend code optimization in the advanced compilation team. He has a PhD in computer science, in the subject area of instruction scheduling for instruction-level parallelism, and a computer engineering diploma.