Adaptive Parallelization and Optimization for the Jamaica Chip Multi-Processor Architecture

Jisheng Zhao


Chip Multi-Processor (CMP) systems are now commonplace. The trend toward multiple cores and multi-threading makes these systems increasingly difficult for compilers to target, and the lack of runtime information limits a static compiler's ability to make accurate performance predictions. How, then, can sequential applications benefit from the ubiquitous CMP? One promising answer is a dynamic execution environment that automatically parallelizes programs and adaptively optimizes the code at runtime. This work investigates how adaptive parallelization and optimization, directed by hardware feedback, improve the runtime performance of sequential applications. A Java Virtual Machine based, fully runtime parallelization and optimization system is built and evaluated on a particular CMP architecture, the JAMAICA CMP, which provides support for fine-grain parallelism. The runtime system performs loop-level parallelization both for the normal CMP system and for the CMP system with thread-level speculation (TLS) support. Adaptive optimizations are performed by an online tuning system that tunes parallelized loops in response to runtime feedback. On the normal CMP system, these optimizations concentrate on improving load balance and data locality; on the CMP with TLS support, they search for the best decomposition to reduce runtime overhead. The evaluation is based on a cycle-level simulation system that can easily be configured to model different hardware configurations. Experiments show that this purely runtime adaptive system can parallelize and tune standard benchmarks, achieving performance improvements of as much as 12.5% over the scheme used by static compilation with parallelization. Evaluation across various hardware configurations demonstrates good scalability: applications adapt well to different hardware configurations and achieve good performance.
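
The idea of tuning a parallelized loop from runtime feedback can be illustrated with a small Java sketch. This is not the thesis's implementation: the class name AdaptiveLoopSketch, the methods parallelFor and tuneChunkSize, and the candidate chunk sizes are all hypothetical, the loop is decomposed with standard Java parallel streams rather than JAMAICA's fine-grain threads, and wall-clock time stands in for the hardware feedback the abstract describes. It only shows the general pattern of trying several loop decompositions and keeping the one that the measured feedback reports as cheapest.

```java
import java.util.function.IntConsumer;
import java.util.function.IntToLongFunction;
import java.util.stream.IntStream;

public class AdaptiveLoopSketch {

    /** Runs body(i) for every i in [0, n), split into parallel chunks of the given size. */
    public static void parallelFor(int n, int chunk, IntConsumer body) {
        int chunks = (n + chunk - 1) / chunk; // ceiling division
        IntStream.range(0, chunks).parallel().forEach(c -> {
            int start = c * chunk;
            int end = Math.min(n, start + chunk);
            for (int i = start; i < end; i++) {
                body.accept(i);
            }
        });
    }

    /**
     * Tries each candidate chunk size once and returns the one with the lowest
     * reported cost. The cost function is the "feedback" source (an assumption:
     * here it is supplied by the caller, not read from hardware counters).
     */
    public static int tuneChunkSize(int[] candidates, IntToLongFunction cost) {
        int best = candidates[0];
        long bestCost = cost.applyAsLong(best);
        for (int k = 1; k < candidates.length; k++) {
            long c = cost.applyAsLong(candidates[k]);
            if (c < bestCost) {
                bestCost = c;
                best = candidates[k];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        double[] a = new double[n];
        // Feedback here is elapsed time for one execution of the loop.
        int best = tuneChunkSize(new int[] {256, 4096, 65536}, chunk -> {
            long t0 = System.nanoTime();
            parallelFor(n, chunk, i -> a[i] = Math.sqrt(i));
            return System.nanoTime() - t0;
        });
        System.out.println("selected chunk size: " + best);
    }
}
```

The selected chunk size depends on the machine and on timing noise, so no particular output is implied. A real online tuner would also repeat measurements and re-tune as conditions change, which this sketch omits.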

The thesis is available as PDF (2.2MB).