High Performance Optimizations in Runtime Speculative Parallelization for Multicore Architectures

Yiapanis, Paraskevas

Abstract

Thread-Level Speculation (TLS) overcomes limitations intrinsic with conservative compile-time auto-parallelizing tools by extracting parallel threads optimistically and only ensuring absence of data dependence violations at runtime. A significant barrier for adopting TLS (implemented in software) is the overheads associated with maintaining speculative state. Previous TLS limit studies observe that on future multi-core systems it is likely to have more cores idle than those which traditional TLS would be able to harness. This thesis describes a novel compact version management data structure optimized for space overhead when using a small number of TLS threads. Furthermore, two novel software runtime parallelization systems were developed that utilize this compact data structure. The first one, MiniTLS, is optimized for fast recovery in the case of misspeculations by parallelizing the recovery procedure. The second one, Lector, is optimized for performance by using lightweight helper threads, along with TLS threads, to establish whether speculation can be withdrawn avoiding that way any speculative overheads. Facilitated by the novel compact representation, MiniTLS reduces the space overhead over state-of-the-art software TLS systems between 96% on 2 threads and 40% on 32 threads. MiniTLS and Lector were applied to seven Java benchmarks performing on average 7x and 8.2x faster, respectively, against the sequential versions and on average 1.7x faster than the current state-of-the-art in software TLS for 32 threads.

The thesis is available as PDF (**MB).

| Skip to main content | Skip to navigation | Skip to Search |