Better Asynchronous Branch Prediction

J. Garside

Suitable for: ACS, CS

Modern microprocessors rely heavily on pipelined design to increase their instruction throughput. This means that instructions are fetched from memory some time before they are executed, and there may be several of these prefetched instructions "on the way" at any time.

When the programme branches instructions prefetched beyond the branch must be discarded and the pipeline refilled from the new stream. This takes extra time and energy has been wasted speculating on instructions which were not wanted. This is a Bad Thing.

As instruction pipelines deepen and widen (for example in superscalar architectures where several instructions are fetched and executed simultaneously) the number of instructions which must be discarded increases. In 'typical' code a branch occurs - on average - about every six instructions, thus the penalty for fetching another (say) 6 speculative instructions is severe!

To alleviate this, branches may be predicted. Branch prediction may reduce or eliminate this overhead when it is successful, and modern branch prediction algorithms are becoming very effective.

However these have so far only been implemented in synchronous systems. The AMULET group is interested in asynchronous systems, where several of the implementation problems are different. Although the latest processor - AMULET2e - includes some branch prediction this is only about 70% successful.

The intent of this project is to investigate what is available in the synchronous world and adapt this for an asynchronous implementation. In case that is not enough, because the AMULET processors are low power consumption devices, the power budget for the resulting unit must also be minimised.