APT Advanced Processor Technologies Research Group

Transient fault tolerant QDI interconnects using redundant check code.

Guangda Zhang, Wei Song, Jim Garside, Javier Navaridas and Zhiying Wang.

Abstract

Asynchronous circuits are promising candidates for building the chip-level interconnect of multi-core systems. However, asynchronous circuits are vulnerable to faults. This paper presents a novel scheme to improve the robustness of asynchronous systems. Our first contribution is a fault-tolerant, delay-insensitive redundant check coding scheme named DIRC. Using DIRC in 4-phase 1-of-n quasi-delay-insensitive (QDI) interconnects, all 1-bit and some multi-bit transient faults can be tolerated. DIRC stages and basic 4-phase 1-of-n pipeline stages are interchangeable, so any basic stage can be replaced by a DIRC stage to strengthen the fault tolerance of long wires. Our second contribution, RPA, is a redundancy technique that protects the acknowledge wires from transient faults, an issue that has long been disregarded by the community. The DIRC pipelines (using DIRC plus RPA) are compared with the basic pipelines using the UMC 0.13 μm standard cell library. Detailed experimental results show that the 128-bit DIRC 1-of-4 pipeline is only 13% slower than the basic one. The fault-tolerance capability increases a hundredfold, even when multi-bit transient faults are considered.
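
As background for the 1-of-n encoding mentioned in the abstract, the following minimal Python sketch illustrates why an unprotected 1-of-4 group is sensitive to a single transient bit flip. It does not implement DIRC or RPA, whose construction is not described on this page. The function names (encode, classify, flip, single_flip_outcomes) are hypothetical and chosen only for illustration. The sketch assumes the usual 4-phase convention in which the all-zero word is the spacer and a word with exactly one wire high is valid data.

```python
N = 4  # wires in one 1-of-4 group; each group carries 2 bits


def encode(value):
    """Encode a value in 0..3 as a 1-of-4 codeword (tuple of wire levels)."""
    if not 0 <= value < N:
        raise ValueError("value must be in 0..3")
    return tuple(1 if i == value else 0 for i in range(N))


def classify(wires):
    """Classify a wire group as 'spacer', 'valid', or 'invalid'."""
    ones = sum(wires)
    if ones == 0:
        return "spacer"   # return-to-zero phase of the 4-phase handshake
    if ones == 1:
        return "valid"    # exactly one wire high: a legal data symbol
    return "invalid"      # two or more wires high: not a codeword


def flip(wires, index):
    """Model a transient fault that inverts the wire at position index."""
    return tuple(w ^ 1 if i == index else w for i, w in enumerate(wires))


def single_flip_outcomes(value):
    """Classify the result of flipping each wire of a valid codeword in turn."""
    word = encode(value)
    return [classify(flip(word, i)) for i in range(N)]


if __name__ == "__main__":
    # Flipping the high wire makes data look like a spacer;
    # flipping any other wire yields a two-hot, invalid word.
    print(single_flip_outcomes(2))
    # Flipping any wire of a spacer produces a spurious valid symbol.
    spacer = (0, 0, 0, 0)
    print([classify(flip(spacer, i)) for i in range(N)])
```

In this model, every single flip on a valid codeword either erases the symbol or corrupts it, and every single flip on a spacer fabricates a symbol. This is the kind of 1-bit transient fault that the abstract says DIRC is designed to tolerate.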