APT Advanced Processor Technologies Research Group

An Asynchronous SDM Network-on-Chip Tolerating Permanent Faults

G Zhang, W Song, J Garside, J Navaridas and Z Wang

Abstract

Asynchronous circuits have been used to implement Networks-on-Chip (NoCs), producing asynchronous NoCs whose links are usually implemented as quasi-delay-insensitive (QDI) pipelines to tolerate delay variations. As circuits age, permanent faults may occur on links at runtime, causing both data errors and network deadlock. This paper presents an asynchronous Spatial Division Multiplexing (SDM) NoC which tolerates permanent faults on the QDI links. Using a time-out mechanism, a general fault-detection technique can locate the permanent fault in the deadlocked NoC. To recover the network, a Drain&Release technique releases fault-free network resources on the deadlocked path. The SDM NoC physically divides every link and buffer into multiple independent virtual circuits. By configuring the switch allocator, the faulty virtual circuit is blocked so that it will not be allocated to any packet. Subsequent traffic requesting the same link goes through the other fault-free virtual circuits, and network function is recovered. For intermittent faults, a previously blocked virtual circuit can be re-enabled once the fault disappears. Experimental results show that the asynchronous SDM NoC can detect and recover from permanent faults with reasonable overhead.
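
The recovery idea described in the abstract, blocking a faulty virtual circuit in the switch allocator so that later packets use the remaining circuits, can be illustrated with a small software model. This is only a behavioural sketch and not the paper's hardware design: the class `SdmLink`, its methods, the lowest-index-first allocation policy, and the choice of four circuits per link are all assumptions made for illustration, and time-out detection is reduced to an explicit call to `block`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SdmLink:
    """Toy model (hypothetical) of one SDM link split into virtual circuits."""
    num_circuits: int
    blocked: set = field(default_factory=set)   # circuits marked faulty
    busy: dict = field(default_factory=dict)    # circuit index -> packet id

    def allocate(self, packet_id: str) -> Optional[int]:
        """Grant the lowest-numbered circuit that is neither blocked nor busy."""
        for vc in range(self.num_circuits):
            if vc not in self.blocked and vc not in self.busy:
                self.busy[vc] = packet_id
                return vc
        return None  # no fault-free circuit is free; the packet must wait

    def release(self, vc: int) -> None:
        """Free a circuit once its packet has left."""
        self.busy.pop(vc, None)

    def block(self, vc: int) -> None:
        """Mask a circuit reported as faulty and drop whatever held it."""
        self.busy.pop(vc, None)
        self.blocked.add(vc)

    def resume(self, vc: int) -> None:
        """Unmask a circuit, e.g. after an intermittent fault has cleared."""
        self.blocked.discard(vc)


link = SdmLink(num_circuits=4)   # four circuits per link is an assumption
first = link.allocate("p0")      # takes circuit 0
link.block(first)                # stand-in for a time-out flagging circuit 0
second = link.allocate("p1")     # skips the blocked circuit
link.resume(first)               # the fault is treated as intermittent
third = link.allocate("p2")      # circuit 0 is usable again
print(first, second, third)      # prints: 0 1 0
```

In this model a blocked circuit simply never appears as a candidate during allocation, which mirrors the abstract's statement that the faulty virtual circuit "will not be allocated to any packet" while the link stays usable through its other circuits.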

IEEE Copyright