# An Asynchronous Ternary Logic Signaling System

Tomaz Felicijan and Steve B. Furber, Senior Member, IEEE

Abstract—This paper presents a new approach to an on-chip asynchronous transmission system suitable for next generation asynchronous on-chip networks. It implements multivalued logic to reduce the number of wires and a low-voltage swing for lower dynamic power dissipation. Furthermore, the transmission system described here enjoys fully static design and has zero static power consumption. Two versions of the transmitter circuit and the receiver are described. The proposed signaling scheme is compared to a classical dual-rail signaling system with regard to speed, power consumption, and reliability. The simulation results show that the asynchronous ternary logic signaling (ATLS) system delivers over 70% higher bandwidth per wire and consumes over 50% less power than the dual-rail signaling system on 10-mm-long on-chip interconnection.

*Index Terms*—Communication system signaling, digital CMOS, low-power design, low voltage, multivalued logic.

#### I. INTRODUCTION

THE REDUCTION of dynamic power dissipation in VLSI applications is a major challenge for today's engineers. In modern VLSI systems, a large proportion of power is consumed by interconnect [1]. One way to reduce the power consumption related to the transmission system is to reduce the voltage swing.

Asynchronous circuits generally consist of many small state machines that communicate with each other through handshaking protocols. Although self-timed circuits have several advantages over clocked ones, one major drawback, especially for delay-insensitive circuits, is an increase in circuit size. Large numbers of communication wires make routing nontrivial on-chip communication networks a very demanding and time-consuming task. This problem grows in importance as the integration level increases. One solution is the use of multivalued logic.

The research presented in this paper aims to attack both problems. We have developed an *asynchronous ternary logic signaling* (ATLS) system, which utilizes a reduced voltage swing for lower dynamic power consumption and multivalued logic for reducing the number of wires. We compare our system to a classical dual-rail signaling scheme with regard to delay, power consumption and reliability.

#### A. Dual-Rail Signaling System

Fig. 1 shows the classical dual-rail signaling system that implements a four-phase handshaking protocol [2]. It uses two data wires per bit of information, one wire for signaling logic 1, the other wire for signaling logic 0. The request signal in any handshake cycle can be either of those two wires.

[Figure: a sender and a receiver connected by a dual-rail data channel {d.t, d.f} and an Ack wire. Codewords (d.t, d.f): Empty ("E") = 0 0, Valid "0" = 0 1, Valid "1" = 1 0; the remaining combination is not used. The data channel alternates between Empty and Valid codewords.]

Fig. 1. Dual-rail four-phase protocol.

Manuscript received October 7, 2002; revised January 15, 2003. This work was supported by EPSRC under Grant GR/R47363/01.

The authors are with the Department of Computer Science, University of Manchester, M13 9PL Manchester, U.K. (e-mail: felicijt@cs.man.ac.uk).

Digital Object Identifier 10.1109/TVLSI.2003.819571

At the start of the handshake cycle, the sender issues a valid codeword by setting one of the two data wires to logic 1. The receiver absorbs the codeword and sets the acknowledge signal high. The sender responds to this acknowledgement by resetting the data wire to logic 0. When the receiver detects the empty codeword it returns the acknowledge signal back to logic 0 as shown in Fig. 1. At this point the sender can initiate a new communication cycle.
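The handshake sequence described above can be sketched in a few lines of Python. This is only an illustrative behavioral model: the function names `four_phase_cycle` and `decode` are hypothetical, and the codeword assignment follows the encoding listed with Fig. 1.

```python
# Dual-rail codewords as (d.t, d.f) pairs, following Fig. 1.
EMPTY = (0, 0)
VALID_0 = (0, 1)
VALID_1 = (1, 0)


def four_phase_cycle(bit):
    """Return the (d.t, d.f, ack) wire states of one four-phase cycle."""
    codeword = VALID_1 if bit else VALID_0
    return [
        (*codeword, 0),  # 1. sender issues a valid codeword
        (*codeword, 1),  # 2. receiver absorbs it and raises acknowledge
        (*EMPTY, 1),     # 3. sender returns the data wires to empty
        (*EMPTY, 0),     # 4. receiver lowers acknowledge; channel is idle
    ]


def decode(d_t, d_f):
    """Map a dual-rail codeword to 1, 0, or None for the empty codeword."""
    if d_t and d_f:
        raise ValueError("both wires high is not a valid codeword")
    if d_t:
        return 1
    if d_f:
        return 0
    return None


for state in four_phase_cycle(1):
    print(state)
```

Each data wire makes one full-swing rising and one full-swing falling transition per transmitted symbol, which is the activity that the power estimates later in the paper are based on.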

While fairly simple, dual-rail circuits have one major drawback: the large number of wires. To transmit $n$ data bits in parallel, $2n + 1$ wires have to be routed. One way to mitigate this drawback is to implement a more efficient delay-insensitive encoding scheme such as an *N*-of-*M* code. While this could deliver higher bandwidth per wire, it would considerably increase the complexity of the completion detection circuitry and consequently reduce bandwidth.

#### B. Ternary Logic

Ternary logic has been a subject of research for many years, but a real-life VLSI application implementing an additional logic level is yet to be designed. Although many ternary logic models exist in the literature, they all suffer from drawbacks. Either they involve high power consumption [3], depend on customized technological processes [4], or implement multithreshold devices [5]. Many of the proposed ternary logic circuits use dynamic logic and consume static power [6], [7].

The research presented in this paper focuses solely on implementing ternary logic in the transmission system; it does not include the design of ternary logic gates. Our circuits are based on static CMOS design and have zero static power consumption.

#### II. ATLS SYSTEM

The main idea of the ATLS system is to enable the delay-insensitive transmission of one bit of information over a single wire (plus an acknowledge wire). Fig. 2 shows the principle of the system. When the communication channel is in the idle state, the voltage level on the wire is held at $V_{dd}/2$. To transmit a symbol we have to pull the voltage level to one of the rails ($V_{dd}$ for logic 1 or $V_{ss}$ for logic 0). If the communication protocol uses four-phase (return to zero) handshaking, the voltage level on the wire is always switching with a reduced swing of $V_{dd}/2$.

Fig. 2. Principle of the ATLS system.

Fig. 3. ATLS system transmitter.

If the half-swing interconnect lines are high-capacitance, high-activity lines, then the power saving can be significant. For example, the power dissipation to drive the line with a full swing each cycle is given by

$$P_{\rm dyn} = C \cdot V_{dd}^2 \cdot f \tag{1}$$

where C is the load capacitance and f is the frequency of switching. This is actually the power consumed by a dual-rail signaling system transmitting one bit of information (ignoring the acknowledge signal). Note that power is consumed only on one wire, since only one wire is active during one transmission cycle. If the voltage swing is reduced to  $V_{dd}/2$ , as with the ATLS system, then the power dissipation equals

$$P_{\rm dyn} = C \cdot (V_{dd}/2)^2 \cdot f. \tag{2}$$

Thus, ignoring the power dissipation of the transmitter and the receiver, the potential power saving of the ATLS system over the dual-rail signaling system is 75% and, since the ATLS system transmits one bit of information on a single wire, it potentially has 100% higher bandwidth per wire (ignoring the delay of the receiver). Note that this is true only when the switching frequency of both systems is the same and the acknowledge signal is ignored.
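The 75% figure follows directly from (1) and (2). The short Python sketch below evaluates both expressions; the capacitance and frequency values are arbitrary placeholders chosen for illustration (they cancel in the ratio), and `dynamic_power` is a hypothetical helper name.

```python
def dynamic_power(c_load, v_swing, freq):
    """Dynamic power C * V^2 * f for a line switched with swing v_swing."""
    return c_load * v_swing ** 2 * freq


VDD = 3.3        # supply voltage in volts
C_WIRE = 2e-12   # placeholder load capacitance in farads (assumption)
FREQ = 100e6     # placeholder switching frequency in hertz (assumption)

p_dual_rail = dynamic_power(C_WIRE, VDD, FREQ)   # equation (1)
p_atls = dynamic_power(C_WIRE, VDD / 2, FREQ)    # equation (2)

saving = 1 - p_atls / p_dual_rail
print(f"potential power saving: {saving:.0%}")   # prints 75%
```

Because halving the swing quarters the $V^2$ term, the result is independent of the particular $C$ and $f$ chosen.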

#### A. ATLS System Transmitter

We propose two variants of the ATLS system transmitter. The first is a simple driver with an additional transistor (M3) for switching the output voltage to the middle rail  $(V_{dd}/2)$ , as shown in Fig. 3. The input of the driver is fed with dual-rail signals and we assume that a  $V_{dd}/2$  supply voltage is provided.

When switching from $V_{ss}$ to $V_{dd}/2$, transistor M3 has a full drive voltage applied at the gate so it can operate at full speed, while when switching from $V_{dd}$ to $V_{dd}/2$ only half the drive voltage is driving the transistor. Thus, to ensure reasonably fast transitions from $V_{dd}$ to $V_{dd}/2$, transistor M3 has to be relatively large. We suggest that transistor M3 should be the same size as the pMOS transistor M1.

Fig. 4. Output waveforms of basic (upper graph) and enhanced (lower graph) ATLS system.

The upper graph in Fig. 4 shows the output waveforms of the ATLS system. Waveforms WDI and WDO present the voltage swing at the output of the transmitter and at the input of the receiver respectively. It is clear that the falling edge of the high-swing transition (from  $V_{dd}$  to  $V_{dd}/2$ ) is the slowest transition in the system. This slows down the propagation of the empty codeword following the transmission of a logic 1 symbol. Note that waveforms INHC and INLC correspond to the inputs and waveforms OUTH and OUTL to the outputs of the system.

Furthermore, the transmitter circuit exhibits shoot-through currents. When InL rises, M2 and M3 will fight until the NOR gate switches and turns off transistor M3. This behavior introduces some additional power dissipation which depends upon the speed of the NOR gate and the sizes of transistors M2 and M3.
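At the behavioral level, the basic transmitter simply maps the dual-rail inputs onto one of three wire voltages. The following Python sketch captures only that steady-state mapping; it does not model transition speed or the shoot-through current discussed above, and the function name `atls_transmit` is hypothetical.

```python
def atls_transmit(in_h, in_l, vdd=3.3, vss=0.0):
    """Steady-state output voltage of the basic ATLS transmitter.

    in_h, in_l: dual-rail inputs InH and InL (0 or 1).
    """
    if in_h and in_l:
        raise ValueError("InH and InL must not be high at the same time")
    if in_h:
        return vdd              # M1 pulls the wire to Vdd (logic 1)
    if in_l:
        return vss              # M2 pulls the wire to Vss (logic 0)
    return (vdd + vss) / 2      # NOR gate turns on M3: idle at Vdd/2


for inputs in [(0, 0), (1, 0), (0, 1)]:
    print(inputs, atls_transmit(*inputs))
```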

#### B. ATLS System Receiver

The receiver consists of two level shifters: one that converts low half-swing transitions (from $V_{ss}$ to $V_{dd}/2$ and back) to full-swing transitions, and a second which converts high half-swing transitions (from $V_{dd}/2$ to $V_{dd}$ and back) to full-swing transitions. Fig. 5 shows the receiver circuit. The input is driven with ternary logic signals and the circuit produces full-swing dual-rail signals at the outputs. Note that both inverters are powered with a half $V_{dd}$ supply but with different ground references.

When the input voltage is at  $V_{dd}/2$ , transistors M4 and M5 are on, although driven only with half of the supply voltage, while transistors M3 and M6 are completely off. This pulls OUTL to  $V_{ss}$  and node B to  $V_{dd}$ . The pMOS cross-coupled pair (M1 and M2) pulls node A to  $V_{dd}$  to establish a stable state without dissipating static power, while the nMOS cross-coupled pair (M7 and M8) pulls node OUTH to  $V_{ss}$ . Thus, when the input is in the idle state, the receiver generates logic 0 at both outputs without consuming static power.





If the input swings to $V_{ss}$, transistor M4 turns off while M3 turns on. If M3 is large enough to pull the voltage at node A below the threshold value of transistor M2, the transistor turns on. Therefore, the voltage at the output node OUTL rises and transistor M1 turns off. A similar sequence of events occurs when the input swings back to $V_{dd}/2$. Now M4 turns on and M3 turns off; again, M4 has to be large enough to pull the voltage of the output node below the threshold of transistor M1. M1 pulls up the voltage at node A and turns off transistor M2. Note that high half-swing transitions do not have any influence on this part of the receiver circuit. When the input swings to $V_{dd}$, transistor M3 is still off, driven by the inverter, and transistor M4 is now fully on, but output node OUTL stays unchanged.

The lower part of the receiver (Fig. 5) follows exactly the same behavior. The difference is that here we have an nMOS cross-coupled pair with a pMOS pull-up network. Transistors M5 and M6 have to be large enough to push the voltage of nodes B and OUTH above the threshold level of transistors M7 and M8, respectively. Because an nMOS cross-coupled load is used the pMOS pull-up transistors have to be considerably larger. Thus, this part of the receiver takes more time to resolve the input transitions and consumes more dynamic power.
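Functionally, the receiver converts a ternary wire voltage back into a dual-rail pair (OUTH, OUTL). The Python sketch below is a purely behavioral model of that conversion. The decision thresholds at $V_{dd}/4$ and $3V_{dd}/4$ are an assumption made for illustration (the midpoints of the two half-swing ranges); the real switching points depend on transistor sizing and are not given here. The name `atls_receive` is hypothetical.

```python
def atls_receive(v_in, vdd=3.3):
    """Return (OUTH, OUTL) for a ternary input voltage v_in.

    Assumed thresholds: midpoints of the low and high half-swing ranges.
    """
    out_l = 1 if v_in < vdd / 4 else 0       # low level shifter: Vss means logic 0
    out_h = 1 if v_in > 3 * vdd / 4 else 0   # high level shifter: Vdd means logic 1
    return out_h, out_l


print(atls_receive(1.65))  # idle level: both outputs low
print(atls_receive(3.3))   # logic 1: OUTH high
print(atls_receive(0.0))   # logic 0: OUTL high
```

The two comparisons are independent, mirroring the observation above that high half-swing transitions do not influence the low level shifter and vice versa.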

#### C. Enhanced ATLS System Transmitter

To improve the speed of the transition from  $V_{dd}$  to the middle-rail voltage we propose an enhanced ATLS (EATLS) system transmitter (Fig. 6). This version uses the additional *N*-channel transistor (M4) to pull the output voltage to  $V_{dd}/2$ . This transistor is driven with a full drive voltage and has a full  $V_{dd}$  voltage difference across source and drain. Thus, it is capable of inducing a higher electrical current into the wire, speeding up the transition. To turn off the transistor half way to the opposite supply rail, a simple inverter is used as a comparator (I3, transistors M5 and M6).



Fig. 6. Enhanced ATLS system transmitter.

When the transmitter is in the idle state (inputs InH and InL are low) transistors M1, M2, and M4 are off and M3 is on, driving the output to the  $V_{dd}/2$  supply. Node B is at the high voltage level and the pull-down network of inverter I3 is disabled because transistor M7 is off. There is no static power dissipation despite inverter I3 being driven with  $V_{dd}/2$ . Transistor M8 is off and M5, although half on, pulls node B high.

When input InH goes high, M3 switches off and M1 pulls the output voltage to  $V_{dd}$ . Furthermore, transistor M8 turns on and pulls node B low. This enables the pull-down network of inverter I3, since M7 turns on through feedback inverter I2. Note that at this point transistor M4 remains off since input InH prevents NOR gate NOR2 from switching its output high. Inverter I3 is now driven with  $V_{dd}$  and, therefore, does not fight transistor M8 pulling node B low.

After input InH switches back to logic 0, transistor M1 turns off and NOR gate NOR2 fires, turning transistor M4 on. M4 is now pulling the output voltage toward $V_{ss}$ at full speed. When the output voltage crosses the threshold level of inverter I3, I3 switches, pulling node B to $V_{dd}$. This turns off transistor M4 and disables the pull-down network of inverter I3. However, because the transistor cannot turn off instantly, the output voltage overshoots the $V_{dd}/2$ level by a certain amount. Fortunately, this is highly desirable when driving long on-chip wires because it increases the speed of transition. The lower graph in Fig. 4 shows the output waveforms of the EATLS system. Again, waveforms WDI and WDO represent the voltage swing at the output of the transmitter and at the input of the receiver, respectively. We can see that the speed of the transition from $V_{dd}$ to $V_{dd}/2$ is greatly increased, and overshoots at the input of the wire are filtered out by the RC characteristic of the on-chip wire. Despite that, we can still reduce (or increase) the overshooting amplitude by adjusting the threshold value of inverter I3.



Fig. 7. Simulation circuit.

Note that transistor M3 also helps pull down the output voltage to the middle-rail supply, but its more important function is to reduce the amplitude of the overshoots and to restore the output voltage level back to  $V_{dd}/2$  if it overshoots.
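The dependence of the overshoot on the threshold of inverter I3 can be illustrated with a deliberately crude first-order model. The sketch below assumes a lumped load that discharges exponentially toward $V_{ss} = 0$ through M4 and a fixed delay between the output crossing the I3 threshold and M4 actually turning off. The time constant, the delay, the threshold values, and the function name are all hypothetical, and the restoring action of M3 is ignored.

```python
import math


def eatls_turn_off_voltage(vdd, v_threshold, rc, turn_off_delay):
    """Output voltage at the moment M4 stops conducting.

    Assumes V(t) = vdd * exp(-t / rc) while M4 is on (lumped RC load).
    """
    t_cross = rc * math.log(vdd / v_threshold)  # output reaches I3 threshold
    t_off = t_cross + turn_off_delay            # M4 turns off a little later
    return vdd * math.exp(-t_off / rc)


VDD = 3.3
RC = 1.0e-9      # hypothetical discharge time constant in seconds
DELAY = 0.1e-9   # hypothetical comparator plus gate delay in seconds

for v_th in (0.50 * VDD, 0.55 * VDD):
    v_off = eatls_turn_off_voltage(VDD, v_th, RC, DELAY)
    overshoot = VDD / 2 - v_off
    print(f"I3 threshold {v_th:.3f} V: overshoot {overshoot:.3f} V")
```

In this simplified model, raising the threshold of I3 makes M4 switch off earlier and so reduces the overshoot below $V_{dd}/2$, which is consistent with the tuning knob described above.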

Because the EATLS transmitter uses a full-swing transistor (M4) to pull the output from  $V_{dd}$  to  $V_{dd}/2$ , the energy stored in the output capacitor (the wire) is dissipated in the transistor during the transition. In the ATLS system, half of the stored energy is transferred back to the power supply. Thus, the power dissipation of the enhanced ATLS system equals

$$P_{\rm dyn} = C \cdot (V_{dd}/2)^2 \cdot f_L + C \cdot V_{dd}^2/2 \cdot f_H \tag{3}$$

where  $f_L$  is the frequency of low half-swing transitions and  $f_H$  is the frequency of high half-swing transitions. If  $f_H$  equals  $f_L$ , then an enhanced ATLS system has potentially 62.5% lower power consumption than a dual-rail signaling system (providing that switching frequency is the same, the acknowledge signal is ignored, and the transmitter and the receiver power dissipation is ignored).
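As a worked check of the 62.5% figure, one can assume that logic 0 and logic 1 symbols are equally frequent, so that $f_L = f_H = f/2$ for an overall symbol rate $f$ (this split is our reading of the condition $f_H = f_L$, not something stated explicitly). Substituting into (3) gives

$$P_{\rm dyn} = C \cdot \frac{V_{dd}^2}{4} \cdot \frac{f}{2} + C \cdot \frac{V_{dd}^2}{2} \cdot \frac{f}{2} = \frac{3}{8} \cdot C \cdot V_{dd}^2 \cdot f$$

Compared with the full-swing value $C \cdot V_{dd}^2 \cdot f$ from (1), the saving is $1 - 3/8 = 62.5\%$. Under the same assumption, both half-swing terms of the basic ATLS system cost $C \cdot (V_{dd}/2)^2$ per cycle, which recovers (2) and its 75% saving.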

Note that the enhanced ATLS system transmitter operates correctly only when the communication system follows the four-phase (return to zero) handshaking protocol. Furthermore, the transmitter has to be properly initialized before being used: after reset, node B has to be set to logic 1. One way to initialize the transmitter is to implement additional circuitry that pulls node B to logic 1 when a reset signal is applied; for example, a pMOS transistor connected between B and $V_{dd}$ with the active-low reset signal applied to its gate. During reset, input InL has to be kept low for the circuit to initialize properly.

#### III. TEST ARCHITECTURE AND QUALITY METRICS

As mentioned in the introduction, we compared a dual-rail signaling system and the ATLS system with respect to speed, power consumption, and reliability. The simulation circuit shown in Fig. 7 comprises two asynchronous pipeline stages connected with a model of a transmission system. "Dummy" gates are added to model the environment behavior. The stimuli generated at the input cause the transmission system to transmit one logic 0 and one logic 1 symbol with a maximum speed limited by the physical characteristics of the CMOS technology used in the simulation.

TABLE I Typical Noise Sources

| Parameter    | Definition                                                                              |
|--------------|-----------------------------------------------------------------------------------------|
| $K_C$        | crosstalk coupling coefficient                                                          |
| $Attn_C$ *   | crosstalk noise attenuation (0.2 for static driver)                                     |
| $K_{PS}$ *   | power supply noise due to signal switching (5% of $V_{DD}$ for single-ended switching)  |
|              | worst case: $K_N = K_C \cdot Attn_C + K_{PS}$                                           |
| RxO          | receiver input offset                                                                   |
| RxS          | receiver sensitivity                                                                    |
| PS *         | power supply noise (5% of $V_{DD}$)                                                     |
| $Attn_{PS}$  | power supply noise attenuation                                                          |
| TxO          | transmitter offset                                                                      |
|              | worst case: $V_{IN} = RxO + RxS + Attn_{PS} \cdot PS + TxO$                             |

To provide a fair comparison, the same environment and driving transistors were used for different transmission systems. Furthermore, full swing acknowledge signaling was used for both systems with the same wire length as for the data connection to simulate a real-life communication system.

We measured the period to define the speed of the communication systems. For a four-phase protocol, the period P involves the forward propagation of a valid data value, the reverse propagation of acknowledge, the forward propagation of empty data value, and the reverse propagation of acknowledge [2]. Since ATLS and EATLS systems have different periods when transmitting logic 1 and logic 0, both periods were measured and average results are presented.
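The period definition above amounts to a sum of four propagation delays, averaged over the two symbol values for the ternary systems. A minimal Python sketch, with hypothetical function names and made-up delay values that are not simulation results:

```python
def four_phase_period(t_valid, t_ack_set, t_empty, t_ack_reset):
    """Period of one four-phase cycle as the sum of its four phases."""
    return t_valid + t_ack_set + t_empty + t_ack_reset


def average_period(period_logic0, period_logic1):
    """Average of the periods measured for a logic 0 and a logic 1 symbol."""
    return (period_logic0 + period_logic1) / 2


# Placeholder delays in nanoseconds (illustrative only).
p0 = four_phase_period(1.0, 1.2, 1.1, 1.2)
p1 = four_phase_period(1.0, 1.2, 1.6, 1.2)  # slower empty phase after logic 1
print(average_period(p0, p1))
```

The second call uses a longer empty-codeword phase to mimic the slow $V_{dd}$ to $V_{dd}/2$ transition of the basic ATLS transmitter.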

To compare the three systems with respect to power dissipation, we measured the energy consumed by the transmission system. The measured values exclude the energy consumed by the acknowledge signals but include the energy consumption of the receiver to generate full-swing transition at the output (in the case of the ATLS system).

#### IV. ROBUSTNESS AND RELIABILITY

There are three main sources of noise that influence the reliability degradation of the signaling system: process variation, voltage supply noise, and crosstalk. To measure the reliability of our circuits we use the worst case analysis method presented in [9] and [10]. The noise sources are classified into two categories: the proportional noise sources and the independent noise sources

$$V_N = K_N \cdot V_S + V_{\rm IN}.\tag{4}$$

$K_N \cdot V_S$ represents those noise sources that are proportional to the amplitude of the signal swing ($V_S$), such as crosstalk and power supply noise induced by the signal. $V_{IN}$ consists of the noise sources that are independent of $V_S$, such as receiver input offset, receiver sensitivity, and signal-unrelated power supply noise. Table I summarizes the noise sources. The parameters marked with an asterisk (\*) were obtained from [9] or [10] and the rest were assessed by simulation. The worst-case *signal-to-noise ratio* (SNR), used to measure the reliability of the circuits, is defined as

$$SNR = 0.5 \cdot V_S / V_N. \tag{5}$$



Fig. 8. Period, energy, and energy-delay product versus wire length at constant supply voltage of 3.3 V.



Fig. 9. Period, energy, and energy-delay product versus supply voltage at 10-mm-wire length.

TABLE II NOISE ANALYSIS OF THE PROPOSED SYSTEMS

| System    | $V_S$ [V] | $K_C$ | $Attn_C$ | $K_{PS}$ | $K_N$ | $K_N V_S$ [V] | RxO [V] | RxS [V] | PS [V] | $Attn_{PS}$ | TxO [V] | $V_N$ [V] | SNR  |
|-----------|--------|----------------|-------------------|-----------------|----------------|---------------|---------|---------|--------|--------------------|---------|--------------------|------|
| Dual-rail | 3.3    | 0.29           | 0.2               | 0.05            | 0.11           | 0.35          | 0.14    | 0.15    | 0.16   | 0.52               | 0.00    | 0.73               | 2.25 |
| ATLS      | 1.65   | 0.29           | 0.2               | 0.05            | 0.11           | 0.18          | 0.11    | 0.01    | 0.16   | 0.45               | 0.08    | 0.45               | 1.82 |
| EATLS     | 1.65   | 0.29           | 0.2               | 0.05            | 0.11           | 0.18          | 0.11    | 0.01    | 0.16   | 0.45               | 0.08    | 0.45               | 1.82 |

#### V. RESULTS

All results and plots in this paper were generated using SPICE simulations for the 0.35- $\mu$ m VCMN4 process technology. The on-chip wire is a 0.7- $\mu$ m-wide (minimum width) single conductor in the same silicon process. The models of the wires used in our simulations were constructed from 0.5-mm segments. Values for resistance in ohms and capacitance in farads per millimeter length were obtained by postlayout extraction [8].

The first graph in Fig. 8 shows the period of the communication systems versus the length of the wire. The results confirm that the dual-rail signaling system is the fastest over the entire spectrum of wire lengths. This is expected since it consists of simple inverters. Furthermore, the graph also confirms that an enhanced version of the ATLS system is faster than the basic ATLS system. Although the dual-rail signaling system wins on speed, ATLS (and especially the enhanced version of ATLS) delivers over 70% higher bandwidth per wire on a long on-chip interconnection.
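The relation between period and bandwidth per wire can be made explicit. The sketch below assumes that only data wires are counted (two per bit for dual-rail, one per bit for ATLS) and that the acknowledge wire is ignored; whether the reported 70% figure uses exactly this accounting is an assumption, and the function names are hypothetical.

```python
def bandwidth_per_wire(period, data_wires_per_bit):
    """Bits per unit time per data wire for a link with the given period."""
    return 1.0 / (period * data_wires_per_bit)


def max_period_ratio(gain):
    """Largest ATLS/dual-rail period ratio that still yields `gain`
    (e.g. 0.7 for 70%) higher bandwidth per wire than dual-rail."""
    return 2.0 / (1.0 + gain)


# Equal periods would give ATLS twice the bandwidth per wire.
print(bandwidth_per_wire(1.0, 1) / bandwidth_per_wire(1.0, 2))

# A 70% advantage tolerates an ATLS period roughly 18% longer.
print(max_period_ratio(0.7))
```

In other words, under this accounting a ternary link can be noticeably slower than the dual-rail link and still deliver more bandwidth per routed data wire.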

The second graph in Fig. 8 shows the energy consumption of the system versus wire length. The reduced voltage swing enables the ATLS system to consume 50% less energy than the dual-rail signaling system to transmit data over a 10-mm-long on-chip wire. Furthermore, the ATLS system has better energy efficiency over the entire wire-length spectrum, while the EATLS system loses the advantage when the length of the wire is reduced to 2 mm, because the receiver consumes more energy than the transmitter can save. It should be noted that adjusting the overshooting amplitude of the transmitter can reduce the energy consumption of the EATLS system for shorter wires with very little loss of speed. In our simulations we used transmitters adapted for 10-mm on-chip wires. The third graph in Fig. 8 shows the overall performance of the systems. It is clear that the ATLS system is the system of choice with respect to energy-delay product, since it has more than 100% better performance than the dual-rail signaling system.

Although the EATLS system performs better than the ATLS system with respect to speed, its improvement has a negative effect on energy consumption. As shown in the graph, the increase in dissipated energy outweighs the improvement in speed. However, we should stress that the EATLS system improves performance only when transmitting a logic 1 (when the voltage on the wire swings from $V_{dd}/2$ to $V_{dd}$ and back) and that the results shown in the graphs present the average performance of the system.

To further compare the three systems, we conducted another set of simulations to determine how well they operate with a reduced supply voltage. We gradually reduced $V_{dd}$ and $V_{dd}/2$ to 2 and 1 V, respectively, and measured the period and energy consumption of the systems. The results show (Fig. 9) that the dual-rail system is still the fastest and that the ATLS system consumes less energy and is more energy-delay efficient while $V_{dd}$ is above 2.1 V. But as we further decrease the supply voltage, the period of the ATLS system increases rapidly. This is due to the fact that transistors M5 and M6 (M3 and M4) in the receiver (Fig. 5) do not have enough drive voltage applied to their gates to overcome transistors M7 and M8 (M1 and M2) to switch the output voltage of OUTH (OUTL). For 0.35-$\mu$m VCMN4 technology, the voltage swing has to be above 1 V for the ATLS system to operate efficiently. This is approximately 60% above the threshold of the pMOS transistor ($V_{tp} \approx 0.65$ V). If we consider a more modern process technology (0.18-$\mu$m with 1.8 V typical $V_{dd}$ and $V_{tp} \approx 0.45$ V), then the required voltage swing is around 0.75 V, which is 0.15 V less than typical $V_{dd}/2$.

Furthermore, the graphs show that the EATLS system performs better than the ATLS system as the voltage supply decreases. This is due to the fact that the EATLS transmitter provides much faster transitions from  $V_{dd}$  to  $V_{dd}/2$ , which speeds up the voltage conversion in the receiver.

We performed noise analysis for both systems. The crosstalk coupling coefficient  $K_C$  was obtained from a transient simulation of 10-mm parallel wires at minimal spacing where one wire was driven with a voltage step and the induced voltage was measured on the second wire. Since both systems use static single-ended signaling, the total noise coefficient  $K_N$  is the same. The receiver input offset was assessed by conducting dc voltage transform curve (VTC) simulations on all process corners [10]. The receiver sensitivity RxS and power supply attenuation coefficients  $Attn_{PS}$  were also derived from the VTC curves [10]. The transmitter offset TxO results from the reference supply noise (5% of the reference magnitude). Table II summarizes the results of the noise analysis and shows the SNR numbers.
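The worst-case analysis of (4) and (5), together with the two worst-case expressions in Table I, can be evaluated directly from the Table II entries. The Python sketch below does this; the function name `worst_case_snr` is hypothetical, and because the inputs are the rounded values printed in Table II, the results agree with the tabulated SNR only to within about 0.02.

```python
def worst_case_snr(v_s, k_c, attn_c, k_ps, rx_o, rx_s, ps, attn_ps, tx_o):
    """Worst-case SNR from equations (4) and (5) and the Table I definitions."""
    k_n = k_c * attn_c + k_ps                  # proportional noise coefficient
    v_in = rx_o + rx_s + attn_ps * ps + tx_o   # swing-independent noise
    v_n = k_n * v_s + v_in                     # equation (4)
    return 0.5 * v_s / v_n                     # equation (5)


# Rounded entries copied from Table II.
snr_dual_rail = worst_case_snr(3.3, 0.29, 0.2, 0.05, 0.14, 0.15, 0.16, 0.52, 0.00)
snr_atls = worst_case_snr(1.65, 0.29, 0.2, 0.05, 0.11, 0.01, 0.16, 0.45, 0.08)

print(f"dual-rail SNR:  {snr_dual_rail:.2f}")
print(f"ATLS/EATLS SNR: {snr_atls:.2f}")
```

The ATLS and EATLS rows of Table II are identical, so a single call covers both ternary systems.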

Both ternary logic signaling systems exhibit the same SNR with an 82% noise margin. This is expected since they differ only in the transmitter. Compared to full-swing dual-rail signaling, the noise margin of the ATLS system is noticeably lower, but considering the 50% reduction in voltage swing, the worst-case SNR is still well above 1.

As the voltage swing decreases, the swing-independent noise sources become more significant. This will become very important in deep submicrometer technologies, where the supply voltage is greatly reduced. To implement the ATLS system successfully in modern CMOS technologies, great care has to be taken when designing the power supply network, and device matching has to be implemented. Furthermore, full-swing wires should be well isolated from ternary logic signals to reduce crosstalk noise.

#### VI. APPLICABILITY

The idea for the ATLS system was inspired by the growing interest in the area of *systems-on-a-chip* (SoC) where various components and IP blocks are implemented on a single chip and interconnected by a network.

The ATLS system is not intended for conventional bidirectional on-chip buses such as MARBLE [8], but is ideally suited to unidirectional signaling in next generation self-timed networks on a chip such as CHAIN [10] where individual links can employ ATLS on a case-by-case basis.

#### VII. CONCLUSION

We have developed an ATLS system, combining a reduced voltage swing and the use of multivalued logic. The ATLS system has a clear advantage over classical full-swing transmission systems in terms of energy consumption and bandwidth per wire. The ATLS system enjoys a fully static design and has zero static power dissipation, further improving its power efficiency, but it does need a third supply rail and more complex transmitter and receiver circuits than the classical dual-rail system.

With the arrival of extremely deep submicrometer technologies with $V_{dd}$ less than 1 V, the ATLS system will clearly reach its operating limits, but with the use of low-threshold devices (for transistors M5-M8 in Fig. 6), as proposed in [11], those limits can be further stretched.

#### ACKNOWLEDGMENT

The authors would like to thank the reviewers for their helpful comments.

#### REFERENCES

- [1] J. B. Kuo and J. H. Lou, *Low-Voltage CMOS VLSI Circuits*. New York: Wiley, 1999.
- [2] J. Sparsø and S. Furber, Principles of Asynchronous Circuit Design: A System Perspective, Dordrecht, The Netherlands: Kluwer, 2001.
- [3] H. M. Aytac, "Ternary logic based on a novel MOS building block circuit," in *Proc. IEEE Int. Symp. Multiple-Valued Logic (ISMVL)*, May 1983, pp. 20–25.
- [4] P. Balla and A. Antoniou, "Low power dissipation MOS ternary logic family," *IEEE J. Solid-State Circuits*, vol. 19, pp. 739–749, Oct. 1984.
- [5] X. Wu and F. Prosser, "CMOS ternary logic circuits," in *Proc. Inst. Elect. Eng.*, vol. 137, Feb. 1990, pp. 21–27.
- [6] J. S. Wang, C. Y. Wu, and M. K. Tsai, "Low power dynamic ternary logic," *Proc. Inst. Elect. Eng.*, pt. G, vol. 135, no. 6, pp. 221–230, Dec. 1988.
- [7] R. Mariani, R. Roncella, R. Saletti, and P. Terreni, "On the realization of delay-insensitive asynchronous circuits with CMOS ternary logic," in *Proc. 3rd Int. Symp. Asynchronous Circuits (ASYNC'97)*, Eindhoven, The Netherlands, 1997, pp. 54–62.
- [8] W. J. Bainbridge, "Asynchronous System-on-Chip Interconnect," Ph.D. dissertation, Dept. Comput. Sci., Univ. Manchester, U.K., 2000.
- [9] W. Dally and J. Poulton, *Digital Systems Engineering*, Cambridge, U.K.: Cambridge Univ. Press, 1998.
- [10] W. J. Bainbridge and S. Furber, "CHAIN: A delay-insensitive chip area interconnect," *IEEE Micro*, vol. 22, pp. 16–23, Sept./Oct. 2002.
- [11] H. Zhang, V. George, and J. M. Rabaey, "Low-swing on-chip signaling techniques: Effectiveness and robustness," *IEEE Trans. VLSI Syst.*, vol. 8, pp. 264–272, June 2000.



**Tomaz Felicijan** received the B.S. degree in electronics from the University of Maribor, Slovenia, in 1998. He is currently working toward the Ph.D. degree in computer science at the University of Manchester, U.K.

His research interests include on-chip networks at the architecture and circuit level.



**Steve B. Furber** (M'99) received the B.A. degree in mathematics and the Ph.D. degree in aerodynamics, from the University of Cambridge, England, U.K., in 1974 and 1980, respectively.

Currently, he holds an endowed chair, and is a Professor of computer engineering, Department of Computer Science, University of Manchester, U.K. From 1980 to 1990, he was with the Hardware Development Group within the Research and Development Department, Acorn Computers, Ltd., Cambridge, U.K. and was a Principal Designer of the BBC Microcomputer and the ARM 32-bit RISC microprocessor, both of which earned Acorn Computers a Queen's Award for Technology. In 1990, when he joined the University of Manchester, he established the AMULET Research Group, which has interests in asynchronous logic design and power-efficient computing.

Prof. Furber is a Fellow of the Royal Society, the Royal Academy of Engineering, the British Computer Society, and is a Chartered Engineer.