# INTEGRATION, the VLSI journal

Elsevier. Journal homepage: www.elsevier.com/locate/vlsi

# A fast model for analysis and improvement of gate-level circuit reliability

Chunhong Chen\*, Ran Xiao

Department of Electrical and Computer Engineering, University of Windsor, Ontario, Canada N9B 3P4

#### ARTICLE INFO

Article history: Received 15 September 2014; received in revised form 16 January 2015; accepted 24 February 2015; available online 5 March 2015

Keywords: Equivalent reliability; Signal and reliability correlation; Reliability analysis and improvement

#### ABSTRACT

Reliability is becoming an increasingly critical issue in the design of modern integrated circuits, due to the continuous scaling of CMOS technology and emerging nano-scale devices. This paper presents a novel method for reliability analysis in combinational circuits with unreliable devices. By using the concept of equivalent reliability, the proposed method promises improvements in efficiency over state-of-the-art methods, while keeping a high level of accuracy, in estimating circuit reliability. This efficiency is achieved because this work utilizes input probabilities and gate reliabilities directly for reliability evaluation, instead of taking (or sampling) a huge number of input vectors as most existing approaches do. Simulation results on benchmark circuits show that our approach obtains a significant speedup over other existing methods, especially when the reliability evaluation is needed repeatedly for the same circuit in order to improve reliability in reliability-driven design applications.

© 2015 Elsevier B.V. All rights reserved.

#### 1. Introduction

The tremendous growth of the modern semiconductor industry has relied on the continuous shrinking of electronic device dimensions over decades.
As CMOS technology scales down further, circuit designers are facing new challenges including quantum effects, large power dissipation, and low reliability [1]. In particular, reliability has become an increasingly critical issue, partly due to low voltage/current thresholds, electromigration, and process variations (such as power supply variation and device mismatch). On the other hand, as CMOS devices reach their fundamental physical limits, non-conventional nanometer-scale electronic components and sophisticated architectures/technologies (such as single-electron tunneling technology, carbon nanotubes, and quantum-dot automata) have been studied and fabricated. These devices typically require low-temperature operation, are more sensitive to a variety of random noise sources, and thus are statistically less reliable than their CMOS counterparts [2]. This has led to significant interest in reliability analysis, and has motivated the investigation of reliability-oriented architectures built from unreliable components.

Generally speaking, reliability analysis for combinational circuits is computationally expensive, due to the exponentially growing number of input patterns required as well as the signal correlations that may be involved. It is understood that the task of determining the exact output reliabilities for arbitrary logic circuits cannot be solved in polynomial time. An alternative is either to use statistical approaches (such as Monte-Carlo simulation) or to resort to heuristic algorithms in order to make a reasonable tradeoff between accuracy and efficiency when evaluating the reliability of large-scale circuits. A brief review of previous work on reliability analysis is given in the next section.
This paper presents a fast model for gate-level reliability analysis and improvement using the concept of equivalent reliability, which is based on the observation that circuit output reliability results from the cumulative effects of all unreliable gates within the circuit. For any specific gate, an equivalent reliability can be used at its output to account for the effects of all errors caused by the gate itself as well as by its transitive fan-in cone. By calculating equivalent reliabilities recursively, gate by gate, throughout the circuit, the whole procedure of reliability analysis can be carried out more efficiently. This efficiency is achieved because this work utilizes the input probabilities and gate reliabilities directly for reliability propagation, instead of taking (or sampling) a huge number of input vectors as most existing approaches do. In order to maintain high accuracy, the proposed model captures the possible signal and reliability correlations in the original circuit by using correlation coefficients from its error-free version. Our simulation results show that the proposed approach provides a significant speedup over Monte-Carlo simulation as well as other available methods, while keeping a high level of accuracy. The efficiency of the proposed model is crucial in applications (e.g., circuit reliability improvement) where the reliability evaluation is needed repeatedly for large-scale circuits.

The remainder of the paper is organized as follows: Section 2 presents a brief overview of reliability analysis and related work. Section 3 describes the proposed reliability analysis model in detail. Section 4 shows simulation results. An application to reliability improvement is discussed in Section 5, and Section 6 concludes the paper.

<sup>\*</sup> Corresponding author. E-mail addresses: cchen@uwindsor.ca (C. Chen), xiao11@uwindsor.ca (R. Xiao).

#### 2. Background and prior work

The reliability of a logic signal is defined as the probability that its value is correct (i.e., equal to its error-free value). The signal may become unreliable due to errors of its driving gate and/or its input signals. If we use the classical von Neumann model [8] for gate errors, each gate can be associated independently with an error probability $\varepsilon$ (throughout the paper, the terms "error" and "failure" are used interchangeably). In other words, a gate is modeled as a binary symmetric channel, which can generate a bit flip (from $0 \rightarrow 1$ or $1 \rightarrow 0$) at its output (known as a von Neumann error) with the same error probability in both directions [5]. The resulting errors are mainly due to random noise and temporary environmental influences, rather than permanent physical defects.

At the physical level, there are many different sources of noise, such as crosstalk, thermal noise, and cosmic radiation. In order to represent these effects at the electrical level, a nominal voltage can be used, whose value is traditionally modeled by a Gaussian distribution. Thus, a gate failure probability can be defined as the probability that its nominal voltage exceeds a certain threshold value (i.e., the noise margin). In the real world, every gate $i$ in a circuit has an independent error probability $\varepsilon_i$ (or gate reliability $r_i = 1 - \varepsilon_i$), which is assumed to be localized and statistically stable. It is also assumed that any gate failure probability (or failure rate) is a constant within $[0, 0.5]$ (or $r_i \in [0.5, 1]$) regardless of its input signal values, although this may not always hold in some non-conventional circuits [2]. This assumption is made only for convenience of discussion; the proposed approach also applies to cases where gate failure probabilities do depend on input signal values, as will become clear later in the paper.
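The gate error model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the noise is taken to be zero-mean Gaussian with a one-sided threshold, and the names `gate_error_probability`, `noisy_gate`, and `empirical_flip_rate` are hypothetical.

```python
import math
import random


def gate_error_probability(noise_margin, sigma):
    """Tail probability P(V > noise_margin) for V ~ N(0, sigma^2).

    Assumption (not from the paper): zero-mean noise and a one-sided threshold.
    """
    return 0.5 * math.erfc(noise_margin / (sigma * math.sqrt(2.0)))


def noisy_gate(ideal_output, epsilon, rng):
    """Binary symmetric channel: flip the ideal output bit with probability epsilon."""
    flip = rng.random() < epsilon
    return ideal_output ^ 1 if flip else ideal_output


def empirical_flip_rate(epsilon, samples=100_000, seed=0):
    """Estimate how often a noisy 2-input AND gate differs from its ideal output."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(samples):
        a, b = rng.getrandbits(1), rng.getrandbits(1)
        ideal = a & b
        flips += noisy_gate(ideal, epsilon, rng) != ideal
    return flips / samples
```

With a non-negative noise margin, this tail probability never exceeds 0.5, which matches the range $[0, 0.5]$ assumed for $\varepsilon_i$; a zero margin gives exactly 0.5. Because the flip is applied regardless of the input values, the empirical flip rate approaches $\varepsilon$ for any input distribution.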
For any signal inside a circuit, one can generally consider asymmetric reliabilities, depending on its specific error-free value ("0" or "1"). In other words, the signal is associated with a reliability pair $\{r^0, r^1\}$, where $r^0$ (or $r^1$) represents the conditional probability of the signal being "0" (or "1") given that its error-free value is "0" (or "1"). More specifically, for any signal $s$, $r_s^0 = P\{s = s^* \mid s^* = "0"\}$ and $r_s^1 = P\{s = s^* \mid s^* = "1"\}$, where $s^*$ is the error-free version of $s$ (for the remainder of the paper, the symbol "\*" is used to indicate "error-free" when referring to signals, and the terms "error-free", "reliable" and "correct" are used interchangeably). Using the reliability pair instead of a single signal reliability allows us to capture the difference between $r^0$ and $r^1$ that arises from the asymmetric behavior of logic gates in terms of error propagation. For instance, for an error-free 2-input AND gate, if one of the inputs has its correct value "0", the output is a correct "0" regardless of the value or correctness of the other input. Thus, the output signal is more likely to be correct in this particular case, if its output value is meant to be "0" (i.e., $r^0 > r^1$).

In this work, unless otherwise stated, primary input signals are assumed to be reliable (i.e., their reliability is 1), and their probabilities are assumed to be independent. Also, the probability of signal $s$ is by default defined as the probability of the signal being logic "1", and is expressed as $P_s$ or $P\{s="1"\}$. The probability of $s$ being "0" is denoted by $P\{s="0"\}$, which is equal to $1 - P_s$. As "0" and "1" represent logic values, the quotation marks around these numbers are sometimes omitted for brevity.
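The AND-gate asymmetry described above can be checked numerically. The sketch below is a brute-force enumeration, not the ER model itself, and it assumes that the two inputs and their errors are mutually independent and that the gate flips its output symmetrically; the function names and the tuple layout `(r0, r1)` are hypothetical.

```python
from itertools import product


def _bit_prob(bit, p_one):
    """Probability that a binary variable with P(1) = p_one takes the value `bit`."""
    return p_one if bit else 1.0 - p_one


def and_output_reliability_pair(p_a, p_b, pair_a, pair_b, gate_r=1.0):
    """Return (r0, r1) at the output of a 2-input AND gate.

    p_a, p_b       : probabilities that the error-free inputs are 1 (strictly between 0 and 1).
    pair_a, pair_b : input reliability pairs (r0, r1), indexed by error-free value.
    gate_r         : gate reliability (1.0 means an error-free gate).
    Assumption: inputs, input errors, and the gate error are independent.
    """
    # joint[(error_free_output, actual_output)] accumulates probability mass.
    joint = {(s, v): 0.0 for s in (0, 1) for v in (0, 1)}
    for a_star, b_star, a, b in product((0, 1), repeat=4):
        p = _bit_prob(a_star, p_a) * _bit_prob(b_star, p_b)
        p *= pair_a[a_star] if a == a_star else 1.0 - pair_a[a_star]
        p *= pair_b[b_star] if b == b_star else 1.0 - pair_b[b_star]
        out_star = a_star & b_star   # output of the error-free circuit
        out = a & b                  # output before the gate's own error
        joint[(out_star, out)] += p * gate_r
        joint[(out_star, 1 - out)] += p * (1.0 - gate_r)
    r0 = joint[(0, 0)] / (joint[(0, 0)] + joint[(0, 1)])
    r1 = joint[(1, 1)] / (joint[(1, 1)] + joint[(1, 0)])
    return r0, r1
```

For example, with `p_a = p_b = 0.5`, both input pairs equal to `(0.9, 0.9)`, and an error-free gate, the enumeration gives $r^1 = 0.81$ and $r^0 = 2.81/3 \approx 0.937$, so $r^0 > r^1$ as argued in the text.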
The problem of reliability analysis for combinational circuits is stated briefly as follows: given the probabilities of the primary inputs and the individual gate reliabilities in the circuit, find the reliabilities of the individual primary outputs or, more specifically, the reliability pair $\left\{r_j^0, r_j^1\right\}$ for each primary output $F_j$ ($j = 1, 2, \ldots, m$, where $m$ is the number of primary outputs), where $r_j^0$ (or $r_j^1$) denotes the conditional probability that the $j$-th output is logic 0 (or 1) when its error-free value is meant to be 0 (or 1). Once $r_j^0$ and $r_j^1$ are found, the average reliability for the output is given by

$$r_j = P_j^* r_j^1 + (1 - P_j^*) r_j^0 \tag{1}$$

where $P_j^*$ is the probability of primary output $F_j$ (or, more exactly, $F_j^*$) when the circuit is error-free. The actual probability of the output (when the circuit is not error-free) can be calculated as

$$P_j = P_j^* r_j^1 + (1 - P_j^*)(1 - r_j^0) \tag{2}$$

As a typical statistical method, Monte-Carlo (MC) simulation [5] has been widely used for reliability evaluation. However, a large number of simulation runs are statistically required to reach a stable result, and it may take hours for the MC method to obtain good results for relatively large circuits. While improvements in efficiency can be made using speedup procedures (such as the non-Bernoulli sequences introduced in [9]), the computation time still grows exponentially with circuit size. Another drawback of simulation-based methods is a lack of flexibility, as the simulation process needs to be repeated for any change in gate failure rates.

Recently, some analytical methods for reliability calculation have been proposed, such as probabilistic gate models (PGMs) [1], probability transfer matrices (PTMs) [3], Bayesian networks [4], and the Boolean difference-based error calculator (BDEC) [6]. The PGM method works perfectly for small circuits or correlation-free circuits.
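As a small numerical aside on Eqs. (1) and (2) above, the following Python sketch evaluates both expressions for a single output and cross-checks them against a simple Monte-Carlo estimate. It is an illustration only: the function names, the sample count, and the example values are assumptions, not taken from the paper.

```python
import random


def average_reliability(p_star, r0, r1):
    """Eq. (1): average output reliability from the pair (r0, r1)."""
    return p_star * r1 + (1.0 - p_star) * r0


def actual_probability(p_star, r0, r1):
    """Eq. (2): probability that the (possibly erroneous) output is 1."""
    return p_star * r1 + (1.0 - p_star) * (1.0 - r0)


def monte_carlo_check(p_star, r0, r1, samples=200_000, seed=0):
    """Sample an output signal directly and estimate (reliability, P{output = 1})."""
    rng = random.Random(seed)
    correct = 0
    ones = 0
    for _ in range(samples):
        ideal = 1 if rng.random() < p_star else 0
        stays_correct = rng.random() < (r1 if ideal else r0)
        actual = ideal if stays_correct else 1 - ideal
        correct += actual == ideal
        ones += actual
    return correct / samples, ones / samples
```

For instance, with $P_j^* = 0.25$, $r_j^0 = 0.95$ and $r_j^1 = 0.8$, Eq. (1) gives $r_j = 0.9125$ and Eq. (2) gives $P_j = 0.2375$. The sampled estimates only approach these values as the number of samples grows, which hints at why simulation-based evaluation becomes costly when it must be repeated.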
The PTM and Bayesian network techniques can provide accurate results, but remain computation-intensive for large circuits with signal correlation, due to a runtime that is exponential in the number of reconvergent fanouts or to the demand for prohibitively large data storage in either probability transfer matrices or conditional probability tables. Since these methods try to exhaustively calculate joint probability distributions within a circuit, they are regarded as brute-force ways to solve an NP-hard problem. The BDEC is a fast gate-level probabilistic error propagation model, where only local reconvergent fanouts are considered, by so-called level collapsing. In [7,10], some hybrid methods were also investigated, combining exact analysis with probabilistic measures. However, these approaches suffer from unacceptably high errors in estimating the circuit reliability, because the signal correlation and/or reliability correlation within a circuit are not well captured.

The proposed method is designed for high efficiency while keeping good accuracy in reliability analysis. While many existing methods also use information on input probabilities and gate reliabilities to evaluate the output reliabilities, what sets the proposed model apart is that the output reliability of a gate can be expressed as a function of the reliabilities of its inputs and of the gate itself, as will be shown later in the paper. This means that there is no need to take a huge number of input samples, making the evaluation procedure much more efficient for large-scale circuits. This is particularly important for large circuits when performing reliability improvement, which requires repeated evaluation of output reliabilities (see Section 5 for details).

#### 3. Proposed reliability analysis: ER model

The proposed reliability model is based on the concept of *equivalent reliability* (ER), and is hence named the ER model. The main idea is to calculate the equivalent reliability at the output for a specific