
INTEGRATION the VLSI journal xx (xxxx) xxxx-xxxx




## INTEGRATION, the VLSI journal



journal homepage: www.elsevier.com/locate/vlsi

# Threshold adjustment of receiver chip to achieve a data rate > 66 Gbit/sec in point to point interconnect

Alak Majumder<sup>a,\*</sup>, Abir J. Mondal<sup>a</sup>, Bidyut K. Bhattacharyya<sup>b</sup>

<sup>a</sup> Department of Electronics & Communication Engineering, National Institute of Technology Arunachal Pradesh, Yupia 791112, India
<sup>b</sup> Department of Electrical Engineering, National Institute of Technology Agartala, Jirania 799046, India

#### ARTICLE INFO

Keywords: Driver; Receiver; Data rate; ISI noise; Threshold voltage; Design margin

#### ABSTRACT

In multi-Gbps chip-to-chip signalling, transmitter clock jitter limits the maximum achievable data rate for a required Bit Error Rate (BER). If the channel between transmitter and receiver is lossy and noisy, the waveform detected at the receiver chip differs completely from the driven signal because of the properties of the copper interconnect. Reflection noise, generated at various points of the channel by impedance mismatch, makes the signal very difficult to identify at the receiver. It is therefore challenging to send high-frequency signals (pulse width <100 psec, corresponding to a data rate >10 Gbit/sec) between two chips in a manufacturable environment for high-volume products with varying process parameters. In this work, we discuss two methods of defining the threshold voltage of the receiver chip that make it possible to send a square pulse of width 15 picoseconds or less (corresponding to 66 Gbit/sec or more) over an unmatched lossy channel. The receiver chip in the proposed method uses a comparator fed by the received oscillatory signal (Signal-A) and an RC-delayed version of it (Signal-B). To avoid needless switching at the output, we also incorporate a conventional comparator hysteresis loop, so that the feedback path controls the threshold depending on the comparator output.

#### 1. Introduction

Continuous advancement in the VLSI domain has led to highly complex chips with a huge number of interconnections integrating the components. The demand for ultra-small chips with higher speed has pushed the research community towards multilayer and multilevel interconnections, which play a vital role in determining the delay, power consumption and operating frequency of high-speed digital systems [1]. The performance of many digital systems is limited by insufficient interconnection bandwidth between chips, packages, boards and cabinets. Recent developments in video applications, along with the growth in data traffic, have raised the demand for high data rates in computer servers. This ever-increasing need requires data transmission from CPU to CPU over the backplane of the server using point-to-point interconnects [2]. The first work on point-to-point interconnect was done in 1992 by Anna Madrid et al. [3] from Intel Corporation. The same concept [3] is used today by Intel for sending signals between two chips. Intel introduced a developed version of the point-to-point interconnect, known as the QuickPath Interconnect (QPI) [4], around mid-2008. Even though point-to-point interconnect is one of the best ways to send high-frequency signals between two chips, impedance mismatches at various locations in the channel leave the signal at the receiver noisy, which limits high-frequency data transfer. Even today, the maximum data rate is no more than 7 Gbit/sec to 10 Gbit/sec.

The QuickPath Interconnect has increased signalling speed considerably, but the problem of ISI (Inter Symbol Interference) noise still exists. ISI arises because, at the receiving end, a signal has not fully decayed by the time the next signal is sent from the transmitter [5–8]. It sometimes leads to overdesign of a system in order to eliminate an error that is purely a circuit simulation error. The cost of the product then increases to compensate for an error that does not even exist in reality. Mathematically, the received voltage inside the receiver chip is expressed in Eq. (1). The voltage at the receiver chip at time $t=0$ depends on the voltage received at $t=0$ and on the residual voltage from the signals sent by the transmitter at earlier times. Here, we assume that the data sent by the transmitter at every interval $\tau$ has a magnitude of $V_0$. The value of $V_0$ is either zero or some finite voltage, for logic 0 or logic 1 respectively.

\* Corresponding author.

E-mail address: majumder.alak@gmail.com (A. Majumder).

http://dx.doi.org/10.1016/j.vlsi.2016.11.004 Received 10 April 2016; Received in revised form 15 November 2016; Accepted 21 November 2016 Available online xxxx 0167-9260/ © 2016 Elsevier B.V. All rights reserved.

$$V_R(0) = V_{R0}(0) + V_{R0}(\tau) + V_{R0}(2\tau) + \dots + V_{R0}(N\tau) = \sum_{n=0}^{N} V_{R0}(n\tau) \tag{1}$$

where $V_{R0}(n\tau)$ is the voltage at the receiver at time $t = n\tau$ when a square pulse of width $\tau$ was driven by the transmitter at time $t = -n\tau$. If the transmitter sends a square pulse of width $\tau$, then $V_{R0}(t)$ at the receiver can be written as shown in Eq. (2). To restate, $V_{R0}(n\tau)$ ($n = 0, 1, 2, \dots, N$) is the value of the signal present at the receiver at time $t=0$ due to the signal sent by the transmitter at time $t = -n\tau$. $V_R(0)$ is the voltage measured at the receiver at time $t=0$, given that many signals were sent before $t=0$ at intervals of $\tau$ seconds.

$$V_{R0}(n\tau) = V_{n0}H(|n\tau|) \tag{2}$$

The objective of a high-speed signalling scheme is to determine the value of $V_R(0)$ at time $t = -n\tau$ ($n = 0, 1, 2, \dots, N$), which is either 0 for logic 0 or some finite positive voltage for logic 1. This can be done by substituting Eq. (2) into Eq. (1) to determine the value of $V_{R0}$ for the various times before $t=0$.
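The superposition in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The exponential response `example_response`, its decay constant, and the bit pattern are hypothetical choices made only for illustration; they are not the channel response of this work.

```python
import math

def received_voltage(bits, h, tau):
    """Eqs. (1)-(2): V_R(0) = sum over n of V_n0 * H(n * tau).

    bits[n] is the amplitude V_n0 driven at t = -n * tau (bits[0] is the
    most recent bit); h is a single-pulse channel response H(t).
    """
    return sum(v * h(n * tau) for n, v in enumerate(bits))

def example_response(t, decay=30e-12):
    """Hypothetical H(t): a simple exponential tail (an assumption)."""
    return math.exp(-t / decay)

tau = 15e-12  # pulse width of 15 psec

# Voltage at t = 0 from the current bit alone, and with three earlier bits.
isolated = received_voltage([1.0], example_response, tau)
with_history = received_voltage([1.0, 1.0, 0.0, 1.0], example_response, tau)

# The excess over the isolated pulse is the ISI contribution at t = 0.
isi = with_history - isolated
print(f"isolated = {isolated:.3f} V, with history = {with_history:.3f} V, ISI = {isi:.3f} V")
```

With a slowly decaying response, the tails of earlier pulses add a sizeable offset to the sample at $t=0$, which is the effect the receiver has to tolerate or remove.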

If the channel behaves differently because of crosstalk and mutual coupling from multiple lines switching around the victim line, it is almost impossible to detect the digital output from the distorted signal at the receiver. In that case, $H(n\tau)$ depends on the switching conditions of the surrounding signals. Companies such as Agilent Technologies [9] have developed equipment based on Decision Feedback Equalization (DFE) and Feed-Forward Equalization (FFE) to reconstruct the original signals, but that process is still limited to a 10 Gbit/sec data rate in present-day server designs. In 2002, Bryan et al. [10] described an efficient method for multi-Gbps chip-to-chip signalling that accounts for ISI, crosstalk and echoes as well as circuit effects such as thermal noise and jitter. Ho et al. [11] showed that common-mode signalling can be used effectively to create a backchannel communication path over the existing pair of wires for a self-contained adaptive differential high-speed link transceiver cell. Their measured results indicate that this backchannel achieves reliable communication without noticeable impact on the forward link for bandwidths up to 50 MHz and swings of 20-100 mV. The paper [12] shows multi-tap equalization at both the transmitter and the receiver to reach the highest operating frequency, using Pulse Amplitude Modulation with 2-level and 4-level voltage modulation. A bit-error rate below $10^{-15}$ and a power of 40 mW/Gb/s were measured when operating over a 20 inch backplane with two connectors at a 10 Gbit/s data rate. In the paper [13], the authors compared the voltages at the receiving end of a lossy transmission line for an equalized pulse and a sinusoidal input voltage. They showed that, for a transmission line with a given total system loss, equalization stops working beyond a certain frequency, because continuous equalization reduces the signal amplitude at the receiving end. In 2004, Bhattacharyya et al. [14] showed that some minor modifications to the system design can lead to a 25 Gbit/sec data rate, provided one can sample signals of only a few mV at the receiver. They also pointed out that this is possible for a 24" channel made of copper. That was the first paper to describe that it is possible to go much higher in frequency and achieve a data rate >25 Gbit/sec using copper interconnect. Within four months, Lucent Technology from England demonstrated that one can indeed sample 25 Gbit/sec data in the present system with a 24" line length [15]. Bhattacharyya et al. [16] also observed that all simulation tools generate small errors in the output voltages (as low as 1 µV out of 1 V), which may get magnified when computing the peak distortion analysis (PDA) used to estimate the maximum magnitude of ISI noise and the bit error rate, as if those simulation errors were real Inter Symbol Interference (ISI). This was also corrected in the same article.

From the above discussion, it can be summarized that it is indeed possible to achieve a data rate of 25 Gbit/sec or more ($\tau$ = 40 psec or less) over conventional channels (PCB, sockets, packages etc.) of 24" line length made of copper. The objective of this work is to understand, integrate and partially bridge the gaps in the above work, and to see whether we can generate a 15 psec pulse inside the receiver when the driver drives a single pulse of width $\tau$ close to 15 picoseconds, which corresponds to 66.67 Gbit/sec. We also show a methodology for defining the threshold voltage ($V_T$) of the receiver chip so that a received oscillatory signal of small amplitude can turn the receiver chip ON. The work is done over an unmatched channel. This avoids the computational work of Eqs. (1) and (2) using DFE and FFE, as developed by Agilent Inc. As an example, if the transmitter sends a 15 psec pulse at 1 V, then at the receiver the pulse width may increase to 0.5-1 nsec because of the channel characteristics, and the pulse height may be reduced to 100 mV. This paper presents a simple RC delay circuit that allows us to obtain the full voltage swing inside the receiver, which in turn can be used to understand the silicon interconnect and gate responses inside the chip for picosecond pulses.
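The pulse widths and data rates quoted in this section are related by one bit per pulse width. A short Python check, where the helper name `data_rate_gbps` is purely illustrative:

```python
def data_rate_gbps(pulse_width_ps):
    """One bit per pulse width: rate [Gbit/sec] = 1000 / tau [psec]."""
    return 1000.0 / pulse_width_ps

# Pulse widths mentioned in the text: 100 psec, 40 psec and 15 psec.
for tau_ps in (100, 40, 15):
    print(f"tau = {tau_ps} psec -> {data_rate_gbps(tau_ps):.2f} Gbit/sec")
```

This reproduces the 10, 25 and 66.67 Gbit/sec figures used above.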

#### 2. Proposed design methodology

The development of CPUs has been guided by the famous Moore's law, which states that performance doubles, and the feature size shrinks by one generation, roughly every one and a half years. This has led the research community to consider interconnect lengths comparable to the wavelength of the driven signal. On the other hand, the decrease in feature size gives rise to coupling and crosstalk, leading to problems such as Electromagnetic Compatibility (EMC) and Electromagnetic Interference (EMI) [17]. Fig. 1 shows the proposed design methodology for connecting the driver to the receiver chip. The new process takes the received signal, distorted by the lossy channel, and divides it into two parts inside Chip-2 at the receiver, right after the bond pad. Signal-A ($V(t)$) drives the non-inverting terminal of the comparator, whereas Signal-B ($V(t - \tau)$) drives the inverting terminal. Under that condition, the output of the comparator inside Chip-2 will be exactly the same as the pulse driven by Chip-1, provided the threshold voltage of the comparator is designed to be slightly greater than the second maximum voltage swing of the difference in voltage ($\phi(t)$) between Signal-A and Signal-B. The difference ($\alpha$) between the first peak and the second peak is the actual signal strength available, after the noisy and lossy channel, before the signal enters the comparator. In addition, to compensate for the maximum voltage swing of any oscillations at the receiver, we use a conventional comparator hysteresis loop [18] at Chip-2 during the time $\tau < t < 2\tau$, where $\tau$ is the driven pulse width. This is what is new in our design process and methodology for generating a desired pulse width at the receiver, and it is discussed in Section 3.2. The hysteresis also prevents the comparator output from tripping after the pulse has been reconstructed. The hysteresis is included because the actual voltage difference between Signal-A and Signal-B is hard to know in real life.
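The receiver idea described above can be sketched numerically in Python. Everything specific in this sketch is an assumption made for illustration: the damped-oscillation waveform standing in for Signal-A, the 5 psec RC constant, the first-order lag used as the "RC delayed" Signal-B, the 25% margin above the second peak, and the 0 V lower threshold. It is not the authors' circuit or their simulation set-up.

```python
import math

DT = 0.1e-12      # simulation step of 0.1 psec (assumed)
N_STEPS = 2000    # 200 psec of simulated time
RC = 5e-12        # hypothetical RC constant that produces Signal-B

def received_signal(t):
    """Hypothetical Signal-A: a 100 mV damped oscillation at the bond pad."""
    return 0.1 * math.exp(-t / 40e-12) * math.sin(2 * math.pi * t / 60e-12)

# Signal-A and its RC-delayed copy Signal-B (forward-Euler first-order lag).
sig_a = [received_signal(k * DT) for k in range(N_STEPS)]
sig_b = [0.0]
for k in range(1, N_STEPS):
    sig_b.append(sig_b[-1] + (DT / RC) * (sig_a[k - 1] - sig_b[-1]))

# phi(t): differential voltage seen by the comparator.
phi = [a - b for a, b in zip(sig_a, sig_b)]

# Positive local maxima of phi, largest first.
peaks = sorted(
    (phi[k] for k in range(1, N_STEPS - 1)
     if phi[k - 1] < phi[k] >= phi[k + 1] and phi[k] > 0),
    reverse=True,
)
first_peak, second_peak = peaks[0], peaks[1]
alpha = first_peak - second_peak          # usable signal strength

# Upper threshold slightly above the second peak; lower threshold at 0 V.
v_th_high = second_peak + 0.25 * alpha    # the 25% margin is an assumption
v_th_low = 0.0

# Comparator with hysteresis: the active threshold depends on the output.
out, state = [], 0
for v in phi:
    if state == 0 and v > v_th_high:
        state = 1
    elif state == 1 and v < v_th_low:
        state = 0
    out.append(state)

rising_edges = sum(1 for k in range(1, N_STEPS) if out[k - 1] == 0 and out[k] == 1)
print(f"first peak {first_peak * 1e3:.1f} mV, second peak {second_peak * 1e3:.1f} mV")
print(f"rising edges at the comparator output: {rising_edges}")
```

Because the upper threshold sits between the two largest peaks of $\phi(t)$, only the first swing can switch the output high, and the later ringing stays below the threshold once the output has returned low.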

Fig. 2 shows the experimental set-up of the interconnection between the two chips. The transmission line parameters are assumed to be $R = 100\ \mathrm{m\Omega}$, $L = 250\ \mathrm{pH}$ and $C = 100\ \mathrm{fF}$ for each segment. The present channel comprises about 30 such RLC segments. We have also



Fig. 1. Proposed design methodology of connecting two chips.
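As a rough orientation for the per-segment R, L and C values quoted for the channel, the Python sketch below derives a few line quantities using the standard lossless-line approximations $Z_0 = \sqrt{L/C}$ and a per-segment delay of $\sqrt{LC}$. Taking exactly 30 segments is an assumption (the text says "about 30"), and these derived numbers are not reported in the source.

```python
import math

R_SEG = 100e-3    # ohm per segment (from the text)
L_SEG = 250e-12   # henry per segment (from the text)
C_SEG = 100e-15   # farad per segment (from the text)
N_SEG = 30        # "about 30" in the text; exactly 30 is assumed here

z0 = math.sqrt(L_SEG / C_SEG)       # lossless characteristic impedance
t_seg = math.sqrt(L_SEG * C_SEG)    # propagation delay of one segment
t_line = N_SEG * t_seg              # end-to-end delay of the ladder
r_total = N_SEG * R_SEG             # total series resistance

print(f"Z0 ~ {z0:.1f} ohm")
print(f"delay ~ {t_seg * 1e12:.1f} psec per segment, {t_line * 1e12:.0f} psec in total")
print(f"series resistance ~ {r_total:.1f} ohm")
```

Under these assumptions the ladder behaves like a line of about 50 Ω with roughly 5 psec of delay per segment, 150 psec end to end, and 3 Ω of series resistance.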
