
# Nuclear Instruments and Methods in Physics Research A

journal homepage: www.elsevier.com/locate/nima



# The characterization and application of a low resource FPGA-based time to digital converter <sup>☆</sup>



Alessandro Balla <sup>a</sup>, Matteo Mario Beretta <sup>a</sup>, Paolo Ciambrone <sup>a</sup>, Maurizio Gatta <sup>a</sup>, Francesco Gonnella <sup>a</sup>, Lorenzo Iafolla <sup>a,b,\*</sup>, Matteo Mascolo <sup>c,d</sup>, Roberto Messi <sup>c,d</sup>, Dario Moricciani <sup>c</sup>, Domenico Riondino <sup>a</sup>

- <sup>a</sup> National Laboratories of Frascati (LNF) of INFN, via E. Fermi 40, 00044 Frascati (RM), Italy
- <sup>b</sup> University of Rome "Tor Vergata" Electronic Engineering Department, Italy
- <sup>c</sup> Roma-2 Department of INFN, via della Ricerca Scientifica, 1, 00133 Rome, Italy
- <sup>d</sup> University of Rome "Tor Vergata" Physics Department, Italy

#### ARTICLE INFO

Article history:
Received 14 February 2013
Received in revised form 23 October 2013
Accepted 12 December 2013
Available online 2 January 2014

Keywords: DAΦNE, DAQ, FPGA, HEP, KLOE-2, TDC

#### ABSTRACT

Time to Digital Converters (TDCs) are very common devices in particle physics experiments. Many off-the-shelf TDCs can be employed, but the need for a custom DAta acQuisition (DAQ) system makes TDCs implemented on Field-Programmable Gate Arrays (FPGAs) desirable. Most of the architectures developed so far are based on tapped delay lines, with precision down to 10 ps obtained at the cost of high FPGA resource usage and non-linearity issues that must be managed. Often such precision is not necessary; in this case TDC architectures with low resource occupancy are preferable, since they allow data processing systems and other utilities to be implemented on the same device. In order to reconstruct $\gamma\gamma$ physics events tagged with the High Energy Tagger (HET) in KLOE-2 (K LOng Experiment 2), we need to measure the Time Of Flight (TOF) of the electrons and positrons from the KLOE-2 Interaction Point (IP) to our tagging stations (11 m apart). The required resolution must be better than the bunch spacing (2.7 ns). We have developed and implemented on a Xilinx Virtex-5 FPGA a 32-channel TDC with a precision of 255 ps and low non-linearity effects, along with an embedded data acquisition system and the interface to the online FARM of KLOE-2. The TDC is based on a low resource occupancy technique, the 4 × Oversampling technique, which in this work is pushed to its best resolution and whose performance was exhaustively measured.

© 2013 Elsevier B.V. All rights reserved.

#### 1. Introduction

Time to digital converters are widely used in high energy physics and in other fields of science and engineering. Since they are employed with many types of detectors and in many kinds of experiments, it is often necessary to make a trade-off between the most important features [1–3]: resolution, range, precision and nonlinearities, environment variation effects, dead time, readout speed and customizability of the DAQ. The last feature makes TDCs implemented on FPGAs very attractive; moreover, FPGA-based TDCs nowadays reach very good precision (down to a few picoseconds) [4–9]. Using an FPGA-based TDC requires further compromises, so one also needs to consider how many resources are needed to implement the TDC, how simple it is to correct nonlinearities, what limitations there are on the choice of the FPGA device, how simple it is to Place And Route (PAR, Section 3.6, [10]) the circuit in the FPGA lattice, etc. In a standard logic system the routing only has to meet the setup/hold times, so the typical routing requirements are of the order of the clock period (at best 1.8 ns in a Virtex-5); to reduce the nonlinearity effects of a TDC, instead, the typical requirements are of the order of its resolution (from a few picoseconds up to 1 ns). This makes the implementation of TDCs challenging.

We developed a readout system based on a TDC for the KLOE upgrade (KLOE-2, [11]). KLOE-2 is a particle physics experiment that studies kaon physics at the Double Annular $\Phi$ Factory for Nice Experiments (DA$\Phi$NE) accelerator, which stores and collides $e^-$ and $e^+$. For the upgrade we built a new pair of position detectors, the HETs [12,13], to study $\gamma\gamma$ physics. These are made of 29 plastic scintillators coupled with 29 photomultipliers; the signals coming from the photomultipliers are discriminated and shaped by custom front-end electronics. We need a readout system able to:

- (1) measure the arrival time of the discriminated signals with resolution better than 2.7 ns;
- (2) process data; and
- (3) interface with KLOE-2 trigger and acquisition systems.

All these requirements led us to examine various possible TDC architectures and to focus on those which allow us to implement on the same FPGA all the parts we need.

<sup>☆</sup>This work was supported by the Italian National Institute of Nuclear Physics (INFN).

<sup>\*</sup>Corresponding author at: LNF, via E. Fermi n. 40, 00044 Frascati (RM), Italy. Tel.: +39 3495095361.

E-mail addresses: lorenzo.iafolla@lnf.infn.it, lorenzo.iafolla@agi-tech.com (L. Iafolla).

Most works in the literature focus on improving the resolution and precision of TDCs, and the problem of the available resources is often not discussed. Commonly used techniques are based on Multitapped Delay Lines (there is a good review in Ref. [14] and some examples in Refs. [4–9,15]); the most advanced also provide methods for correcting the nonlinearities and improving the precision, and the best results are obtained with the wave union technique [4]. Although the precision of this technique is still unsurpassed for FPGA-based TDCs, it is sometimes unnecessary, as is the case for the HET detectors.

We therefore chose the $4 \times$ Oversampling technique [16–19], which is often used only for communication purposes rather than for time measurement. Its resolution is quite poor, but it offers other advantages: the most attractive, for our case study, are its low resource occupancy and the fact that it does not limit the selection of the device. We did not find a detailed study of its characteristics in the literature, especially when it is used as a TDC, so one is presented here. In the work described in this paper, we pushed the resolution of the $4 \times$ Oversampling technique to its limit by implementing a TDC which uses very few FPGA resources.

We decided to use a general purpose device with good performance, which can also be reused in different projects, because the choice of device makes no particular difference and our requirements were met anyway. This is thanks to the $4\times$ Oversampling technique, whose only special implementation requirement is the ability to generate four clock signals with the same frequency and successive $90^\circ$ phase shifts. Most devices contain some clock management block, so this requirement is very often satisfied. All the issues discussed here about the use of the $4\times$ Oversampling technique can be dealt with through the same considerations for any device one decides to use. Special care has to be taken in the "place and route" of the critical parts of the system (see Section 3.6): this has to be done by looking at the lattice of the device used.
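As a rough illustration of why four clock phases are enough, the following Python sketch models the sampling grid only: four clocks of period `t_clk_ps` shifted by 90° produce one sampling edge every quarter period, so an arrival time is quantized in steps of `t_clk_ps / 4`. This is a behavioural sketch under stated assumptions, not the firmware described in this paper; the function names and the 4000 ps clock period are hypothetical.

```python
def quantize_arrival(arrival_ps, t_clk_ps):
    """Quantize an arrival time onto the grid of four 90-degree clock phases.

    Returns (coarse, phase): the clock period index and the phase index
    (0, 1, 2, 3 for 0, 90, 180, 270 degrees) of the first sampling edge
    at or after the arrival. All times are integer picoseconds.
    """
    if t_clk_ps <= 0 or t_clk_ps % 4 != 0:
        raise ValueError("t_clk_ps must be a positive multiple of 4")
    if arrival_ps < 0:
        raise ValueError("arrival_ps must be non-negative")
    step = t_clk_ps // 4              # spacing between consecutive sampling edges
    edge = -(-arrival_ps // step)     # ceiling division: first edge >= arrival
    return divmod(edge, 4)


def measured_time_ps(coarse, phase, t_clk_ps):
    """Timestamp reconstructed from the coarse count and the phase index."""
    return coarse * t_clk_ps + phase * (t_clk_ps // 4)


if __name__ == "__main__":
    T_CLK_PS = 4000  # hypothetical clock period, not a value from the paper
    for arrival in (0, 2300, 4000, 4001):
        coarse, phase = quantize_arrival(arrival, T_CLK_PS)
        print(arrival, coarse, phase, measured_time_ps(coarse, phase, T_CLK_PS))
```

In this model the quantization error is always smaller than one quarter of the clock period, which is the sense in which the clock frequency sets the resolution of the technique.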

In the next section we introduce the techniques we used (Nutt interpolation and $4 \times$ Oversampling) and, for comparison, the Multitapped Delay Line technique, which is also the starting point for most architectures. Afterwards the characteristics of the TDC are examined.

#### 2. TDC architecture

### 2.1. The Nutt interpolation method

The simplest way to implement a TDC is to use a "coarse counter" that starts counting at the "start" pulse and stops at the "stop" pulse. To improve the resolution of such a converter the clock period $T_{\rm clk}$ must be decreased as much as possible: with a Virtex-5 FPGA this means a best resolution of about 2 ns.

The counter method has a peerless advantage: an almost unlimited range. Increasing the number $N_{\rm bit}$ of counter bits increases the range R of the TDC as well:

$$R = 2^{N_{\text{bit}}} T_{\text{clk}} \tag{1}$$

This is why the counter method is always used in combination with other techniques that improve its resolution: this is called the "Nutt interpolation method" or simply the "interpolation method" [1,2,20]. The time interval T to be measured is divided into three intervals (Fig. 1). One interval $\Delta t_{12}$ (which may be quite long) is measured in real time by the coarse counter; the remaining two short intervals, $\Delta t_1$ and $\Delta t_2$ (at the beginning and at the end of the interval *T*), are measured by two (or one) high resolution TDCs that are usually called interpolators (Fig. 2). Since the interpolator measures the time between the START/STOP pulse and the next positive edge of the clock, its range has to be just longer than one clock period (typically a few nanoseconds).

Fig. 1. Time diagram of the Nutt interpolation method.
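The arithmetic of the counter range in Eq. (1) and of the interpolation scheme described above can be sketched in a few lines of Python. The sketch assumes the usual sign convention for this scheme: each interpolator measures the time from its pulse to the next positive clock edge, so that $T = \Delta t_1 + \Delta t_{12} - \Delta t_2$ with $\Delta t_{12}$ an integer number of clock periods. Since Fig. 1 is not reproduced here, that convention, the function names and the numerical values are assumptions made for illustration.

```python
def counter_range_ps(n_bit, t_clk_ps):
    """Range of the coarse counter, R = 2**N_bit * T_clk (Eq. (1))."""
    return (2 ** n_bit) * t_clk_ps


def nutt_interval_ps(n_coarse, t_clk_ps, dt1_ps, dt2_ps):
    """Combine the coarse count with the two interpolator readings.

    n_coarse: clock periods counted between the first clock edge after
              START and the first clock edge after STOP (Delta t_12 / T_clk).
    dt1_ps:   START pulse to the next positive clock edge.
    dt2_ps:   STOP pulse to the next positive clock edge.
    """
    for dt in (dt1_ps, dt2_ps):
        if not 0 <= dt <= t_clk_ps:
            raise ValueError("interpolator reading outside one clock period")
    return dt1_ps + n_coarse * t_clk_ps - dt2_ps


if __name__ == "__main__":
    T_CLK_PS = 2500  # hypothetical 400 MHz clock
    # START at 1000 ps, STOP at 13700 ps: the following clock edges are at
    # 2500 ps and 15000 ps, so dt1 = 1500 ps, dt2 = 1300 ps, n_coarse = 5.
    print(nutt_interval_ps(5, T_CLK_PS, 1500, 1300))
    print(counter_range_ps(16, T_CLK_PS))
```

The example shows why the interpolators only need a range of about one clock period: the long part of the interval is carried entirely by the integer count.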

So, the main difference between TDC architectures lies in the interpolator. Since an FPGA can implement only digital circuits, the only interpolators we can use are digital ones. There are two main families of digital interpolators: the first is based on multitapped delay lines; the second is based on two oscillators with slightly different frequencies [21,22]. The second approach is, however, rarely used because of the dead time during the measurement. Even though the $4 \times$ Oversampling technique can be traced back to the multitapped delay line family, we will treat it as a third method because of its peculiarities.

### 2.2. The multitapped delay lines TDC

The most widely used interpolator architectures are based on multitapped delay lines and on multitapped delay lines in Vernier configuration [4–9,14,15]. In these cases the delay lines are fed with the STOP/START pulses and the outputs are sampled by flip-flops at the rising edge of the clock (Fig. 3). Decision logic then produces a measurement of the time between the positive edge of the clock and the input signal.
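A behavioural model of this sampling step may help. In the Python sketch below the input edge propagates along a chain of taps, the flip-flops capture which taps it has already reached when the clock edge arrives, and the decision logic counts them (a thermometer code). The tap delays and function names are hypothetical, and real decision logic must also handle effects this model ignores, such as metastability and bubbles in the code.

```python
from itertools import accumulate


def sample_taps(elapsed_ps, tap_delays_ps):
    """Flip-flop states captured at the clock edge.

    elapsed_ps:    time between the input edge entering the line and the
                   sampling clock edge.
    tap_delays_ps: delay contributed by each tap, in propagation order.
    A tap reads 1 if the input edge reached it before the clock edge.
    """
    reach_times = accumulate(tap_delays_ps)
    return [1 if t <= elapsed_ps else 0 for t in reach_times]


def decode_thermometer(bits):
    """Decision logic: number of taps the edge has passed."""
    return sum(bits)


if __name__ == "__main__":
    taps = [100] * 25  # idealized line: 25 equal taps of 100 ps each
    bits = sample_taps(730, taps)
    print(bits)
    print(decode_thermometer(bits) * 100, "ps (true value: 730 ps)")
```

With equal taps the decoded value is the elapsed time rounded down to a multiple of the tap delay; with unequal taps the same count corresponds to bins of different widths, which is the origin of the nonlinearities of this architecture.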

Since FPGAs are not designed for TDC implementation, it is not easy to realize a multitapped delay line. Even though some FPGAs (the Virtex-5, for example) contain embedded delay lines, these are not suitable for implementing architectures like the one shown in Fig. 3 because they have just one output. There are, however, other possible solutions. For example, the Configurable Logic Blocks (CLBs, or Slices in the Virtex FPGAs) have carry-chains for the implementation of adders: these chains are very fast and the delays between two outputs are very small (a few hundred picoseconds), so they can be used as multitapped delay lines. Unfortunately these delays are not equal to each other and some of them are much bigger than the others; moreover, the quantization step is not an integer fraction of the clock period. Both cause large nonlinearity effects. Each interpolator must have a range longer than 1 clock period, so the tapped delay line must be at least 1 clock period long. For example, since the delay time of the carry-chain in one Virtex-5 slice (from the input to the output) is about 100 ps, we need 25 slices to implement a 2.5 ns (400 MHz clock period) delay line. Besides, much additional logic is necessary to implement algorithms that correct the nonlinearity effects and improve the resolution [4–9]. Similar conclusions also hold for the other techniques used to implement multitapped delay lines.
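The resource estimate and the effect of unequal tap delays can be made concrete with a short Python sketch. The slice count reproduces the figures quoted above (2.5 ns line, about 100 ps per slice); the differential nonlinearity is computed with the common definition, the deviation of each bin width from the mean width in units of the mean, which is an assumption here rather than a definition taken from this paper. The bin widths in the example are invented for illustration.

```python
def slices_needed(line_length_ps, slice_delay_ps):
    """Smallest number of slices whose total delay covers the line length."""
    if line_length_ps <= 0 or slice_delay_ps <= 0:
        raise ValueError("lengths and delays must be positive")
    return -(-line_length_ps // slice_delay_ps)  # integer ceiling division


def dnl_lsb(bin_widths_ps):
    """Differential nonlinearity of each bin, in units of the mean bin width."""
    ideal = sum(bin_widths_ps) / len(bin_widths_ps)
    return [(w - ideal) / ideal for w in bin_widths_ps]


if __name__ == "__main__":
    # Figures from the text: 400 MHz clock (2500 ps), about 100 ps per slice.
    print(slices_needed(2500, 100))
    # Hypothetical unequal bin widths, not measured values.
    print(dnl_lsb([80, 120, 100, 100]))
```

A delay line with perfectly equal taps would give a DNL of zero in every bin; the larger the spread of the tap delays, the more correction logic is needed on top of the slices counted above.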
