

Available online at www.sciencedirect.com





www.elsevier.de/aeue

Int. J. Electron. Commun. (AEÜ) 62 (2008) 455-458

LETTER

### Scheduling algorithm for VOQ switches

Dusan Banovic\*, Igor Radusinovic

Department of Electrical Engineering, University of Montenegro, Bul. Petra Cetinjskog 2, 81000 Podgorica, Serbia and Montenegro

Received 15 September 2006; accepted 6 June 2007

#### Abstract

A variety of matching schemes for VOQ switches that provide high throughput for uniform traffic have been proposed. The dual round robin matching (DRRM) scheme has performance similar to iSLIP and lower implementation complexity. DRRM with exhaustive service (EDRRM) algorithm was created as modification of DRRM algorithm with goal to improve performance of DRRM algorithm for bursty and non-uniform traffic conditions. Under extremely unbalanced arrival traffic, an exhaustive service policy may lead to unfairness and starvation. This paper proposes matching scheme for VOQ switches that provides high throughput almost the same as EDRRM scheme and avoid unfairness and starvation under unbalanced traffic. © 2007 Elsevier GmbH. All rights reserved.

Keywords: Switching; Scheduling; VOQ simulator; Frame

#### 1. Introduction

Advancement in data transmission technology, particularly the deployment of optical technology, has provided high-transmission bandwidth in a cost-effective manner. Without faster packet switch designs, Internet cannot continue to scale up to higher data rates. One of the most used architectures for design of high-speed switches is crossbar. Crossbar switching fabric is interesting for implementation because of their non-blocking capability and simplicity. Switching of fixed length packets is a widely accepted method for design of high-speed packet switches. Packet switching based on input queuing is desirable for design of high-speed switches because of smallest memory and internal operation speed requirements. But on the other side, it is well known that IO switch is throughput limited (58, 6%) [1]. The reason for this critical drawback is the head of line (HOL) blocking effect. Many techniques have been suggested for HOL reduction and elimination.

\* Corresponding author.

E-mail address: dusan.banovic@cg.yu (D. Banovic).

One of this techniques, called virtual output queueing (VOQ), completely overcame HOL blocking effect without request for higher internal operation speed factor. Because of that, VOQ crossbar switch is very interesting for designing high-speed switches. In VOQ solution (shown in Fig. 1 [2]) there is separate queue for each of N outputs.

VOQ crossbar switch is configured using scheduling algorithms. Weighted algorithms consider request weights when making scheduling decisions, while non-weighted algorithms do not. Examples of weighted algorithms are maximum weight matching (MWM) algorithms longest queue first (LQF) [3], oldest cell first (OCF) [3] and longest port first (LPF) [2]) and MWM algorithms (iLQF [3], iOCF [3] and iLPF [2]). They give switch good performance for all traffic patterns if arrivals are independent and traffic is admissible, but their complexity is too high to be considered for practical implementation. In addition to being complex, the existence of comparators adds additional complexity to the weighted algorithms (even with the fastest implementation, the two-input integer comparator still takes  $O(\log b)$ time units to complete the comparison, where b is number of bits for weight parameters presentation (queue length or

<sup>1434-8411/\$ -</sup> see front matter © 2007 Elsevier GmbH. All rights reserved. doi:10.1016/j.aeue.2007.06.004



Fig. 1. VOQ  $N \times N$  crossbar switch.

number of HOL cells)). On the other side, examples of non-weighted algorithms are MSM algorithms. These algorithms maximize the number of connected inputs and outputs in each time slot and maximize instantaneous allocation of bandwidth. The most efficient MSM algorithm has complexity  $O(N^{2.5})$  [4]. Because of this it is only of theoretical value. There are a number of algorithms which approximate MSM algorithms. Those algorithms are simple and fast to implement in high-speed switches. The group of algorithms which can find practical implementation in highspeed switches are: iSLIP [2], DRRM [5], EDRRM [6], u-FORM [7], etc. This paper is organized as follows: Section 2 describes the proposed matching scheme for VOQ crossbar switches. Section 3 gives performance comparison between scheduling algorithms for different traffic conditions.

#### 2. FDRRM matching scheme

In the DRRM scheme [5], each input selects one nonempty VOQ by round robin, and each output accepts one of the multiple requests it receives, also in round robin order. When an input and an output are matched, only one cell is transferred from the input to the matched output. In EDRRM [6], pointers are updated in a different way from DRRM. In a time slot if an input is matched to an output, one cell in the corresponding VOQ will be transferred. After that, if VOQ becomes empty, the input will update its arbiter pointer to the next location in the fixed order; otherwise pointer will remain at the current VOQ, so that a request will be sent to the same output in the next time slot. Compared to DRRM and iSLIP, EDRRM has higher throughput under non-uniform traffic. The delay of EDRRM is less sensitive to traffic burstiness. Under extremely unbalanced arrival traffic, an exhaustive service policy may lead to unfairness and starvation. We propose matching scheme for VOQ switches that provides high throughput almost the same as EDRRM scheme and avoids unfairness and starvation under unbalanced traffic. Frame-based round Robin matching (FDRRM) scheme is based on round robin matching selection and frame-based concept. A frame is comprised by one or more cells that can be considered for eligible matching [8]. The frame size associated with a VOQ is determined by the occupancy of the queue at the time when the first cell of the frame has been matched. For each VOQ there is frame size counter  $C_{i,i}(t)$ . The frame size counter is equivalent to the occupancy of a VOQ at a given time t. At the time  $t_a$  of matching the first cell of the frame associated with  $VOQ_{i,i}$  a framesize counter has value  $C_{i,j}(t)$ . Cells arriving to VOQ<sub>i,j</sub> at time  $t_b$ , where  $t_b > t_a$  are not considered for matching until the current frame is totally served and a VOQ is selected the next time in a round robin schedule.  $C_{i,j}(t)$  decreases its count by one each time a cell is matched. The input pointer is incremented (modulo N) to one position beyond requesting output if request is not granted or if the last cell of granted frame is served. A detailed description of the two-step FDRRM algorithm follows:

- Each input sends request for service corresponding to the first non-empty VOQ in a fixed, round robin schedule, starting from the current position of input pointer. The input pointer is incremented (modulo N) to one position beyond requesting output if request is not granted in phase 2 or if the last cell of granted frame is served. The frame size associated with a VOQ is determined by the occupancy of the queue at the time when the first cell of the frame has been matched. At the time t of matching the first cell of the frame associated with VOQ<sub>*i*</sub>, *j* a frame-size counter has value  $C_{i,j}(t)$ .  $C_{i,j}(t)$  decreases its count by one each time a cell is matched until the last cell of frame is served. If request is granted in phase 2 pointer remains at same position.
- If an output receives one or more requests, it chooses the one that appears the next in fixed round robin schedule starting from the current position of the pointer. The pointer is moved to this position. The output notifies each input whether or not its request was granted. The pointer of the output arbiter remains at the granted input. If there are no requests, pointer remains at same position.

The implementation complexity of FDRRM is almost identical to that of DRRM and EDRRM. Only hardware addition is one counter. The length of each control message is 1/Nth of that in iSLIP. This makes FDRRM an efficient and implementable scheme.

# **3.** Results of performance analysis of VOQ switches

For performance analysis of VOQ switches we use the second version of software tool, we called VOQ simulator [9]. In this paper are presented some of these results for iSLIP, DRRM, EDRRM, and FDRRM algorithms for specified traffic conditions (uniform, bursty, and non-uniform). We considered  $32 \times 32$  VOQ crossbar switch for this purpose. Fig. 2 illustrates performance (average delay depending on

Download English Version:

## https://daneshyari.com/en/article/445337

Download Persian Version:

https://daneshyari.com/article/445337

Daneshyari.com