## Accepted Manuscript

Exploration of a Scalable and Power-efficient Asynchronous Network-on-Chip with Dynamic Resource Allocation

Charles Effiong, Gilles Sassatelli, Abdoulaye Gamatie

PII: S0141-9331(18)30047-4

DOI: 10.1016/j.micpro.2018.05.003

Reference: MICPRO 2684

To appear in: Microprocessors and Microsystems

Received date: 5 February 2018

Accepted date: 7 May 2018

Please cite this article as: Charles Effiong, Gilles Sassatelli, Abdoulaye Gamatie, Exploration of a Scalable and Power-efficient Asynchronous Network-on-Chip with Dynamic Resource Allocation, *Microprocessors and Microsystems* (2018), doi: 10.1016/j.micpro.2018.05.003

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.



### Exploration of a Scalable and Power-efficient Asynchronous Network-on-Chip with Dynamic Resource Allocation

Charles Effiong, Gilles Sassatelli, Abdoulaye Gamatie

Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS - Université de Montpellier

Email: firstname.lastname@lirmm.fr

#### Abstract

Networks-on-Chip (NoCs) are now used to provide inter-core communication in manycore Systems-on-Chip (SoCs), because traditional on-chip interconnects do not scale with an increasing number of cores. Typical NoCs dedicate a set of buffers to their input and/or output ports. This can lead to buffer under-utilization for applications with non-uniform traffic characteristics. To improve buffer utilization for performance gains and energy efficiency, we have proposed the *Roundabout-NoC* (*R-NoC*) concept. *R-NoC* is inspired by real-life multi-lane traffic roundabouts. It consists of lanes shared by multiple input/output ports, so that data-flows from multiple input ports can exploit the same buffers for performance gains. This work extends the evaluation of the asynchronous *R-NoC*, named *R-NoC-A*. The router is evaluated using 45 nm CMOS technology. The *R-NoC-A* router achieves a throughput of 465 *M flit/sec* and a network saturation throughput of 129 *Gbps* on a 4x4 mesh network. In terms of performance, area and power consumption, the *R-NoC-A* results are highly competitive with existing solutions (synchronous and asynchronous). It provides good topological trade-offs for significantly improving network performance without a corresponding area overhead.

Keywords: Network-on-Chip, roundabout, asynchronous design, buffer-sharing, router, exploration, power, performance.

#### 1. Introduction

The demand for better performance and more power-efficient computing systems has led to the transition from single-core to manycore computing systems [1, 2]. Manycore systems can deliver faster computation due to a higher level of parallelism. In place of traditional bus/crossbar interconnects, the Network-on-Chip (NoC) has emerged as the de facto on-chip interconnect for manycore systems [1]. In manycore computing systems, the NoC is a dominant factor, influencing overall system performance and power dissipation [3]. This calls for NoCs that can deliver more scalable and power-efficient communication in manycore systems.

Fig. 1 shows a mesh network topology and the architecture of a typical input-buffered router. The network consists of multiple routers connected by point-to-point data-links. Each router is responsible for routing packets from source node to destination node in the network. Most typical NoC routers are composed of input and/or output buffers. The buffers are used to temporarily store packets that cannot advance to their desired output ports due to network contention. NoC router buffers have been shown to strongly influence area [4], energy/power [5] and network performance [4]. On the other hand, previous works [4, 6] showed that buffers are often poorly utilized (i.e. idle or under-utilized), especially when executing applications with non-uniform traffic characteristics. This is because, as shown in Fig. 1, the buffers are dedicated to specific input ports and only data-flows using the corresponding input port can exploit the buffering resources. Buffer under-utilization leads to significant performance degradation [4, 6]. This is because the total network load is carried by only a portion of the total buffers in the network. Therefore, router architectures capable of better utilizing router buffers for performance gains and energy efficiency are necessary [4].
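The effect of dedicating buffers to input ports can be illustrated with a deliberately simple toy model. The sketch below is not the R-NoC router or any model from the cited works: the function names, the burst sizes and the buffer depth are all hypothetical, and the fully shared pool is an idealization used only for contrast. It counts how many flits of a single burst find a free slot when each port owns its buffers, versus when the same number of slots is available to any port.

```python
from dataclasses import dataclass


@dataclass
class BurstResult:
    accepted: int       # flits that found a free buffer slot
    stalled: int        # flits left waiting upstream
    utilization: float  # fraction of all buffer slots holding a flit


def dedicated_buffers(arrivals, depth):
    """Toy model: each input port owns `depth` slots that no other port may use."""
    total_slots = depth * len(arrivals)
    accepted = sum(min(a, depth) for a in arrivals)
    return BurstResult(accepted, sum(arrivals) - accepted, accepted / total_slots)


def shared_buffers(arrivals, depth):
    """Toy model: same total slot count, but any port may use any free slot."""
    total_slots = depth * len(arrivals)
    accepted = min(sum(arrivals), total_slots)
    return BurstResult(accepted, sum(arrivals) - accepted, accepted / total_slots)


if __name__ == "__main__":
    # Hypothetical non-uniform burst: one hot input port, three nearly idle ones.
    burst = [12, 1, 1, 2]
    print("dedicated:", dedicated_buffers(burst, depth=4))
    print("shared:   ", shared_buffers(burst, depth=4))
```

With the hypothetical burst above, the hot port fills its own four slots while most slots of the other ports stay empty, so only part of the total buffering carries the load. Under uniform traffic (for example four flits per port) the two functions return the same result, which matches the observation that under-utilization is tied to non-uniform traffic.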



Figure 1: Mesh network with input-buffered router architecture.

In order to improve buffer resource utilization, we have presented the *Roundabout-NoC* (*R-NoC*) concept [7, 8]. The *R-NoC* router is inspired by real-life multi-lane traffic roundabouts and consists of multiple lanes shared by multiple input/output ports. This paper extends our initial work on the asynchronous *Roundabout-NoC*, named *R-NoC-A*, by providing a deeper analysis of the impact of resource utilization on performance and power consumption. The router relies heavily on effective handshaking between the different buffer stages assembled in lanes.
