Article code | Journal code | Publication year | English article | Full text |
---|---|---|---|---|
461216 | 1364717 | 2016 | 10-page PDF | Free download |
• Targeting permanent link failures on NoCs, we propose an integrated software-hardware framework to provide reliability. An MPI-like fault-tolerant communication protocol is introduced to detect link failures at runtime and automatically start exploring for a healthy path.
• Our fault-tolerant routing algorithm is based on region flooding, which achieves better fault tolerance with less additional network latency.
• There are no restrictions on the number of faulty links or on their locations.
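The highlights do not spell out how region flooding works, so the following is only a minimal sketch of one plausible reading: a breadth-first flood confined to the bounding box of source and destination on a 2D mesh, with the box widened step by step when no fault-free path is found. The mesh topology, the `margin` parameter, and the function names `region_flood_path` and `explore_path` are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def region_flood_path(src, dst, width, height, faulty_links, margin=0):
    """Flood (BFS) only inside the src/dst bounding box grown by `margin`.

    Nodes are (x, y) tuples on a width x height mesh. `faulty_links` is a set
    of frozensets, each holding the two endpoints of a broken link.
    Returns a list of nodes from src to dst, or None if the region has no path.
    """
    # Hypothetical region definition: bounding box of src and dst plus a margin.
    x_lo = max(0, min(src[0], dst[0]) - margin)
    x_hi = min(width - 1, max(src[0], dst[0]) + margin)
    y_lo = max(0, min(src[1], dst[1]) - margin)
    y_hi = min(height - 1, max(src[1], dst[1]) + margin)

    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent chain back to src to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (x_lo <= nxt[0] <= x_hi and y_lo <= nxt[1] <= y_hi):
                continue  # outside the flooding region
            if nxt in parent or frozenset((node, nxt)) in faulty_links:
                continue  # already visited, or the link is faulty
            parent[nxt] = node
            queue.append(nxt)
    return None


def explore_path(src, dst, width, height, faulty_links):
    """Widen the region until a fault-free path appears or the mesh is covered."""
    for margin in range(max(width, height)):
        path = region_flood_path(src, dst, width, height, faulty_links, margin)
        if path is not None:
            return path
    return None


# Example: a 4x4 mesh with one broken link on the direct row between src and dst.
faults = {frozenset({(1, 0), (2, 0)})}
route = explore_path((0, 0), (3, 0), 4, 4, faults)
```

In this sketch the flood first stays within the single row joining the two nodes, fails because of the broken link, and then succeeds once the region grows by one row. Confining the flood in this way is what would keep the extra traffic below that of a mesh-wide flood, at the price of occasionally missing a path that lies outside the current region.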
Aggressive scaling of CMOS process technology allows the fabrication of highly integrated chips and enables the design of the network-on-chip (NoC). However, it also leads to widespread reliability problems. A reliable NoC system must operate normally even in the presence of many transistor failures. Targeting permanent faults on communication links, we introduce a fault-tolerant MPI-like communication protocol. It detects a link failure when requests go unanswered and automatically starts exploring for a new path. The region flooding algorithm is proposed to search for a fault-free path and reroute packets to avoid system stalls. Experimental results show that our approach significantly reduces latency compared with the basic flooding algorithm. The maximum latency reduction is 25%, observed under the bit complement traffic pattern, and the approach incurs only a 2% loss in fault tolerance.
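For readers unfamiliar with the bit complement traffic pattern mentioned above, the sketch below shows its usual definition in the interconnection-network literature: each node sends to the node whose ID is the bitwise complement of its own. The abstract does not state the node numbering or network size used in the experiments, so the function name `bit_complement_dest` and the 16-node example are assumptions.

```python
def bit_complement_dest(node_id, num_nodes):
    """Return the destination of `node_id` under bit complement traffic.

    Assumes node IDs run from 0 to num_nodes - 1 and that num_nodes is a
    power of two, so that every ID uses the same number of bits.
    """
    if num_nodes <= 0 or num_nodes & (num_nodes - 1) != 0:
        raise ValueError("num_nodes must be a power of two")
    if not 0 <= node_id < num_nodes:
        raise ValueError("node_id out of range")
    # Flip every bit, then mask to the width of the ID space.
    return ~node_id & (num_nodes - 1)


# Example on a hypothetical 16-node network: node 5 (0101) sends to node 10 (1010).
dest = bit_complement_dest(5, 16)
```

With row-major numbering on a square mesh, this pattern sends every packet to the node mirrored through the center of the mesh, so most traffic crosses the middle of the network.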
Journal: Microprocessors and Microsystems - Volume 45, Part A, August 2016, Pages 198–207