Article code | Journal code | Publication year | English article | Full text |
---|---|---|---|---|
539630 | 1450237 | 2015 | 11-page PDF | Free download |

• Our BTWC processor tolerates worst-case delays without significant impact on performance thanks to latency-insensitive design (LID) and variable-latency (VL) units.
• A fine-grain LID pipeline interlock handles stalls caused by control and data hazards, late memory access, and VL execution.
• Clock frequency increases by 23% in a 45-nm CMOS technology compared to the worst-case approach.
• Variable-latency penalties occur only in the worst process corner; the common and best-case corners see no performance degradation.
• A lightweight software built-in self-test (BIST) assigns two-cycle execution to critical instructions only when needed.
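
The last highlight can be pictured with a toy model. The sketch below is only an illustration under stated assumptions, not the authors' BIST: the unit names, the delay figures, and the `passes_single_cycle` check are hypothetical stand-ins for running test patterns on real hardware. It shows the decision the highlight describes: a unit is switched to two-cycle execution only if it fails at the target clock period.

```python
from dataclasses import dataclass


@dataclass
class VLUnit:
    """Toy model of a variable-latency unit (all values are hypothetical)."""
    name: str
    critical_delay_ns: float  # assumed post-fabrication delay of the slowest path
    two_cycle: bool = False   # latency mode selected by the self-test


def passes_single_cycle(unit: VLUnit, clock_period_ns: float) -> bool:
    # Stand-in for a real self-test, which would exercise the critical path
    # with test patterns and compare against known-good results.
    return unit.critical_delay_ns <= clock_period_ns


def run_bist(units: list[VLUnit], clock_period_ns: float) -> dict[str, int]:
    """Check each unit individually and return its latency in cycles."""
    for unit in units:
        unit.two_cycle = not passes_single_cycle(unit, clock_period_ns)
    return {u.name: (2 if u.two_cycle else 1) for u in units}


# Hypothetical die in a slow process corner: only the multiplier misses timing.
units = [VLUnit("alu", 0.9), VLUnit("multiplier", 1.15), VLUnit("shifter", 0.8)]
latencies = run_bist(units, clock_period_ns=1.0)
```

In this model a die from a common or best-case corner would pass every check, so all units would stay at one cycle and no latency penalty would arise.
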
Process-parameter variability in nanometer CMOS circuits causes the standard worst-case design methodology to waste much of the benefit of scaling. Common-case design, however, is a risky alternative because it sacrifices much of the design yield. Better-than-worst-case (BTWC) design reconciles performance and yield. In this paper we present a BTWC RISC processor that tolerates the worst-case extra delays of critical paths without significant impact on overall performance. We obtain this result by coupling latency-insensitive design (LID) with variable-latency (VL) units. A software built-in self-test (BIST) checks each VL unit individually to determine whether to activate it. Compared to a worst-case approach, the RISC clock frequency increases by 23% in a 45 nm CMOS technology. The impact of VL on instructions per cycle is very limited and confined to the worst process case, as we show through a set of benchmarks.
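
To put the 23% figure in context, a rough back-of-the-envelope check may help. The sketch below assumes the usual first-order model in which throughput is proportional to instructions per cycle times clock frequency; the function names are illustrative, and the only number taken from the abstract is the 23% frequency gain. It computes how much IPC could be lost before the frequency gain is cancelled out.

```python
FREQ_GAIN = 0.23  # clock-frequency increase reported in the abstract


def relative_throughput(ipc_ratio: float, freq_gain: float = FREQ_GAIN) -> float:
    """Throughput relative to the worst-case design, assuming
    throughput is proportional to IPC times clock frequency."""
    return ipc_ratio * (1.0 + freq_gain)


def break_even_ipc_loss(freq_gain: float = FREQ_GAIN) -> float:
    """Fractional IPC loss at which the frequency gain is exactly cancelled."""
    return 1.0 - 1.0 / (1.0 + freq_gain)


loss = break_even_ipc_loss()
print(f"Break-even IPC loss: {loss:.1%}")
```

Under this simple model the break-even point is an IPC loss of roughly 18.7%, so any smaller variable-latency penalty still leaves a net speedup over the worst-case design.
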
Journal: Integration, the VLSI Journal - Volume 48, January 2015, Pages 72–82