Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6885906 | Microprocessors and Microsystems | 2018 | 9 Pages |
Abstract
The fused multiply-add (FMA) instruction has been common in RISC processors since 1990. A 3-stage, 8-level pipelined, dual-precision FMA is proposed here that can perform either one double-precision operation (SISD) or two single-precision operations in parallel (SIMD). The 53-bit mantissa multiplier (MM) is optimally segmented by the Karatsuba-Ofman (KO) algorithm so that both modes can be performed. The 6-stage pipelined MM uses only 6 of its 10 multipliers and 13 of its 33 adder/subtractors in SIMD mode. Thus the hardware area of the proposed MM is reduced by 23.82% and the throughput is maintained at 923 M samples/s. The arithmetic units in the data path are shared between the modes by means of four data rearrangement units (DRUs), which rearrange the data systematically at the input, at the outputs of the MM, and at the final output. Though these DRUs bring some hardware overhead, the resulting architecture is modular and uniform for both modes of computation. The proposed FMA has been implemented using the TSMC 1P6M CMOS 130 nm library; it takes 48% less overall area and consumes 49% less power at 308.7 MHz compared to previous results. The area-delay product (ADP) of 0.48 × 10⁻¹⁵ shows that the area optimization achieved by the proposed KO-based MM also keeps the computation time at 3.24 ns.
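To illustrate the idea behind a KO-segmented mantissa multiplier, the following is a minimal software sketch of the generic recursive Karatsuba-Ofman scheme, which replaces four half-width products with three. It is not the segmentation used in the paper: the function name `ko_multiply`, the halving split, and the `base_width` cutoff (standing in for a small hardware multiplier block) are assumptions made for illustration only.

```python
def ko_multiply(x: int, y: int, width: int, base_width: int = 14) -> int:
    """Multiply two non-negative integers of at most `width` bits
    using the recursive Karatsuba-Ofman decomposition."""
    # Below the cutoff, use a direct product (a stand-in for a small
    # hardware multiplier). The floor of 3 guarantees termination.
    if width <= max(base_width, 3):
        return x * y

    half = (width + 1) // 2            # bits in the low segment
    mask = (1 << half) - 1
    x_lo, x_hi = x & mask, x >> half   # x = x_hi * 2^half + x_lo
    y_lo, y_hi = y & mask, y >> half

    # Three sub-products instead of four.
    low = ko_multiply(x_lo, y_lo, half, base_width)
    high = ko_multiply(x_hi, y_hi, width - half, base_width)
    # The segment sums can carry into one extra bit, hence half + 1.
    mid = ko_multiply(x_lo + x_hi, y_lo + y_hi, half + 1, base_width)

    # Cross term: x_lo*y_hi + x_hi*y_lo = mid - low - high.
    cross = mid - low - high
    return (high << (2 * half)) + (cross << half) + low


if __name__ == "__main__":
    # 53-bit double-precision mantissas (hidden leading one included).
    a = (1 << 52) | 0x000F_1234_5678_9
    b = (1 << 52) | 0x000A_BCDE_F012_3
    assert ko_multiply(a, b, 53) == a * b
```

The same routine accepts 24-bit single-precision mantissas by passing `width=24`, which loosely mirrors how a segmented multiplier can serve narrower operands with only a subset of its sub-multipliers.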
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Networks and Communications
Authors
V. Arunachalam, Alex Noel Joseph Raj, Naveen Hampannavar, C.B. Bidul