Article ID | Journal | Published Year | Pages |
---|---|---|---|
4956871 | Microprocessors and Microsystems | 2016 | 12 |
Abstract
Utilizing hardware resources efficiently is vital to building the future generation of high-performance computing systems. The sparse matrix-dense vector multiplication (SpMV) kernel, which is notorious for its poor efficiency on conventional processors, is a key component in many scientific computing applications, and increasing SpMV efficiency can contribute significantly to improving overall system efficiency. The major challenge in implementing SpMV efficiently is handling the input-dependent memory access patterns, and reconfigurable logic is a strong candidate for tackling this problem via memory system customization. In this work, we consider three schemes (all off-chip, all on-chip, caching) for servicing the irregular-access component of SpMV and investigate their effects on accelerator efficiency. To combine the strengths of on-chip and off-chip random accesses, we propose a hardware-software caching scheme named NCVCS that combines software preprocessing with a nonblocking cache to enable highly efficient SpMV accelerators with modest on-chip memory requirements. A comparison of the three schemes, implemented as part of an FPGA SpMV accelerator, shows that our scheme effectively combines the high efficiency of on-chip accesses with the ability of off-chip accesses to handle large matrices.
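To illustrate the input-dependent memory access pattern the abstract refers to, the following is a minimal software sketch of SpMV. It assumes the compressed sparse row (CSR) storage format, which the abstract does not name, and it is not the authors' accelerator or the NCVCS scheme. The function name `spmv_csr` and the small example matrix are hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form (assumed format)."""
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # The nonzeros of a row are stored contiguously, so values and
        # col_idx are read sequentially (regular access).
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is indexed by a column index read from the matrix itself,
            # so the address depends on the input data (irregular access).
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[10, 0, 2],
#  [ 0, 3, 0],
#  [ 1, 0, 4]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [10.0, 2.0, 3.0, 1.0, 4.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
```

In this sketch, the reads of `x[col_idx[k]]` correspond to the irregular-access component that the three schemes (all off-chip, all on-chip, caching) service in different ways.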
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Networks and Communications
Authors
Yaman Umuroglu, Magnus Jahre