کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
524342 | 868615 | 2012 | 17 صفحه PDF | دانلود رایگان |

Compressed sensing (CS) is a revolutionary signal acquisition theory, enabling signal acquisition at a rate that is below the Nyquist sampling rate. However, CS signal reconstruction algorithms are computationally expensive. One of the key computation steps in CS algorithms is to iteratively compute a Cholesky decomposition. Modern application acceleration devices, such as FPGAs and GPUs, can accelerate Cholesky decomposition and CS signal reconstruction computation. This paper presents high performance parallel Cholesky decomposition algorithms for GPU and FPGA implementation. For GPUs, an optimized Cholesky decomposition algorithm is developed with high parallelism, reduced data copying, and improved memory access. For FPGAs, a dedicated pipelined hardware architecture for Cholesky decomposition is designed. Only one pipelined triangular linear equation solver is needed for solving Cholesky decomposition and Cholesky decomposition-based linear equation systems. Moreover, CS signal reconstruction algorithms are accelerated on GPUs and FPGAs for fast signal recovery based on our iterative Cholesky decomposition. Results show that the proposed Cholesky decomposition on FPGAs and GPUs are much faster than LAPACK and MAGMA for small matrices. For accelerating CS signal reconstruction algorithms, our FPGA implementation can achieve around 15× speedup and our GPU implementation can achieve about a 38× speedup compared with the CPU using LAPACK and the hybrid CPU/GPU system with MAGMA.
► We develop optimized Cholesky decomposition implementations for compressed sensing.
► Application of Cholesky to compressed sensing enables data reuse that improves speed.
► A deeply pipelined Cholesky core for FPGAs achieves 15× speedup.
► The optimized GPU Cholesky code achieves up to 38× speedup over LAPACK or MAGMA.
Journal: Parallel Computing - Volume 38, Issue 8, August 2012, Pages 421–437