Article ID: 433036
Journal: Journal of Parallel and Distributed Computing
Published Year: 2013
Pages: 9 pages
File Type: PDF
Abstract

The error-resilient entropy coding (EREC) algorithm is an effective, low-overhead method for combating error propagation in many compression methods that use variable-length coding (VLC). However, the main drawback of the EREC is its high computational complexity. To overcome this disadvantage, a parallel EREC is implemented on a graphics processing unit (GPU) using NVIDIA CUDA technology. The original EREC offers only fine-grained parallelism within each stage, which brings additional communication overhead. To make the parallel EREC efficient, we propose the partitioning EREC (P-EREC) algorithm, which splits the variable-length blocks into groups and codes every group separately using the EREC. Each GPU thread processes one group, which makes the parallelism coarse-grained. In addition, some optimization strategies are discussed in order to obtain higher performance on the GPU. When the variable-length data blocks are divided into 128 groups (256 groups, resp.), experimental results show that the parallel P-EREC achieves a 32× to 123× (54× to 350×, resp.) speedup over the original C code of the EREC compiled with the O2 optimization option. Even higher speedup can be obtained with more groups. Compared to the EREC, the P-EREC not only achieves a good speedup, but also slightly improves the resilience of the VLC bit-stream against burst or random errors.

► We attempt to optimize the performance of the EREC by parallel processing.
► A parallel EREC is implemented on a graphics processing unit (GPU).
► We propose the partitioning EREC (P-EREC) algorithm.
► We implemented the parallel P-EREC on the GPU and applied several optimization techniques.
► The parallel P-EREC gains a 32× to 123× speedup compared with the original C code.
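
The partitioning idea described in the abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' GPU implementation: it tracks only bit counts rather than actual bits, and it assumes equal-size slots, contiguous groups, a linear offset sequence, and a per-group slot size equal to the rounded-up average block length, none of which the abstract specifies. The names `erec_pack` and `p_erec_pack` are hypothetical.

```python
def erec_pack(block_lengths, slot_size):
    """EREC-style placement of N variable-length blocks into N equal slots.

    Works on lengths only. Returns placements[i], a list of (slot, nbits)
    pairs saying where the bits of block i end up.
    """
    n = len(block_lengths)
    if sum(block_lengths) > n * slot_size:
        raise ValueError("slots are too small for the data")
    remaining = list(block_lengths)  # bits of block i not yet placed
    free = [slot_size] * n           # free bits left in slot j
    placements = [[] for _ in range(n)]
    # Stage k: block i looks at slot (i + k) mod n (assumed linear offsets).
    # Within one stage every block looks at a different slot.
    for offset in range(n):
        for i in range(n):
            if remaining[i] == 0:
                continue
            j = (i + offset) % n
            take = min(remaining[i], free[j])
            if take:
                placements[i].append((j, take))
                remaining[i] -= take
                free[j] -= take
    return placements


def p_erec_pack(block_lengths, num_groups):
    """Partition blocks into contiguous groups and pack each group on its own.

    Returns a list of (first_block_index, slot_size, placements) per group.
    Slot indices inside placements are local to the group.
    """
    n = len(block_lengths)
    group_size = -(-n // num_groups)  # ceiling division
    result = []
    # On a GPU each iteration of this loop would map to one thread;
    # here a plain loop stands in for the threads.
    for start in range(0, n, group_size):
        group = block_lengths[start:start + group_size]
        slot_size = -(-sum(group) // len(group))  # assumed: rounded-up mean
        result.append((start, slot_size, erec_pack(group, slot_size)))
    return result


if __name__ == "__main__":
    for start, slot_size, placements in p_erec_pack([5, 1, 3, 7, 2, 6], 2):
        print(start, slot_size, placements)
```

Each call to `erec_pack` reads and writes only its own group's blocks and slots, so the groups share no state. That independence is what allows one thread per group, at the cost of a slot size that may differ from group to group.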
