Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
524668 | 868818 | 2011 | 12-page PDF | Free download |

The multifrontal method is an efficient direct method for solving large-scale sparse unsymmetric linear systems. It transforms the factorization of a large sparse matrix into a sequence of factorizations of smaller dense frontal matrices. Some of these dense operations can be accelerated by using a graphics processing unit (GPU). We analyze the unsymmetric multifrontal method from both algorithmic and implementation perspectives to see how a GPU, in particular the NVIDIA Tesla C2070, can be used to accelerate the computations. Our main acceleration strategies include (i) performing BLAS operations on both the CPU and the GPU, (ii) improving the communication efficiency between the CPU and GPU by using page-locked memory, zero-copy memory, and asynchronous memory copies, and (iii) a modified algorithm that reuses memory between different GPU tasks and sets thresholds to determine whether certain tasks should be performed on the GPU. The proposed acceleration strategies are implemented by modifying UMFPACK, an unsymmetric multifrontal linear system solver. Numerical results show that the CPU–GPU hybrid approach can accelerate the unsymmetric multifrontal solver, especially for computationally expensive problems.
► We use a CPU–GPU system to accelerate an unsymmetric multifrontal linear system solver.
► BLAS operations are performed on CPU or GPU adaptively.
► The proposed CPU–GPU communication strategies can enhance data transfer efficiency.
► The proposed GPU memory usage strategies can improve computational performance.
► Numerical experiments on an NVIDIA C2070 GPU confirm the speedups due to these strategies.
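
The adaptive CPU/GPU choice for BLAS operations mentioned above can be illustrated with a minimal sketch. The abstract does not say which quantity the thresholds are applied to, so this example assumes a simple flop-count criterion for a dense update of the form C ← C − A·B. The names `GemmTask`, `choose_device`, `schedule`, and `flop_threshold` are hypothetical and are not taken from UMFPACK or from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GemmTask:
    """Shape of a dense update C <- C - A @ B, with A (m x k) and B (k x n)."""
    m: int
    n: int
    k: int

    @property
    def flops(self) -> int:
        # A multiply-accumulate of these shapes costs about 2*m*n*k flops.
        return 2 * self.m * self.n * self.k


def choose_device(task: GemmTask, flop_threshold: int) -> str:
    """Return "gpu" if the task is large enough to amortize transfers, else "cpu".

    Assumption: a single flop-count threshold decides the device. The actual
    criterion used in the paper is not given in the abstract.
    """
    return "gpu" if task.flops >= flop_threshold else "cpu"


def schedule(tasks, flop_threshold):
    """Pair every task with the device chosen for it, preserving order."""
    return [(task, choose_device(task, flop_threshold)) for task in tasks]


if __name__ == "__main__":
    # Two illustrative frontal-matrix updates: one tiny, one large.
    fronts = [GemmTask(16, 16, 8), GemmTask(2000, 2000, 500)]
    for task, device in schedule(fronts, flop_threshold=10**8):
        print(task, "->", device)
```

With the illustrative threshold of 10^8 flops, the small update stays on the CPU and the large one is sent to the GPU. In practice such a threshold would have to be tuned per machine, since it depends on the relative speed of the CPU, the GPU, and the transfer path between them.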
Journal: Parallel Computing - Volume 37, Issue 12, December 2011, Pages 759–770