Article code: 542676
Journal code: 1450233
Publication year: 2016
English article: 13-page PDF
Full-text version: free download
English title of the ISI article
Parallel GMRES solver for fast analysis of large linear dynamic systems on GPU platforms
Related topics
Engineering and Basic Sciences, Computer Engineering, Hardware and Architecture
English abstract


• The paper proposes a novel parallel GMRES-based iterative solver for GPU platforms.
• It relies on two key techniques: a GPU-enabled GMRES solver and a novel sparse matrix–vector multiplication (SpMV) algorithm, segSpMV, implemented on GPUs.
• We also implement a multi-GPU version of the proposed segSpMV algorithm for further speedup.
• The resulting parallel GMRES solver delivers orders-of-magnitude speedup over CPU versions of the iterative solvers and over direct LU factorization solvers.
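
As a point of reference for the SpMV kernel mentioned in the highlights, here is a minimal sequential sketch of sparse matrix–vector multiplication over the standard compressed sparse row (CSR) layout. It is not the segSpMV algorithm, whose details this page does not give. The function name `csr_spmv` and the 3×3 example matrix are assumptions made for illustration only.

```python
def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    indptr[r]..indptr[r + 1] delimits the nonzeros of row r,
    indices holds their column numbers, data holds their values.
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 asymmetric example: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]
y = csr_spmv(indptr, indices, data, [1.0, 2.0, 3.0])
```

Each row's dot product is independent of the others, which is what makes SpMV a natural candidate for GPU parallelization; how the nonzeros are laid out in memory then decides whether neighboring threads read neighboring addresses.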

In this paper, we propose an efficient parallel dynamic linear solver, called GPU-GMRES, for transient analysis of large linear dynamic systems such as large power grid networks. The new method is based on the preconditioned generalized minimal residual (GMRES) iterative method implemented on heterogeneous CPU–GPU platforms. The new solver is very robust and can be applied to power grids with different structures as well as to general analysis problems for large linear dynamic systems with asymmetric matrices. The proposed GPU-GMRES solver adopts the very general and robust incomplete LU based preconditioner. We show that by properly selecting the amount of fill-in in the incomplete LU factors, a good trade-off between GPU efficiency and convergence rate can be achieved for the best overall performance. This tunable feature makes the algorithm adaptable to different problems. The GPU-GMRES solver partitions the major computing tasks of GMRES so as to minimize data traffic between the CPU and the GPUs, which improves the performance of the proposed method. Furthermore, we propose a new fast parallel sparse matrix–vector (SpMV) multiplication algorithm to further accelerate the GPU-GMRES solver. The new algorithm, called segSpMV, achieves fully coalesced memory access, unlike existing approaches. To improve scalability and efficiency, the segSpMV method is extended to multi-GPU platforms, which leads to a faster and more scalable multi-GPU GMRES solver. Experimental results on a set of published IBM benchmark circuits and mesh-structured power grid networks show that the GPU-GMRES solver can deliver orders-of-magnitude speedup over the direct LU solver, UMFPACK. The resulting multi-GPU-GMRES solver can also deliver a 3–12× speedup over the CPU implementation of the same GMRES method on transient analysis.
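
For readers unfamiliar with GMRES, the following is a minimal sequential sketch of the textbook unrestarted method (Arnoldi with modified Gram-Schmidt, plus Givens rotations). It is not the paper's GPU-GMRES: it omits the incomplete LU preconditioner, restarts, and any GPU partitioning. The function name `gmres`, its parameters, and the 3×3 test system are assumptions made for illustration, and the matrix is assumed nonsingular.

```python
import math


def gmres(matvec, b, tol=1e-10, max_iter=50):
    """Unrestarted GMRES from x0 = 0; returns (x, residual_norm)."""
    n = len(b)
    beta = math.sqrt(sum(v * v for v in b))
    if beta == 0.0:
        return [0.0] * n, 0.0
    V = [[v / beta for v in b]]  # orthonormal Krylov basis vectors
    H = []                       # columns of the rotated Hessenberg matrix
    cs, sn = [], []              # Givens rotation coefficients
    g = [beta]                   # rotated right-hand side
    for j in range(min(max_iter, n)):
        w = matvec(V[j])
        h = []
        for i in range(j + 1):   # modified Gram-Schmidt orthogonalization
            hij = sum(a * c for a, c in zip(w, V[i]))
            h.append(hij)
            w = [a - hij * c for a, c in zip(w, V[i])]
        h_next = math.sqrt(sum(a * a for a in w))
        for i in range(j):       # apply earlier rotations to the new column
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        r = math.hypot(h[j], h_next)  # new rotation zeroes out h_next
        c, s = h[j] / r, h_next / r
        cs.append(c)
        sn.append(s)
        h[j] = r
        H.append(h)
        g.append(-s * g[j])
        g[j] = c * g[j]
        if abs(g[j + 1]) <= tol * beta or h_next == 0.0:
            break
        V.append([a / h_next for a in w])
    k = len(H)
    y = [0.0] * k
    for i in range(k - 1, -1, -1):  # back substitution on the triangular system
        acc = g[i] - sum(H[col][i] * y[col] for col in range(i + 1, k))
        y[i] = acc / H[i][i]
    x = [sum(y[col] * V[col][i] for col in range(k)) for i in range(n)]
    return x, abs(g[k])


# Hypothetical asymmetric test system with known solution [1, 2, 3].
A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
x, res = gmres(lambda v: [sum(a * c for a, c in zip(row, v)) for row in A], b)
```

The solver only touches the matrix through `matvec`, so the dominant cost per iteration is one matrix–vector product, which is consistent with the abstract's focus on accelerating SpMV. A left preconditioner would be added by wrapping `matvec` and transforming `b` with an approximate inverse such as the incomplete LU factors.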

Publisher
Database: Elsevier - ScienceDirect
Journal: Integration, the VLSI Journal - Volume 52, January 2016, Pages 10–22
Authors
, , , , , ,