Article code: 4960991
Journal code: 1446507
Publication year: 2017
English article: 10 pages, PDF
Full-text version: Free download
English title of the ISI article
Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

This paper presents new algorithmic approaches and optimization techniques for LU factorization and matrix inversion of millions of very small matrices using GPUs. These problems appear in many scientific applications including astrophysics and generation of block-Jacobi preconditioners. We show that, for very small problem sizes, design and optimization of GPU kernels require a mindset different from the one usually used when designing LAPACK algorithms for GPUs. Techniques for optimal memory traffic, register blocking, and tunable concurrency are incorporated in our proposed design. We also take advantage of the small matrix sizes to eliminate the intermediate row interchanges in both the factorization and inversion kernels. The proposed GPU kernels achieve performance speedups vs. CUBLAS of up to 6× for the factorization, and 14× for the inversion, using double precision arithmetic on a Pascal P100 GPU.
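
The abstract mentions eliminating intermediate row interchanges during factorization. The snippet below is only a rough CPU-side sketch of one way this can be done, assuming an "implicit pivoting" scheme: rows are never moved during elimination, each row just records the step at which it served as pivot, and the permutation is applied once at the end. It is plain Python rather than a GPU kernel, the function names `lu_implicit_pivot`, `batched_lu`, and `reconstruct` are hypothetical, and nothing here is taken from the authors' implementation.

```python
def lu_implicit_pivot(a):
    """LU with partial pivoting for one small square matrix, with no row
    swaps during elimination (assumed scheme, not the paper's kernel).

    Returns (lu, perm): perm[k] is the original row used as the k-th pivot;
    lu holds the rows gathered in pivot order, with the unit-lower factor L
    below the diagonal and U on and above it, so that P*A = L*U.
    """
    n = len(a)
    work = [row[:] for row in a]   # working copy; rows stay where they are
    step_of_row = [-1] * n         # step at which a row became pivot, -1 if unused
    perm = []
    for k in range(n):
        # Pick the largest entry in column k among rows not yet used as pivot.
        p = max((i for i in range(n) if step_of_row[i] < 0),
                key=lambda i: abs(work[i][k]))
        if work[p][k] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        step_of_row[p] = k
        perm.append(p)
        # Update every remaining row in place; no rows are exchanged.
        for i in range(n):
            if step_of_row[i] < 0:
                work[i][k] /= work[p][k]          # multiplier, stored in L's slot
                m = work[i][k]
                for j in range(k + 1, n):
                    work[i][j] -= m * work[p][j]
    # Apply the whole permutation once, as a single gather at the end.
    lu = [work[p] for p in perm]
    return lu, perm


def batched_lu(matrices):
    """Factorize a batch of independent small matrices one by one."""
    return [lu_implicit_pivot(a) for a in matrices]


def reconstruct(lu):
    """Multiply the packed L and U factors back together (for checking)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out
```

To check a result, `reconstruct(lu)` should match the input matrix with its rows reordered by `perm`, up to rounding. On a GPU, each matrix in the batch would presumably be handled by its own thread or thread group, but that mapping is outside what this sketch shows.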

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 108, 2017, Pages 606-615