Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6874351 | 1441159 | 2018 | 26-page PDF | Free download |
English title of the ISI article
Optimized sparse Cholesky factorization on hybrid multicore architectures
Keywords
Related topics
Engineering and Basic Sciences
Computer Engineering
Computational Theory and Mathematics
Abstract
We present techniques for supernodal sparse Cholesky factorization on a hybrid multicore platform consisting of a multicore CPU and GPU. The techniques are the subtree algorithm, pipelining and multithreading. The subtree algorithm [15] minimizes PCIe transmissions by storing an entire branch of the elimination tree in the GPU memory (the elimination tree is a tree data structure describing the workflow of the factorization), and also reduces the total kernel launch time by launching BLAS kernels in batches. The pipelining technique overlaps the execution of GPU kernels and PCIe data transfers. The multithreading technique [17] creates multiple threads for both the CPU and the GPU, to utilize concurrency of the elimination tree. Our experimental results on a platform consisting of an Intel multicore processor along with an Nvidia GPU indicate a significant improvement in performance and energy over CHOLMOD (SuiteSparse 4.5.3), a sparse algorithm, after these techniques are applied.
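To make the elimination tree mentioned in the abstract concrete, the following is a minimal sketch of the classic elimination-tree construction with path compression for a symmetric sparse matrix. It is not the paper's implementation and involves no GPU code; the function names `elimination_tree` and `subtree_nodes` and the dictionary-based input format are assumptions made for illustration only. The second helper lists the columns of one branch, which is the kind of unit the abstract describes as being kept entirely in GPU memory.

```python
def elimination_tree(n, upper_rows):
    """Return parent[] of the elimination tree of an n-by-n symmetric matrix.

    upper_rows maps a column k to the row indices i < k with A[i, k] != 0
    (hypothetical input format, chosen for this sketch).
    parent[j] == -1 marks a root.
    """
    parent = [-1] * n
    ancestor = [-1] * n  # path-compressed shortcut toward the current root
    for k in range(n):
        for i in upper_rows.get(k, []):
            # Walk from i up to the root of its current tree, re-pointing
            # every visited node at k so that later walks are short.
            while i != -1 and i < k:
                nxt = ancestor[i]
                ancestor[i] = k
                if nxt == -1:
                    parent[i] = k  # i had no parent yet, so k becomes it
                i = nxt
    return parent


def subtree_nodes(parent, root):
    """Return the sorted columns in the subtree rooted at `root`."""
    children = {}
    for node, p in enumerate(parent):
        children.setdefault(p, []).append(node)
    stack, found = [root], []
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children.get(node, []))
    return sorted(found)


# Small 6-by-6 pattern: columns 0 and 1 couple to 2, column 3 to 4,
# and columns 2 and 4 couple to 5.
parent = elimination_tree(6, {2: [0, 1], 4: [3], 5: [2, 4]})
left_branch = subtree_nodes(parent, 2)
right_branch = subtree_nodes(parent, 4)
```

In this example the two branches under the root share no columns, so they can be factorized independently of each other. That independence is the concurrency the abstract's multithreading technique refers to.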
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Computational Science - Volume 26, May 2018, Pages 246-253
Authors
Meng Tang, Mohamed Gadou, Steven Rennich, Timothy A. Davis, Sanjay Ranka