Article code: 523836
Journal code: 868503
Publication year: 2016
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
Exploiting task and data parallelism in ILUPACK’s preconditioned CG solver on NUMA architectures and many-core accelerators
Related topics
Engineering and basic sciences; Computer engineering; Computer science software
English abstract


• Specialized implementations of ILUPACK’s iterative solver for NUMA platforms.
• Specialized implementations of ILUPACK’s iterative solver for many-core accelerators.
• Exploitation of task parallelism via OmpSs runtime (dynamic schedule).
• Exploitation of task parallelism via MPI (static schedule).
• Exploitation of data parallelism for GPUs.

We present specialized implementations of the preconditioned iterative linear system solver in ILUPACK for Non-Uniform Memory Access (NUMA) platforms and many-core hardware co-processors based on the Intel Xeon Phi and graphics accelerators. For the conventional x86 architectures, our approach exploits task parallelism via the OmpSs runtime as well as a message-passing implementation based on MPI, respectively yielding a dynamic and static schedule of the work to the cores, with different numeric semantics to those of the sequential ILUPACK. For the graphics processor we exploit data parallelism by off-loading the computationally expensive kernels to the accelerator while keeping the numeric semantics of the sequential case.
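As a rough illustration of the kind of solver the abstract refers to, the following is a minimal sketch of a preconditioned conjugate gradient (PCG) iteration in pure Python. It is not ILUPACK's implementation: a simple Jacobi (diagonal) preconditioner is swapped in for ILUPACK's multilevel ILU preconditioner, the matrix is stored densely, and the names `matvec`, `dot` and `pcg` are hypothetical helpers chosen for this example. The operations inside the loop (matrix-vector product, preconditioner application, dot products and vector updates) are the kinds of kernels on which a PCG solver spends most of its time.

```python
# Minimal preconditioned conjugate gradient (PCG) sketch in pure Python.
# Assumption: a Jacobi (diagonal) preconditioner stands in for ILUPACK's
# multilevel ILU preconditioner, which is not reproduced here.

def matvec(A, x):
    """Dense matrix-vector product (a sparse solver would use SpMV here)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite matrix A."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi preconditioner
    x = [0.0] * n
    r = list(b)                                    # r = b - A x with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0               # avoid a zero threshold
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:       # relative residual test
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Small symmetric positive definite test system (illustrative data only).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = pcg(A, b)
residual = [bi - axi for bi, axi in zip(b, matvec(A, x))]
```

In a data-parallel setting, each list comprehension above corresponds to an element-wise or reduction kernel. This sketch makes no claim about which kernels the authors actually off-load, or about how the task-parallel OmpSs and MPI variants partition the work.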

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 54, May 2016, Pages 97–107