Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
523987 | 868538 | 2011 | 20-page PDF | Free download |

We investigate the efficient iterative solution of large-scale sparse linear systems on shared-memory multiprocessors. Our parallel approach is based on a multilevel ILU preconditioner that preserves the mathematical semantics of the sequential method in ILUPACK. We exploit the task parallelism exposed by the task tree that corresponds to the nested dissection hierarchy, schedule tasks to processors dynamically to improve load balance, and formulate all stages of the parallel preconditioned conjugate gradient (PCG) method consistently with the computation of the preconditioner to increase data reuse. Results on a CC-NUMA platform with 16 processors demonstrate the parallel efficiency of this solution.
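To make the role of the preconditioner in the PCG iteration concrete, the following is a minimal sequential sketch in plain Python. It is not the ILUPACK implementation: the multilevel ILU preconditioner is replaced here by a simple Jacobi (diagonal) preconditioner as a stand-in, and the names `pcg`, `jacobi`, `matvec`, and `dot` are hypothetical helpers introduced only for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(A: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product (a real solver would use a sparse format)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def pcg(A: Matrix, b: Vector,
        apply_preconditioner: Callable[[Vector], Vector],
        tol: float = 1e-10, max_iter: int = 1000) -> Tuple[Vector, int]:
    """Preconditioned conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    z = apply_preconditioner(r)      # z approximates A^{-1} r
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def jacobi(A: Matrix) -> Callable[[Vector], Vector]:
    """Stand-in preconditioner (assumption): divide by the diagonal of A."""
    diag = [A[i][i] for i in range(len(A))]
    return lambda r: [ri / di for ri, di in zip(r, diag)]


# Small SPD example system, chosen only for illustration.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = pcg(A, b, jacobi(A))
```

The solver touches the preconditioner only through `apply_preconditioner`, so any operator that approximates the inverse of `A` can be passed in without changing the iteration itself.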
Research highlights
► Parallel algorithm for multilevel ILU preconditioning on shared-memory machines.
► Approach based on interleaving nested dissection and preconditioner hierarchies.
► Dynamic scheduling of the exposed task-tree parallelism to improve load-balancing.
► Convergence rate of the preconditioned iteration depends only mildly on the number of processors.
► Remarkable parallel speedups for regular and irregular SPD sparse linear systems.
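The task-tree parallelism and dynamic scheduling named in the highlights can be illustrated with a small Python sketch. It assumes a tree in which a node may run only after all of its children have finished, as in a nested dissection hierarchy where leaves are independent subdomains and inner nodes are separators. The `Task` class, `run_task_tree`, and the placeholder `work` callback are hypothetical names, and the thread pool stands in for whatever runtime the authors actually use.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Task:
    name: str
    children: List["Task"] = field(default_factory=list)
    parent: Optional["Task"] = None
    pending: int = 0  # number of children that have not finished yet


def link(task: Task) -> None:
    """Set parent pointers and pending-children counters for the whole tree."""
    task.pending = len(task.children)
    for child in task.children:
        child.parent = task
        link(child)


def leaves(task: Task) -> List[Task]:
    if not task.children:
        return [task]
    return [leaf for child in task.children for leaf in leaves(child)]


def run_task_tree(root: Task, work: Callable[[Task], None],
                  num_workers: int = 4) -> List[str]:
    """Run every task once; a task becomes ready when its last child finishes."""
    link(root)
    order: List[str] = []
    errors: List[Exception] = []
    lock = threading.Lock()
    done = threading.Event()

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        def execute(task: Task) -> None:
            try:
                work(task)
            except Exception as exc:  # stop waiting if a task fails
                errors.append(exc)
                done.set()
                return
            with lock:
                order.append(task.name)
                parent = task.parent
                if parent is None:    # the root finished, so the tree is done
                    done.set()
                    return
                parent.pending -= 1
                ready = parent.pending == 0
            if ready:
                # Dynamic scheduling: whichever worker is idle picks this up.
                pool.submit(execute, parent)

        for leaf in leaves(root):
            pool.submit(execute, leaf)
        done.wait()

    if errors:
        raise errors[0]
    return order


# Illustrative two-level tree: four leaf subdomains, two separators, one root.
root = Task("root", [
    Task("left", [Task("left.0"), Task("left.1")]),
    Task("right", [Task("right.0"), Task("right.1")]),
])
order = run_task_tree(root, work=lambda task: None)
```

Because tasks are not bound to workers in advance, a worker that finishes a cheap leaf early can pick up the next ready task, which is the load-balancing effect that dynamic scheduling aims for when subdomains differ in cost.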
Journal: Parallel Computing - Volume 37, Issue 3, March 2011, Pages 183–202