Article code: 524679
Journal code: 868824
Year of publication: 2011
English article: 14 pages
Full-text version: PDF
English title of the ISI article
A flexible Patch-based lattice Boltzmann parallelization approach for heterogeneous GPU–CPU clusters
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract

Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. We address this issue in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. Our multi-GPU implementation uses a block-structured MPI parallelization and is suitable for load balancing and heterogeneous computations on CPUs and GPUs. The overhead required for multi-GPU simulations is discussed in detail. It is demonstrated that a large fraction of the kernel performance can be sustained for weak scaling on InfiniBand clusters, leading to excellent parallel efficiency. However, in strong scaling scenarios, using multiple GPUs is much less efficient than running CPU-only simulations on IBM BG/P and x86-based clusters. Hence, a cost analysis must determine the best course of action for a particular simulation task and hardware configuration. Finally, we present weak scaling results of heterogeneous simulations conducted on CPUs and GPUs simultaneously, using clusters equipped with varying node configurations.


► We investigate the performance and scaling behavior of an LBM solver on GPU–CPU clusters.
► Performance estimates for GPUs and CPUs are derived from hardware models.
► Strong scaling multi-GPU experiments show significant communication overhead.
► Load-balanced heterogeneous simulations on GPUs and CPUs can increase performance.
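
The idea behind load-balanced heterogeneous runs can be illustrated with a small sketch: if the domain is divided into equally sized patches, each process can be given a number of patches roughly proportional to its sustained throughput. The function name `distribute_patches`, the throughput figures, and the largest-remainder rounding are assumptions made for illustration only; the listing above does not state how the paper assigns patches.

```python
def distribute_patches(num_patches, throughputs):
    """Split equally sized patches among workers in proportion to throughput.

    Hypothetical helper: `throughputs` holds one relative performance
    figure per worker (e.g. measured lattice updates per second).
    """
    total = sum(throughputs)
    # Ideal (fractional) share of patches for each worker.
    ideal = [num_patches * t / total for t in throughputs]
    # Start from the integer part of each share.
    counts = [int(share) for share in ideal]
    leftover = num_patches - sum(counts)
    # Give the remaining patches to the largest fractional remainders.
    order = sorted(range(len(ideal)),
                   key=lambda i: ideal[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Assumed example: one GPU process three times faster than one CPU process.
    print(distribute_patches(7, [3.0, 1.0]))
```

With the assumed 3:1 throughput ratio, the faster process receives most of the patches, so both processes finish a time step at roughly the same moment and neither waits long for the other.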

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 37, Issue 9, September 2011, Pages 536–549