Article ID: 762579
Journal: Computers & Fluids
Published Year: 2012
Pages: 6 Pages
File Type: PDF
Abstract

In this paper, a hybrid programming approach (OpenMP, MPI and CUDA) is used to study the performance of a parallelized Dynamic Discrete Ordinate Method (DDOM) solver [1]. Parallel performance is compared under different scenarios. A hybrid MPI and OpenMP parallelism achieves high parallel efficiency (>90%) on a 64-core CPU cluster without any load-balancing technique. This hybrid parallelism model is then extended to a GPU cluster. Using massively multicore GPUs, the CUDA-accelerated code runs 250 times faster on a single GPU, and over 780 times faster on a quad-GPU cluster, than the identical process running on a single CPU thread. Our results demonstrate that the DDOM solver scales well on both CPU and GPU clusters.
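
The figures quoted above can be related through the usual definitions of speedup and parallel efficiency. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and the derived multi-GPU numbers are rough estimates that treat the reported "250 times" and "over 780 times" as approximate values.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def parallel_efficiency(speedup_value: float, n_workers: int) -> float:
    """Parallel efficiency E = S / N for N workers (cores or GPUs)."""
    return speedup_value / n_workers


# Figures quoted in the abstract, relative to a single CPU thread.
single_gpu_speedup = 250.0
quad_gpu_speedup = 780.0  # reported as "over 780", so treated as a lower bound

# Assumption: scaling from 1 to 4 GPUs is estimated as the ratio of the two figures.
gpu_scaling = quad_gpu_speedup / single_gpu_speedup          # about 3.12
gpu_scaling_efficiency = parallel_efficiency(gpu_scaling, 4)  # about 0.78

# A parallel efficiency above 90% on 64 cores implies a speedup above this value.
min_cpu_speedup = 0.90 * 64  # about 57.6

if __name__ == "__main__":
    print(f"1 -> 4 GPU scaling: {gpu_scaling:.2f}x")
    print(f"Multi-GPU scaling efficiency: {gpu_scaling_efficiency:.0%}")
    print(f"Implied 64-core CPU speedup: > {min_cpu_speedup:.1f}x")
```

Under these assumptions, going from one to four GPUs yields roughly a 3.1-fold gain, or about 78% scaling efficiency, while the >90% efficiency on 64 CPU cores corresponds to a speedup above roughly 57.6.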

Related Topics
Physical Sciences and Engineering > Engineering > Computational Mechanics