Article code: 756407
Journal code: 1462700
Publication year: 2015
English article: 8 pages (PDF)
English title of the ISI article
Scalable multi-relaxation-time lattice Boltzmann simulations on multi-GPU cluster
Related topics
Engineering and Basic Sciences > Other Engineering Disciplines > Computational Mechanics
English abstract


• A scalable multi-relaxation-time lattice Boltzmann method on multiple graphics processing units is proposed.
• Using on-chip memory yields a three- to six-fold performance increase over the global-memory counterpart.
• Streaming with offset reading performs much better than streaming with offset writing.
• Overlapping communication and computation achieves a 38% performance improvement.
• Three GTX Titans deliver 5000 MLUPS on 192³ grids, while 12 Tesla M2070 GPUs attain half that performance.
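
The difference between offset reading and offset writing in the streaming step can be illustrated with a small CPU sketch. This is not the paper's GPU kernel: the function names, the periodic boundary handling, and the flat-list storage are assumptions made for illustration only. It shows that both variants produce the same streamed populations and differ only in whether the shifted (offset) access falls on the read side or the write side.

```python
from itertools import product

# D3Q19 velocity set: rest vector, 6 axis directions, 12 face diagonals.
# The 8 cube corners (|c|_1 = 3) are excluded.
VELOCITIES = [c for c in product((-1, 0, 1), repeat=3)
              if sum(abs(x) for x in c) <= 2]


def index(x, y, z, n):
    """Flatten (x, y, z) on a periodic n*n*n grid (assumed boundary handling)."""
    return ((x % n) * n + (y % n)) * n + (z % n)


def stream_pull(f, n):
    """Offset reading: read from the upstream node x - c_i, write at x itself."""
    out = [[0.0] * n**3 for _ in VELOCITIES]
    for i, (cx, cy, cz) in enumerate(VELOCITIES):
        for x, y, z in product(range(n), repeat=3):
            out[i][index(x, y, z, n)] = f[i][index(x - cx, y - cy, z - cz, n)]
    return out


def stream_push(f, n):
    """Offset writing: read at x itself, write to the downstream node x + c_i."""
    out = [[0.0] * n**3 for _ in VELOCITIES]
    for i, (cx, cy, cz) in enumerate(VELOCITIES):
        for x, y, z in product(range(n), repeat=3):
            out[i][index(x + cx, y + cy, z + cz, n)] = f[i][index(x, y, z, n)]
    return out


def make_test_field(n):
    """Hypothetical helper: unique value per (direction, node) for checking."""
    return [[float(i * 1000 + k) for k in range(n**3)]
            for i in range(len(VELOCITIES))]
```

Calling `stream_pull(make_test_field(4), 4)` and `stream_push(make_test_field(4), 4)` returns identical lists, so the choice between the two is purely a question of memory access pattern, which is where the reported performance difference on GPUs arises.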

In this paper, the D3Q19 multi-relaxation-time lattice Boltzmann model is adopted to simulate three-dimensional cavity flows using graphics processing units (GPUs). For single-GPU computations, utilizing on-chip memory yields a three- to five-fold speedup over using global memory alone. In addition, streaming with offset reading attains a further two-fold speedup over streaming with offset writing. For Message Passing Interface (MPI) based multi-GPU computations, overlapping communication and computation achieves a 38% improvement and provides an efficient scheme for improving scalability and performance. Numerical experiments show that 12 Tesla™ M2070 GPUs produce around 5500 million lattice updates per second (MLUPS) on a 576³ grid. On the other hand, three GTX Titans deliver 5000 MLUPS on 192³ grids, while 12 Tesla M2070 GPUs attain half that performance.
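
To put the MLUPS figures in perspective, the metric can be computed as lattice nodes times time steps divided by wall-clock time, scaled to millions. The sketch below uses hypothetical function names and reads the abstract's grid sizes as 192³ and 576³; it only does arithmetic on the reported rates and does not reproduce any measurement.

```python
def mlups(nx, ny, nz, steps, seconds):
    """Million lattice updates per second for `steps` sweeps over an nx*ny*nz grid."""
    return nx * ny * nz * steps / seconds / 1e6


def seconds_per_step(nx, ny, nz, rate_mlups):
    """Wall-clock time of one time step implied by a given MLUPS rate."""
    return nx * ny * nz / (rate_mlups * 1e6)


if __name__ == "__main__":
    # 192^3 grid at the reported 5000 MLUPS (three GTX Titans)
    print(seconds_per_step(192, 192, 192, 5000))
    # 576^3 grid at the reported ~5500 MLUPS (12 Tesla M2070)
    print(seconds_per_step(576, 576, 576, 5500))
```

Under these assumptions, a 192³ grid (7,077,888 nodes) at 5000 MLUPS corresponds to roughly 1.4 ms per time step, and a 576³ grid (191,102,976 nodes) at 5500 MLUPS to roughly 35 ms per time step.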

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Fluids - Volume 110, 30 March 2015, Pages 1–8