Article code | Journal code | Publication year | English article | Full-text version
---|---|---|---|---
506805 | 865045 | 2016 | 13-page PDF | Free download
• Several GPU implementations for time-domain wave simulations are compared.
• The numerical schemes are based on a high-order discontinuous finite element method.
• The implementations are profiled using the roofline model to highlight bottlenecks.
• The best implementation depends on the polynomial degree of the basis functions.
Finite element schemes based on discontinuous Galerkin methods possess features amenable to massively parallel computing accelerated with general-purpose graphics processing units (GPUs). However, the computational performance of such schemes strongly depends on their implementation. Several implementation strategies have been proposed in the past: they either rely exclusively on specialized compute kernels tuned for each operation, or they leverage BLAS libraries that provide optimized routines for basic linear algebra operations. In this paper, we present and analyze up-to-date performance results for different implementations, tested in a unified framework on a single NVIDIA GTX980 GPU. We show that specialized kernels written with a one-node-per-thread strategy are competitive for polynomial bases up to the fifth and seventh degrees for acoustic and elastic models, respectively. For higher degrees, a strategy that makes use of the NVIDIA cuBLAS library provides better results, reaching a net arithmetic throughput of 35.7% of the theoretical peak value.
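To illustrate the two strategies the abstract contrasts, here is a minimal NumPy sketch (not the authors' code): in a nodal DG scheme, applying an element-local operator to all elements can be fused into one large matrix-matrix product, which is the operation a cuBLAS-based implementation hands to `gemm`, whereas a specialized kernel assigns work per node/element. The sizes, the differentiation matrix `D`, and the function `nodes_per_element` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def nodes_per_element(N):
    # Illustrative: a degree-N nodal basis on tetrahedra has
    # Np = (N+1)(N+2)(N+3)/6 nodes per element.
    return (N + 1) * (N + 2) * (N + 3) // 6

K = 1000  # number of elements (arbitrary for the sketch)
N = 5     # polynomial degree
Np = nodes_per_element(N)

rng = np.random.default_rng(0)
D = rng.standard_normal((Np, Np))  # element-local operator (e.g. differentiation matrix)
u = rng.standard_normal((Np, K))   # nodal field values, one column per element

# BLAS-style strategy: one big GEMM over all elements at once.
# On a GPU this is the call a cuBLAS implementation would make; larger Np
# (higher polynomial degree) makes the GEMM increasingly efficient.
du_gemm = D @ u

# Kernel-style strategy, shown serially: each element (or node) handled
# independently, as a one-node-per-thread CUDA kernel would.
du_loop = np.empty_like(u)
for e in range(K):
    du_loop[:, e] = D @ u[:, e]

assert np.allclose(du_gemm, du_loop)  # both strategies compute the same result
```

The crossover the paper reports (fifth/seventh degree) reflects this trade-off: small `Np` leaves a GEMM under-utilized, while large `Np` amortizes its overhead.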
Journal: Computers & Geosciences - Volume 91, June 2016, Pages 64–76