Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6916278 | Computer Methods in Applied Mechanics and Engineering | 2016 | 22 Pages |
Abstract
The paper investigates the performance of the finite element numerical integration algorithm for first order approximations on three processor architectures popular in scientific computing: the classical x86_64 CPU, the Intel Xeon Phi and the NVIDIA Kepler GPU. We base the discussion on theoretical performance models and on our own implementations, for which we perform a range of computational experiments. For the latter, we consider a unifying programming model and a portable OpenCL implementation for all architectures. Variations of the algorithm arising from the different problems solved and the different element types are investigated, and several optimizations aimed at properly mapping the algorithm to the computer architectures are demonstrated. The experimental results show varying levels of performance across the architectures, but indicate that the algorithm can be ported effectively to all of them. The conclusions identify the factors that limit performance for different problems and types of approximation, and the performance ranges that can be expected for FEM numerical integration on different processor architectures.
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science Applications
Authors
Krzysztof Banaś, Filip Krużel, Jan Bielański