Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6915877 | Computer Methods in Applied Mechanics and Engineering | 2016 | 18 Pages |
Abstract
As computing hardware evolves, increasing core counts mean that memory bandwidth is becoming the deciding factor in attaining peak performance of numerical methods. High-order finite element methods, such as those implemented in the spectral/hp framework Nektar++, are particularly well-suited to this environment. Unlike low-order methods that typically utilise sparse storage, matrices representing high-order operators have greater density and richer structure. In this paper, we show how these qualities can be exploited to increase runtime performance on nodes that comprise a typical high-performance computing system, by amalgamating the action of key operators on multiple elements into a single, memory-efficient block. We investigate different strategies for achieving optimal performance across a range of polynomial orders and element types. As these strategies all depend on external factors such as BLAS implementation and the geometry of interest, we present a technique for automatically selecting the most efficient strategy at runtime.
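The two ideas in the abstract, amalgamating an operator's action over many elements and choosing a strategy at runtime, can be illustrated with a minimal sketch. This is not the Nektar++ API: the function names (`apply_elementwise`, `apply_amalgamated`, `select_fastest`) are hypothetical, the sketch assumes every element shares the same dense operator matrix, and plain Python loops stand in for the BLAS calls a real implementation would use.

```python
import time


def apply_elementwise(op, elements):
    """Apply the operator to each element on its own: one mat-vec per element."""
    return [[sum(a * x for a, x in zip(row, u)) for row in op] for u in elements]


def apply_amalgamated(op, elements):
    """Apply the operator to all elements at once as a single mat-mat product.

    The element coefficient vectors are packed as the columns of one matrix,
    so the operator entries are traversed once for the whole group.
    """
    n_elem = len(elements)
    columns = list(zip(*elements))  # columns[j][e] = coefficient j of element e
    out = []
    for row in op:
        acc = [0.0] * n_elem
        for a, col in zip(row, columns):
            for e in range(n_elem):
                acc[e] += a * col[e]
        out.append(acc)
    # Unpack the result back into one output vector per element.
    return [list(v) for v in zip(*out)]


# Hypothetical registry of candidate strategies.
STRATEGIES = {
    "elementwise": apply_elementwise,
    "amalgamated": apply_amalgamated,
}


def select_fastest(strategies, op, elements, repeats=3):
    """Time every candidate on the actual data and return the fastest one's name."""
    best_name, best_time = None, float("inf")
    for name, fn in strategies.items():
        start = time.perf_counter()
        for _ in range(repeats):
            fn(op, elements)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name
```

A caller would run `select_fastest(STRATEGIES, op, elements)` once during setup and reuse the returned strategy for every subsequent operator evaluation. The winner depends on the machine, the polynomial order (the size of `op`), and the number of elements, which is why the choice is measured rather than fixed in advance.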
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science Applications
Authors
D. Moxey, C.D. Cantwell, R.M. Kirby, S.J. Sherwin