Stacked-Bloch-wave electron diffraction simulations using GPU acceleration

Article ID	Journal	Published Year	Pages	File Type
1677433	Ultramicroscopy	2014	6 Pages	PDF

Abstract

• Bloch-wave and stacked-Bloch-wave calculations can be accelerated with GPUs.
• Direct approximation of the matrix exponential can be faster than diagonalization.
• GPU-based direct approximation can be ≈70× faster than CPU diagonalization.
• Larger matrices benefit more from this approach than smaller ones.
• Stacked-Bloch-wave scattering results are functionally identical to diagonalization.

In this paper, we discuss the advantages of performing Bloch-wave simulations on graphics processing units (GPUs) by approximating the matrix exponential directly instead of performing a matrix diagonalization. Our direct matrix-exponential algorithm yields an electron scattering matrix that is functionally identical to the one generated with matrix diagonalization. Using the scaling-and-squaring method with a Padé approximation, direct GPU-based double-precision matrix-exponential calculations are up to 20× faster than the equivalent CPU-based calculations and up to approximately 70× faster than matrix diagonalization. We compare the precision and runtime of scaling-and-squaring methods that use either the Padé approximation or a Taylor expansion. We also discuss the stacked-Bloch-wave method, and show that our stacked-Bloch-wave implementation yields the same electron scattering matrix as traditional Bloch-wave matrix diagonalization.
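The comparison described in the abstract can be illustrated with a small CPU-only sketch. It is not the authors' GPU implementation: it assumes a random Hermitian matrix `A` standing in for the structure matrix (no absorption), a hypothetical thickness factor `t`, and a scattering matrix of the form exp(i t A). The function names `expm_taylor_ss` and `expm_diag` are made up for this illustration, and only the Taylor variant of scaling and squaring is shown. Stacking is modeled here simply as the product of per-slab scattering matrices, which is also an assumption.

```python
import numpy as np

def expm_taylor_ss(M, order=12):
    """Approximate exp(M) by scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    # Pick s so that the scaled matrix M / 2**s has 1-norm of at most 0.5.
    s = int(np.ceil(np.log2(norm / 0.5))) if norm > 0.5 else 0
    X = M / (2 ** s)
    n = M.shape[0]
    result = np.eye(n, dtype=complex)
    term = np.eye(n, dtype=complex)
    for k in range(1, order + 1):
        term = term @ X / k          # X**k / k!
        result = result + term
    for _ in range(s):               # undo the scaling: exp(M) = exp(X)**(2**s)
        result = result @ result
    return result

def expm_diag(A, t):
    """exp(i t A) for Hermitian A via eigendecomposition (the reference route)."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(1j * t * w)) @ V.conj().T

# Hypothetical test matrix: random Hermitian, 64 beams.
rng = np.random.default_rng(0)
n = 64
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
t = 0.3

S_direct = expm_taylor_ss(1j * t * A)
S_diag = expm_diag(A, t)
max_diff = np.max(np.abs(S_direct - S_diag))

# Two stacked slabs of thickness t/2 each, compared with one slab of thickness t.
S_half = expm_taylor_ss(1j * (t / 2) * A)
S_stacked = S_half @ S_half
stack_diff = np.max(np.abs(S_stacked - S_diag))

print(f"direct vs diagonalization: {max_diff:.2e}")
print(f"stacked vs diagonalization: {stack_diff:.2e}")
```

The direct route uses only matrix products, which is the kind of operation that maps well to GPUs; the eigendecomposition route does not share that property to the same degree. A Padé approximant would replace the Taylor loop with a rational function and one linear solve.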

Keywords

GPU programming; Electron diffraction; Parallel processing