Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
468240 | Computers & Mathematics with Applications | 2013 | 12 Pages |
Several possibilities exist to implement the propagation step of lattice Boltzmann methods. This paper describes common implementations and compares the number of memory transfer operations they require per lattice node update. A performance model based on memory bandwidth is then used to estimate the maximum achievable performance on different machines. A subset of the discussed implementations is benchmarked on different Intel- and AMD-based compute nodes using the framework of an existing flow solver that is specially adapted to simulate flow in porous media, and the model is validated against the measurements. Advanced approaches for the propagation step, such as the “A–A pattern” or “Esoteric Twist”, require more programming effort but often sustain significantly better performance than non-naïve but straightforward implementations.
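
A bandwidth-based performance model of the kind the abstract mentions can be sketched in a few lines: the upper bound on node updates per second is the sustained memory bandwidth divided by the bytes moved per lattice node update. The sketch below is illustrative only and is not taken from the paper. The D3Q19 stencil, double precision, and the 40 GB/s bandwidth are assumed values, and the function names are hypothetical. It also assumes that a two-grid scheme pays an extra write-allocate transfer per stored value, that in-place schemes such as the A–A pattern do not, and it ignores any extra traffic from indirect addressing.

```python
def bytes_per_node_update(q, bytes_per_value, write_allocate):
    """Memory traffic per lattice node update for a bandwidth-bound kernel.

    Each update loads q distribution values and stores q values. With a
    write-allocate cache policy, every stored value is first read into
    cache, which adds another q values of traffic.
    """
    transfers = 2 * q + (q if write_allocate else 0)
    return transfers * bytes_per_value


def max_mlups(bandwidth_gb_per_s, bytes_per_update):
    """Upper bound in million lattice node updates per second (MLUPS)."""
    return bandwidth_gb_per_s * 1e9 / bytes_per_update / 1e6


# Assumed values for illustration, not measurements from the paper.
Q = 19              # D3Q19 stencil (assumption)
DOUBLE = 8          # bytes per double-precision value
BANDWIDTH = 40.0    # hypothetical sustained bandwidth in GB/s

# Two-grid scheme: separate source and destination lattices.
two_grid = bytes_per_node_update(Q, DOUBLE, write_allocate=True)
# In-place scheme: values are written back to locations just read.
in_place = bytes_per_node_update(Q, DOUBLE, write_allocate=False)

print(f"two-grid: {two_grid} B/update, <= {max_mlups(BANDWIDTH, two_grid):.1f} MLUPS")
print(f"in-place: {in_place} B/update, <= {max_mlups(BANDWIDTH, in_place):.1f} MLUPS")
```

Under these assumptions the in-place scheme moves two thirds of the data per update, so its bandwidth-limited bound is 1.5 times higher. A real estimate would use a measured bandwidth for the target machine rather than a nominal one.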