Efficient Shared-array Accesses in Ab Initio Nuclear Structure Calculations on Multicore Architectures

Article ID	Journal	Published Year	Pages	File Type
486690	Procedia Computer Science	2012	10 Pages	PDF

Abstract

As processing core counts increase on modern computing platforms, main memory accesses become a considerable execution bottleneck, leading to poor scalability in multithreaded applications. Even when the memory is physically divided into separate banks, each associated with a set of cores, as in the so-called nonuniform memory access (NUMA) architecture, the access time to shared data structures may be detrimental to scalability. Hence, it is imperative to carefully map large shared arrays to specific memory banks based on the nature of the computation and the characteristics of the multithreaded parallelism. This paper describes memory-pinning strategies pertinent to the sparse matrix-vector multiplication and vector orthogonalization phases of an ab initio nuclear structure computation performed by the MFDn package. Several nuclei and nuclear interactions were considered in the large-scale test cases, with the dimensions of the sparse symmetric matrices ranging from 32 million to 320 million. Performance gains of up to 25% were observed with the proposed strategies compared to the default memory placement policy.
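
To make the idea of mapping a shared array to specific memory banks concrete, the following is a minimal sketch in C with OpenMP. It is not the MFDn implementation and not the pinning strategy proposed in the paper. It assumes an operating system that places each memory page on the NUMA node of the thread that first writes to it (a "first-touch" policy), and it assumes the matrix is held in a generic compressed sparse row (CSR) layout. The function names `alloc_first_touch` and `spmv_csr` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate a vector of n doubles and zero it in parallel.
 * Assumption: under a first-touch policy, each page ends up on the NUMA
 * node of the thread that writes it first. Using the same static schedule
 * here and in spmv_csr() means a thread initializes roughly the same index
 * range it later writes during the multiplication. Placement is per page,
 * so the match is only approximate at block boundaries. */
double *alloc_first_touch(size_t n)
{
    double *v = malloc(n * sizeof *v);
    if (v == NULL)
        return NULL;

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        v[i] = 0.0;

    return v;
}

/* Compute y = A * x for an n-row matrix A stored in CSR form:
 * row_ptr has n + 1 entries, col_idx and val have row_ptr[n] entries. */
void spmv_csr(size_t n,
              const size_t *row_ptr,
              const size_t *col_idx,
              const double *val,
              const double *x,
              double *y)
{
    assert(row_ptr != NULL && col_idx != NULL && val != NULL);
    assert(x != NULL && y != NULL);

    /* Same static schedule as the initialization loop above. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the code runs serially with the same result. In this sketch only the output vector `y` benefits directly from the matching schedules; the input vector `x` is read at arbitrary column indices, which is one reason placement of shared arrays needs to be chosen with the access pattern of each phase in mind.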