Article ID: 486690
Journal: Procedia Computer Science
Published Year: 2012
Pages: 10
File Type: PDF
Abstract

As processing core counts increase on modern computing platforms, main memory accesses become a considerable execution bottleneck, leading to poor scalability in multithreaded applications. Even when the memory is physically divided into separate banks, each associated with a set of cores (the so-called nonuniform memory access, or NUMA, architecture), the access time to shared data structures may be detrimental to scalability. Hence, it is imperative to carefully map large shared arrays to specific memory banks based on the nature of the computation and the characteristics of the multithreaded parallelism. This paper describes memory-pinning strategies pertinent to the sparse matrix-vector multiplication and vector orthogonalization phases of an ab initio nuclear structure computation performed by the MFDn package. Several nuclei and nuclear interactions were considered in large-scale test cases, with the dimensions of the sparse symmetric matrices ranging from 32 million to 320 million. Performance gains of up to 25% were observed with the proposed strategies as compared to the default memory placement policy.
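
The abstract does not spell out the pinning strategies themselves, so the following is only a minimal sketch of the general idea of tying array pages to the cores that use them. It assumes a Linux-style first-touch policy (a page is placed on the NUMA node of the thread that first writes it) and OpenMP threads bound to cores. The type csr_t and the functions alloc_first_touch and spmv_csr are hypothetical names chosen for illustration; they are not taken from MFDn, and the sketch uses plain CSR storage rather than the symmetric storage the paper's matrices would allow.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical CSR matrix container (illustrative, not MFDn's layout). */
typedef struct {
    int n;          /* number of rows */
    int *row_ptr;   /* n + 1 offsets into col and val */
    int *col;       /* column index of each stored entry */
    double *val;    /* value of each stored entry */
} csr_t;

/* Allocate a vector of n > 0 doubles and zero it in parallel.
 * Assumption: under a first-touch policy, each page ends up on the
 * NUMA node of the thread that writes it here. The static schedule
 * matches the one used in spmv_csr, so a thread later updates the
 * same index range it touched first. */
double *alloc_first_touch(int n) {
    double *v = malloc((size_t)n * sizeof *v);
    assert(v != NULL);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        v[i] = 0.0;
    }
    return v;
}

/* y = A * x with rows divided statically among threads.
 * Writes to y stay local if y came from alloc_first_touch;
 * reads of x may still cross NUMA nodes. */
void spmv_csr(const csr_t *A, const double *x, double *y) {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < A->n; i++) {
        double sum = 0.0;
        for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; k++) {
            sum += A->val[k] * x[A->col[k]];
        }
        y[i] = sum;
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the code runs serially with the same result. The placement effect only appears when threads are bound to cores (for example through the standard OMP_PROC_BIND and OMP_PLACES environment variables) and the arrays span many pages.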

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)