Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
488162 | Procedia Computer Science | 2011 | 10 Pages | |
Sparse scientific codes face severe performance challenges as memory bandwidth limitations grow on multi-core architectures. We investigate the memory behavior of a key sparse scientific kernel and study model-driven performance evaluation in this setting. We propose the Coupled Reuse-Cache Model (CRC Model) to enable multilevel cache performance analysis of parallel sparse codes. Our approach builds separate probabilistic models of the application and the hardware, which are then coupled to provide insight into software-hardware interactions in the cache hierarchy. We evaluate the model's predictive accuracy on the widely used sparse matrix-vector product kernel, using 1 to 16 cores and multiple cache configurations. For multi-core setups, average L1 and L2 prediction errors are within 3% and 6%, respectively.
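To make the kernel and the notion of reuse concrete, the following is a minimal sketch and not the CRC Model itself. It assumes a CSR storage layout, a single core, a fully associative LRU cache, and element-level granularity (cache lines are ignored). It computes exact reuse distances from an access trace rather than the probabilistic distributions the abstract describes. The function names `spmv_csr`, `reuse_distances`, and `lru_hit_rate` are hypothetical and chosen for illustration only.

```python
from collections import OrderedDict


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # The access x[col_idx[k]] is irregular: it depends on the
            # sparsity pattern, which is what makes caching hard to predict.
            y[i] += values[k] * x[col_idx[k]]
    return y


def reuse_distances(trace):
    """Return the LRU stack distance of each access (None on first touch)."""
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Count the distinct addresses touched since the last use of addr.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_hit_rate(distances, capacity):
    """Fraction of accesses that hit in an LRU cache holding `capacity` elements."""
    hits = sum(1 for d in distances if d is not None and d < capacity)
    return hits / len(distances)


# Illustrative 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]

y = spmv_csr(row_ptr, col_idx, values, x)
# Accesses to x follow col_idx, so col_idx serves as the trace for x.
dists = reuse_distances(col_idx)
```

In this sketch the reuse distances play the role of an application-side description and the `capacity` argument stands in for a hardware-side one. Combining the two to obtain a hit rate mirrors, in a much simplified deterministic form, the idea of coupling separate application and hardware models.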