Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
455473 | Computers & Electrical Engineering | 2012 | 15 Pages |
Numerous techniques have been proposed in the literature to improve the performance of cache memories by reducing cache conflicts. These techniques were proposed over the past decade, and each proposal independently claimed to reduce conflict misses. However, because the published results used different benchmarks and different experimental setups, they are not easy to compare. In this paper we report a side-by-side comparison of these techniques. We also evaluate the suitability of some of these techniques for caches with higher set associativities. In addition to evaluating the techniques for their impact on cache misses and average memory access times, we evaluate their ability to reduce the non-uniformity of cache accesses.

The conclusion of our work is that each application may benefit from a different technique, and no single scheme works universally well for all applications. We also observe that, for the majority of applications, the XORing (XOR) and Odd-multiplier indexing schemes perform reasonably well. Among programmable associativity techniques, B-cache performs better than column-associative and adaptive caches, but column-associative caches require only minimal extensions to hardware. The B-cache technique improves the uniformity of cache accesses the most, and column-associative caches also improve it.

Based on the observation that different techniques benefit different applications, we explored the use of multiple programmable addressing mechanisms, with each addressing scheme designed for a specific application. We include some preliminary data using multiple addressing schemes.
Graphical abstract

Caches for multicore systems with multiple addressing schemes result in better cache performance.

Highlights

► Cache memories are accessed non-uniformly, causing significantly more conflicts on a few cache lines.
► Changing how caches are addressed and relocating addresses to less-used cache lines improve performance.
► We show a side-by-side comparison of several different techniques.
► The performance gained by these techniques is application dependent.
► Caches for multicore systems with multiple addressing schemes result in better cache performance.
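
To make the two indexing schemes named in the abstract more concrete, the following is a minimal sketch of how XOR and odd-multiplier indexing are commonly described: the conventional set index is combined with bits of the tag so that addresses that would collide under plain modulo indexing spread over different sets. The cache geometry (32-byte lines, 1024 sets), the multiplier value 9, and all function names are assumptions chosen for illustration; they are not taken from the paper, whose exact formulations may differ.

```python
# Illustrative sketch only: geometry, multiplier, and names are assumptions,
# not the configuration evaluated in the paper.

def split_address(addr, line_bits=5, set_bits=10):
    """Split a byte address into (tag, index) for a cache with 2**set_bits sets."""
    block = addr >> line_bits                  # drop the byte-in-line offset
    index = block & ((1 << set_bits) - 1)      # conventional set index
    tag = block >> set_bits                    # remaining high-order bits
    return tag, index

def modulo_index(addr, line_bits=5, set_bits=10):
    """Conventional indexing: use the low-order block-address bits directly."""
    _, index = split_address(addr, line_bits, set_bits)
    return index

def xor_index(addr, line_bits=5, set_bits=10):
    """XOR indexing: XOR the conventional index with the low-order tag bits."""
    tag, index = split_address(addr, line_bits, set_bits)
    return index ^ (tag & ((1 << set_bits) - 1))

def odd_multiplier_index(addr, multiplier=9, line_bits=5, set_bits=10):
    """Odd-multiplier indexing: add an odd multiple of the tag to the index."""
    tag, index = split_address(addr, line_bits, set_bits)
    return (multiplier * tag + index) % (1 << set_bits)

if __name__ == "__main__":
    # Eight addresses spaced exactly one cache size (32 KB) apart.
    addrs = [k << 15 for k in range(8)]
    for name, fn in [("modulo", modulo_index),
                     ("xor", xor_index),
                     ("odd-multiplier", odd_multiplier_index)]:
        print(name, sorted({fn(a) for a in addrs}))
```

With these assumed parameters, the eight strided addresses all map to one set under conventional indexing, while both alternative schemes place them in eight distinct sets. This is the kind of conflict reduction the compared techniques aim for, although, as the abstract notes, the benefit in practice depends on the application.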