Summary: | A key challenge in architecting a multicore processor is maintaining cache coherence efficiently. Directory protocols offer a scalable, bandwidth-efficient solution to this problem, but they incur significant area overheads. This dissertation proposes three novel coherence directory designs that maintain coherence in multicore processors while reducing the area and energy overheads of the directory structure.
First, I propose the Phantom directory, which leverages the abundance of storage in large shared caches to reduce the area devoted to a dedicated coherence directory. This approach faces a significant challenge, since an access to the shared cache typically requires more energy than an access to a smaller dedicated directory. Phantom attempts to overcome this challenge by exploiting the spatial locality common to most applications and by utilizing a very small dedicated directory cache, but the costs of accessing the shared cache still outweigh Phantom's area savings.
My second proposed design, the multi-grain directory (MGD), builds upon the simple observation that, at any point in time, large contiguous chunks of memory are often accessed by only a single core. MGD takes advantage of this common application behaviour to reduce the directory size by tracking coherence at multiple granularities. I demonstrate that a practical dual-grain directory (DGD) provides a robust solution, reducing directory area by 41% while maintaining good performance across a variety of workloads.
While MGD provides a practical approach to reducing directory area, my third proposed design, the Tagless directory, takes a more innovative approach to achieving true scalability. Tagless embraces imprecision by embedding sharing information in a number of space-efficient Bloom filters. Careful design yields a directory whose performance is robust and comparable to that of an ideal coherence directory. For a sixteen-core processor, Tagless reduces directory area by up to 70% while reducing cache and directory energy consumption. My analysis also indicates that Tagless continues to provide an area- and energy-efficient directory as processors scale to tens or even hundreds of cores.
These three innovative designs advance the state of the art by providing more area- and energy-efficient coherence directories, allowing multicore processors to scale to tens or hundreds of cores.
|