Cache coherence using local knowledge

Hiding memory latency is critical in modern machines. Typically, machines have used cache and addressed the ensuing cache coherence problem with hardware or VM-based strategies that rely on global inter-cache communication. However, global communication limits scalability. "Local knowledge"...

Bibliographic Details
Main Author: Darnell, Ervan
Format: Others
Language: English
Published: 2007
Subjects: Computer Science
Online Access: http://hdl.handle.net/1911/19146
id ndltd-RICE-oai-scholarship.rice.edu-1911-19146
record_format oai_dc
spelling ndltd-RICE-oai-scholarship.rice.edu-1911-19146 2013-10-23T04:08:16Z Cache coherence using local knowledge Darnell, Ervan Computer Science 2007-08-21T01:46:11Z 2007-08-21T01:46:11Z 1997 Thesis Text application/pdf http://hdl.handle.net/1911/19146 eng
collection NDLTD
language English
format Others
sources NDLTD
topic Computer Science
description Hiding memory latency is critical in modern machines. Typically, machines have used cache and addressed the ensuing cache coherence problem with hardware or VM-based strategies that rely on global inter-cache communication. However, global communication limits scalability. "Local knowledge" coherence strategies, which use compile-time information to avoid run-time global communication, offer better scalability, but suffer additional cache misses. We develop a framework for understanding the relation of coherence strategies, previous and newly proposed. Within this framework, it is possible to define, independent of implementation considerations, an "ideal" local strategy with respect to cache hit rate. No local strategy could ever do better. For Fortran programs with readily analyzable subscripts, ideal local strategies achieve the same hit rates as global strategies. We develop three new local coherence strategies, CTV, TS1, and TS′, designed to exploit minimal, aggressive, and reasonable hardware support, respectively. CTV is suitable for machines with no hardware assistance for cache coherence except the bare minimum of an exposed invalidate instruction. TS1 implements the abstract theorems of ideal local coherence as a concrete algorithm. Though it is probably too expensive for a real implementation, TS1 is a vehicle for studying the limits of local coherence. TS′ treats coherence over array sections as a graph coloring problem. So long as there are sufficient colors (realized as bits per cache line), TS′ is an ideal local strategy. We found that four colors are adequate for many programs. When more colors are needed, TS′ degrades gracefully. Its execution overheads are negligible and its hardware implementation costs moderate.
Our data shows that TS′ has better hit rates than the best previous local strategy, time-stamping, for nearly all programs, and thus better expected performance. Our data also shows that TS′ achieves hit rates equal to global strategies for analyzable programs, and nearly so for partially analyzable programs. We indirectly compared the performance of TS′ and a particular VM-style global strategy. TS′ has better expected performance on our test suite. For machines without global coherence hardware, local strategies are an effective approach for an important class of programs.
author Darnell, Ervan
title Cache coherence using local knowledge
publishDate 2007
url http://hdl.handle.net/1911/19146
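
The abstract describes TS′ as treating coherence over array sections as a graph coloring problem with a limited number of colors. The sketch below illustrates only that general idea and is not the thesis's algorithm: the conflict graph, the greedy ordering, the section names, and the use of None for a section that cannot be given a color are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def color_sections(conflicts: Dict[str, Set[str]],
                   num_colors: int) -> Dict[str, Optional[int]]:
    """Greedily color a conflict graph with at most num_colors colors.

    conflicts maps each (hypothetical) array section to the sections it
    must be distinguished from. A section that cannot be colored gets
    None, standing in for some conservative fallback (an assumption).
    """
    coloring: Dict[str, Optional[int]] = {}
    # Visit high-degree sections first; break ties by name for determinism.
    order = sorted(conflicts, key=lambda n: (-len(conflicts[n]), n))
    for node in order:
        # Colors already taken by neighbors that have been colored.
        used = {coloring[nb] for nb in conflicts[node]
                if coloring.get(nb) is not None}
        free = [c for c in range(num_colors) if c not in used]
        coloring[node] = free[0] if free else None
    return coloring


# Hypothetical sections: two halves of array A, plus arrays B and C.
conflicts = {
    "A_lo": {"A_hi", "B"},
    "A_hi": {"A_lo", "B"},
    "B": {"A_lo", "A_hi", "C"},
    "C": {"B"},
}
full = color_sections(conflicts, 4)   # enough colors: every section colored
tight = color_sections(conflicts, 2)  # too few colors: some section left as None
```

With four colors every section in this small example receives a color distinct from its neighbors. With two colors the triangle A_lo, A_hi, B cannot be fully colored, so one section falls back to None while the rest stay colored, which loosely mirrors the graceful degradation the abstract mentions.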