Summary: | Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 95 === The conventional load/store queue (LSQ) is a content-addressable memory (CAM) structure in which a dynamically scheduled processor holds all in-flight memory instructions and performs fully associative, age-prioritized searches to maintain memory dependencies and forward store data to loads. The LSQ is neither efficient, since previous studies have shown that dependency violations are infrequent, nor scalable, owing to the complexity of the CAM. This paper presents an efficient and scalable alternative to the LSQ, called the set-associative load/store cache (LSC), that replaces the CAM with a set-associative tag array. The change is analogous to replacing a fully associative cache with a set-associative one, since the tag bit cell of a fully associative array is a CAM cell. Because set-associative caches are known to significantly reduce tag comparisons while approximating the miss rates of fully associative caches, the LSC can substantially reduce search bandwidth demand without noticeable performance degradation from stalls caused by set conflicts. Experimental results on the SPECint2000 benchmarks show that both a 32-entry and a 128-entry 4-way set-associative LSC significantly reduce search bandwidth demand with no visible performance penalty, while a 128-entry L0 LSC improves average execution time by 3%.
|
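The contrast between a fully associative search and a set-indexed search can be illustrated with a minimal Python sketch. It is not the design evaluated in the thesis: the class names, the modulo index function, the 4-byte granularity, and the "reject on a full set" stall policy are all assumptions made for illustration, and only store-to-load forwarding is modeled (not violation detection).

```python
from dataclasses import dataclass


@dataclass
class StoreEntry:
    age: int    # program-order sequence number; smaller means older
    addr: int   # byte address of the store
    value: int  # data to forward


def _youngest_older(entries, load_age, addr):
    """Age-prioritized match: youngest store older than the load, same address."""
    best = None
    for e in entries:
        if e.addr == addr and e.age < load_age:
            if best is None or e.age > best.age:
                best = e
    return best


class FullyAssociativeLSQ:
    """Hypothetical CAM-style queue: every search compares against all entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []
        self.comparisons = 0

    def insert(self, entry):
        if len(self.entries) >= self.capacity:
            return False  # queue full: the pipeline would stall
        self.entries.append(entry)
        return True

    def forward(self, load_age, addr):
        self.comparisons += len(self.entries)  # one tag compare per entry
        best = _youngest_older(self.entries, load_age, addr)
        return None if best is None else best.value


class SetAssociativeLSC:
    """Hypothetical set-associative variant: a search touches one set only."""

    def __init__(self, num_sets, ways, granule_bytes=4):
        self.num_sets = num_sets
        self.ways = ways
        self.granule_bytes = granule_bytes  # assumed indexing granularity
        self.sets = [[] for _ in range(num_sets)]
        self.comparisons = 0

    def _index(self, addr):
        # Assumed index function: low-order granule-address bits.
        return (addr // self.granule_bytes) % self.num_sets

    def insert(self, entry):
        target = self.sets[self._index(entry.addr)]
        if len(target) >= self.ways:
            return False  # set conflict: modeled here as a stall
        target.append(entry)
        return True

    def forward(self, load_age, addr):
        target = self.sets[self._index(addr)]
        self.comparisons += len(target)  # at most `ways` tag compares
        best = _youngest_older(target, load_age, addr)
        return None if best is None else best.value


if __name__ == "__main__":
    lsq = FullyAssociativeLSQ(capacity=32)
    lsc = SetAssociativeLSC(num_sets=8, ways=4)  # 32 entries, 4-way
    for i in range(8):
        store = StoreEntry(age=i, addr=4 * i, value=100 + i)
        lsq.insert(store)
        lsc.insert(store)
    print(lsq.forward(load_age=10, addr=8), lsq.comparisons)
    print(lsc.forward(load_age=10, addr=8), lsc.comparisons)
```

Both structures return the same forwarded value, but the set-indexed search compares only the tags in one set, which is the source of the reduced search bandwidth. The cost is that `insert` can fail on a set conflict even when other sets have free ways.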