Summary: | Master's === National Chiao Tung University === Department of Computer Science and Information Engineering === 87 === It has been estimated that the performance of the fastest available microprocessors is increasing at approximately 50% per year, while the speed of memory systems has been growing at only about 5% to 10% per year. Since all the data needed by the CPU are supplied by memory, memory latency degrades overall CPU performance. Memory access is thus a major bottleneck in high-performance computer systems, and improving the performance of the memory system is very important.
The use of a cache reduces the speed gap between processor and memory, and prefetching into the cache can further hide memory latency. Many papers cover prefetch mechanisms; some are software-controlled, while others are hardware-based. In this thesis, we focus on hardware prefetching.
In this thesis, we propose two branch-directed prefetching techniques, since advanced branch prediction mechanisms are already part of current architectures. The branch predictors in current microprocessors reduce stall time due to instruction fetching and, in general, achieve prediction accuracy as high as 95% on the SPEC benchmarks. Based on the accurate outcome of the branch predictor, we can predict the next memory reference. The simulation results show that our branch-directed prefetcher has high accuracy but low coverage, so we propose three hybrid mechanisms to improve overall performance. According to the simulation results, we obtain a good prefetch design: the average improvement in execution cycles is 11.46%, which is better than the average improvement of 7.26% obtained by simply doubling the cache size. Moreover, our design has lower hardware cost.
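The core idea of branch-directed prefetching can be illustrated with a minimal sketch. This is not the thesis's actual hardware design; the class, table layout, and method names below are illustrative assumptions. A small table maps each branch PC to the memory address previously observed on its taken and not-taken paths; when the branch predictor announces an outcome, the address associated with that outcome is prefetched.

```python
# Hypothetical sketch of a branch-directed prefetcher (names and table
# organization are illustrative, not taken from the thesis). Each branch PC
# maps to the memory address last seen on its taken / not-taken path; the
# predictor's outcome selects which address to prefetch.

class BranchDirectedPrefetcher:
    def __init__(self):
        # branch PC -> {True: addr on taken path, False: addr on not-taken path}
        self.table = {}

    def predict(self, branch_pc, predicted_taken):
        """Return the address to prefetch for this predicted outcome,
        or None if no history has been recorded yet (a coverage miss)."""
        entry = self.table.get(branch_pc)
        if entry is None:
            return None
        return entry.get(predicted_taken)

    def train(self, branch_pc, actual_taken, next_mem_addr):
        """After the branch resolves, record the memory address that
        actually followed this outcome, for use on future encounters."""
        self.table.setdefault(branch_pc, {})[actual_taken] = next_mem_addr


pf = BranchDirectedPrefetcher()
pf.train(0x400100, True, 0x80001000)   # taken path touched 0x80001000
pf.train(0x400100, False, 0x80002000)  # not-taken path touched 0x80002000
assert pf.predict(0x400100, True) == 0x80001000
```

The sketch also shows why such a prefetcher can have high accuracy but low coverage: a prediction is issued only for branches with recorded history, which motivates the hybrid mechanisms proposed above.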
We simulate our design using the SimpleScalar tool set, an execution-driven simulator developed at the University of Wisconsin-Madison. The benchmark programs are a subset of SPEC95.
|