Predictor Virtualization: Teaching Old Caches New Tricks

To improve application performance, modern processors rely on prediction-based hardware optimizations such as data prefetching and branch prediction. These optimizations store application metadata in on-chip predictor tables and use it to anticipate and optimize for future application behavior. As application footprints grow, predictor tables must scale for predictors to remain effective. An important challenge in processor design is deciding which hardware optimizations to implement and how many resources to dedicate to each. Traditionally, processor architects take a one-size-fits-all approach to predictor-based hardware optimizations: for each optimization, a fixed portion of the on-chip resources is allocated to predictor storage. This approach often leads to sub-optimal designs in which: 1) resources are wasted on applications that do not benefit from a particular predictor or that require only small predictor tables, or 2) predictors under-perform for applications that need larger predictor tables than can be built under area, latency, and power constraints.

This thesis introduces Predictor Virtualization (PV), a framework that uses the conventional processor memory hierarchy to store the application metadata used by speculative hardware optimizations. Doing so makes it possible to emulate large, more accurate predictor tables, which in turn leads to higher application performance. PV exploits the current trend of unprecedentedly large on-chip secondary caches and allocates, on demand, a small portion of the cache capacity to store predictor metadata, adjusting to each application's need for predictor resources. As a consequence, PV is a pay-as-you-go technique that emulates large predictor tables without increasing the dedicated storage overhead.

To demonstrate the benefits of virtualizing hardware predictors, we present virtualized designs for three different hardware optimizations: a state-of-the-art data prefetcher, conventional branch target buffers, and an object-pointer prefetcher. While each of these predictors exhibits different characteristics that lead to a different virtualized design, virtualization improves the cost-performance trade-off for all three. PV increases the utility of traditional processor caches: in addition to acting as accelerators for slow off-chip memories, on-chip caches are leveraged to increase the effectiveness of predictor-based hardware optimizations.
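The core mechanism the abstract describes, a small dedicated predictor structure backed by a much larger table that lives in the regular memory hierarchy, can be illustrated with a toy model. The C++ sketch below is a hedged illustration only, not the thesis's hardware design: the `VirtualizedPredictor` class, its entry layout, and all sizes are invented for this example, and a plain array stands in for the cache lines that PV would reserve in the L2 on demand.

```cpp
// Minimal sketch of the Predictor Virtualization idea (illustrative only):
// a small dedicated predictor cache backed by a larger "virtualized" table
// that, in real hardware, would occupy on-demand space in the L2 cache.
#include <cstdint>
#include <iostream>
#include <vector>

struct PredictorEntry {
    uint64_t tag = 0;      // identifies which PC this entry describes
    uint64_t target = 0;   // predicted value (e.g., branch target or stride)
    bool valid = false;
};

class VirtualizedPredictor {
public:
    VirtualizedPredictor(size_t cacheEntries, size_t tableEntries)
        : cache_(cacheEntries), table_(tableEntries) {}

    // Look up a prediction for `pc`. A hit in the small dedicated cache is
    // fast; a miss falls back to the large backing table, standing in for
    // the extra latency of fetching predictor metadata from the L2.
    PredictorEntry lookup(uint64_t pc) {
        PredictorEntry& line = cache_[pc % cache_.size()];
        if (line.valid && line.tag == pc) {
            ++fastHits_;
            return line;
        }
        ++slowFills_;                       // would cost an L2 access in hardware
        line = table_[pc % table_.size()];  // install into the dedicated cache
        return line;
    }

    // Record an observed outcome, updating both levels (write-through).
    void update(uint64_t pc, uint64_t target) {
        PredictorEntry e;
        e.tag = pc;
        e.target = target;
        e.valid = true;
        cache_[pc % cache_.size()] = e;
        table_[pc % table_.size()] = e;
    }

    size_t fastHits() const { return fastHits_; }
    size_t slowFills() const { return slowFills_; }

private:
    std::vector<PredictorEntry> cache_;  // small, dedicated on-chip storage
    std::vector<PredictorEntry> table_;  // large table virtualized into cache space
    size_t fastHits_ = 0;
    size_t slowFills_ = 0;
};

int main() {
    // 64 dedicated entries emulating a 4096-entry table (sizes are arbitrary).
    VirtualizedPredictor pred(/*cacheEntries=*/64, /*tableEntries=*/4096);
    for (uint64_t pc = 0; pc < 1000; ++pc) pred.update(pc, pc + 4);
    for (uint64_t pc = 0; pc < 1000; ++pc) pred.lookup(pc);
    std::cout << "fast hits: " << pred.fastHits()
              << ", slow fills: " << pred.slowFills() << '\n';
}
```

The pay-as-you-go property the abstract claims corresponds, in this sketch, to the backing table: an application that fits in the small dedicated cache never touches it, while one with a large predictor footprint transparently spills into, and refills from, the cache-resident storage.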

Bibliographic Details
Main Author: Burcea, Ioana Monica
Other Authors: Moshovos, Andreas
Language: en_ca
Published: 2012
Subjects: predictor virtualization; hardware optimizations; processor caches
Online Access: http://hdl.handle.net/1807/32674