Energy-efficient mechanisms for managing on-chip storage in throughput processors

Modern computer systems are power or energy limited. While the number of transistors per chip continues to increase, classic Dennard voltage scaling has come to an end. Therefore, architects must improve a design's energy efficiency to continue to increase performance at historical rates, whi...

Bibliographic Details
Main Author: Gebhart, Mark Alan
Format: Others
Language: English
Published: 2012
Subjects: Energy efficiency; Multi-threading; Register file organization; Throughput computing
Online Access: http://hdl.handle.net/2152/ETD-UT-2012-05-5141
id ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2012-05-5141
record_format oai_dc
spelling ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2012-05-5141; 2015-09-20T17:06:53Z; text; 2012-07-05T19:06:59Z; 2012-07-05T19:06:59Z; 2012-05; 2012-07-05; May 2012; 2012-07-05T19:07:39Z; thesis; application/pdf; http://hdl.handle.net/2152/ETD-UT-2012-05-5141; 2152/ETD-UT-2012-05-5141; eng
collection NDLTD
language English
format Others
sources NDLTD
topic Energy efficiency
Multi-threading
Register file organization
Throughput computing
description Modern computer systems are power or energy limited. While the number of transistors per chip continues to increase, classic Dennard voltage scaling has come to an end. Therefore, architects must improve a design's energy efficiency to continue to increase performance at historical rates, while staying within a system's power limit. Throughput processors, which use a large number of threads to tolerate memory latency, have emerged as an energy-efficient platform for achieving high performance on diverse workloads and are found in systems ranging from cell phones to supercomputers. This work focuses on graphics processing units (GPUs), which contain thousands of threads per chip. In this dissertation, I redesign the on-chip storage system of a modern GPU to improve energy efficiency. Modern GPUs contain very large register files that consume between 15% and 20% of the processor's dynamic energy. Most values written into the register file are read only once, often within a few instructions of being produced. To optimize for these patterns, we explore various designs for register file hierarchies. We study both a hardware-managed register file cache and a software-managed operand register file. We evaluate the energy tradeoffs in varying the number of levels and the capacity of each level in the hierarchy. Our most efficient design reduces register file energy by 54%. Beyond the register file, GPUs also contain on-chip scratchpad memories and caches. Traditional systems have a fixed partitioning between these three structures. Applications have diverse requirements, and often a single resource is most critical to performance. We propose to unify the register file, primary data cache, and scratchpad memory into a single structure that is dynamically partitioned on a per-kernel basis to match the application's needs. The techniques proposed in this dissertation improve the utilization of on-chip memory, a scarce resource for systems with a large number of hardware threads. Making more efficient use of on-chip memory both improves performance and reduces energy. Future efficient systems will be achieved by combining several such techniques that improve energy efficiency.
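As a rough reading of the figures in the description, the sketch below estimates how much of the processor's dynamic energy a 54% register file saving would represent. It assumes, purely for illustration, that the 54% reduction applies to the same 15% to 20% share quoted above and that the energy of everything outside the register file is unchanged; the record itself does not state a whole-processor figure, and the function name is hypothetical.

```python
def processor_energy_saving(rf_share, rf_reduction):
    """Fraction of processor dynamic energy saved.

    rf_share:     fraction of processor dynamic energy spent in the register file
    rf_reduction: fraction by which register file energy is reduced
    Assumption (not from the record): all other components are unaffected.
    """
    return rf_share * rf_reduction


RF_REDUCTION = 0.54  # "reduces register file energy by 54%"

# The description gives the register file share as a range of 15% to 20%.
for rf_share in (0.15, 0.20):
    saving = processor_energy_saving(rf_share, RF_REDUCTION)
    print(f"register file share {rf_share:.0%}: "
          f"about {saving:.1%} of processor dynamic energy saved")
```

Under these assumptions the saving works out to roughly 8% to 11% of processor dynamic energy.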
author Gebhart, Mark Alan
title Energy-efficient mechanisms for managing on-chip storage in throughput processors
publishDate 2012
url http://hdl.handle.net/2152/ETD-UT-2012-05-5141