Energy-efficient mechanisms for managing on-chip storage in throughput processors
Modern computer systems are power or energy limited. While the number of transistors per chip continues to increase, classic Dennard voltage scaling has come to an end. Therefore, architects must improve a design's energy efficiency to continue to increase performance at historical rates, whi...
Main Author: | Gebhart, Mark Alan |
---|---|
Format: | Others |
Language: | English |
Published: | 2012 |
Subjects: | Energy efficiency; Multi-threading; Register file organization; Throughput computing |
Online Access: | http://hdl.handle.net/2152/ETD-UT-2012-05-5141 |
id |
ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2012-05-5141 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2012-05-5141; 2015-09-20T17:06:53Z; Energy-efficient mechanisms for managing on-chip storage in throughput processors; Gebhart, Mark Alan; Energy efficiency; Multi-threading; Register file organization; Throughput computing; Modern computer systems are power or energy limited. While the number of transistors per chip continues to increase, classic Dennard voltage scaling has come to an end. Therefore, architects must improve a design's energy efficiency to continue to increase performance at historical rates, while staying within a system's power limit. Throughput processors, which use a large number of threads to tolerate memory latency, have emerged as an energy-efficient platform for achieving high performance on diverse workloads and are found in systems ranging from cell phones to supercomputers. This work focuses on graphics processing units (GPUs), which contain thousands of threads per chip. In this dissertation, I redesign the on-chip storage system of a modern GPU to improve energy efficiency. Modern GPUs contain very large register files that consume between 15%-20% of the processor's dynamic energy. Most values written into the register file are only read a single time, often within a few instructions of being produced. To optimize for these patterns, we explore various designs for register file hierarchies. We study both a hardware-managed register file cache and a software-managed operand register file. We evaluate the energy tradeoffs in varying the number of levels and the capacity of each level in the hierarchy. Our most efficient design reduces register file energy by 54%. Beyond the register file, GPUs also contain on-chip scratchpad memories and caches. Traditional systems have a fixed partitioning between these three structures. Applications have diverse requirements and often a single resource is most critical to performance. We propose to unify the register file, primary data cache, and scratchpad memory into a single structure that is dynamically partitioned on a per-kernel basis to match the application's needs. The techniques proposed in this dissertation improve the utilization of on-chip memory, a scarce resource for systems with a large number of hardware threads. Making more efficient use of on-chip memory both improves performance and reduces energy. Future efficient systems will be achieved by the combination of several such techniques which improve energy efficiency.; text; 2012-07-05T19:06:59Z; 2012-07-05T19:06:59Z; 2012-05; 2012-07-05; May 2012; 2012-07-05T19:07:39Z; thesis; application/pdf; http://hdl.handle.net/2152/ETD-UT-2012-05-5141; 2152/ETD-UT-2012-05-5141; eng |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Energy efficiency; Multi-threading; Register file organization; Throughput computing |
description |
Modern computer systems are power or energy limited. While the number of transistors per chip continues to increase, classic Dennard voltage scaling has come to an end. Therefore, architects must improve a design's energy efficiency to continue to increase performance at historical rates, while staying within a system's power limit. Throughput processors, which use a large number of threads to tolerate memory latency, have emerged as an energy-efficient platform for achieving high performance on diverse workloads and are found in systems ranging from cell phones to supercomputers. This work focuses on graphics processing units (GPUs), which contain thousands of threads per chip.
In this dissertation, I redesign the on-chip storage system of a modern GPU to improve energy efficiency. Modern GPUs contain very large register files that consume between 15% and 20% of the processor's dynamic energy. Most values written into the register file are read only once, often within a few instructions of being produced. To optimize for these patterns, we explore various designs for register file hierarchies. We study both a hardware-managed register file cache and a software-managed operand register file. We evaluate the energy tradeoffs in varying the number of levels and the capacity of each level in the hierarchy. Our most efficient design reduces register file energy by 54%.
Beyond the register file, GPUs also contain on-chip scratchpad memories and caches. Traditional systems have a fixed partitioning between these three structures. Applications have diverse requirements, and often a single resource is most critical to performance. We propose to unify the register file, primary data cache, and scratchpad memory into a single structure that is dynamically partitioned on a per-kernel basis to match the application's needs.
The techniques proposed in this dissertation improve the utilization of on-chip memory, a scarce resource for systems with a large number of hardware threads. Making more efficient use of on-chip memory both improves performance and reduces energy. Future efficient systems will be achieved by combining several such techniques that improve energy efficiency. |
author |
Gebhart, Mark Alan |
title |
Energy-efficient mechanisms for managing on-chip storage in throughput processors |
publishDate |
2012 |
url |
http://hdl.handle.net/2152/ETD-UT-2012-05-5141 |
work_keys_str_mv |
AT gebhartmarkalan energyefficientmechanismsformanagingonchipstorageinthroughputprocessors |
_version_ |
1716822479761047552 |