Managing resources for high performance and low energy in general-purpose processors

Microarchitectural techniques, such as superscalar instruction issue, Out-Of-Order instruction execution (OOO), Simultaneous Multi-Threading (SMT) and Chip Multi-Processing (CMP), improve processor performance dramatically. However, as processor design becomes more and more complicated, how to manag...

Full description

Bibliographic Details
Main Author: Wang, Huaping
Language:ENG
Published: ScholarWorks@UMass Amherst 2010
Subjects:
Online Access:https://scholarworks.umass.edu/dissertations/AAI3409856
id ndltd-UMASS-oai-scholarworks.umass.edu-dissertations-6063
record_format oai_dc
spelling ndltd-UMASS-oai-scholarworks.umass.edu-dissertations-60632020-12-02T14:37:51Z Managing resources for high performance and low energy in general-purpose processors Wang, Huaping Microarchitectural techniques, such as superscalar instruction issue, Out-Of-Order instruction execution (OOO), Simultaneous Multi-Threading (SMT) and Chip Multi-Processing (CMP), improve processor performance dramatically. However, as processor design becomes more and more complicated, how to manage the abundant processor resources to achieve optimal performance and power consumption of processors becomes increasingly more sophisticated. This dissertation investigates resource usage controlling techniques for general-purpose microprocessors (supporting both single hardware context and multiple hardware contexts) targeting both energy and performance. We address the power-inefficient resource usage issue in single-context processors and propose a Compiler-based Adaptive Fetch Throttling (CAFT) technique which combines the benefits of a hardware-based runtime throttling technique and a software-based static throttling technique providing good energy savings with a low performance loss. Our simulation results show that the proposed technique doubles the energy-delay product (EDP) savings compared to the fixed threshold throttling. We introduce the resource competing problem for SMT processors, which allow multiple threads to simultaneously share processor resources and improve the energy-efficiency indirectly by resource sharing. We present a novel Adaptive Resource Partitioning Algorithm (ARPA) to control the usage and sharing of processor resources in SMT processors. ARPA analyzes the resource usage efficiency of each thread in a time period and assigns more resources to threads which can use them in a more efficient way. Simulation results on a large set of 42 multiprogrammed workloads show that ARPA outperforms the currently best dynamic resource allocation technique, Hill-climbing, by 5.7% with regard to the overall instruction throughput. Considering fairness accorded to each thread, ARPA attains 9.2% improvements over Hill-climbing, using a commonly used fairness metric. We also propose resource adaptation approaches to adaptively control the number of powered-on ROB entries and partition shared resources among threads for both shared-ROB and divided-ROB structures, targeting both high performance and low energy. Our resource adaptation algorithms approaches consider not only the relative resource usage efficiency of each thread like ARPA, but also take into account the real resource usage of threads to identify cases of inefficient resource usage behavior and save energy. Our experimental results show that for an SMT processor with a shared-ROB structure, our resource adaptation approach achieves 16.7% energy savings over ARPA, while the performance loss is negligible across 42 sample workloads. For an SMT processor with a divided-ROB structure, our resource adaptation approach outperforms ARPA by 4.2% in addition to achieving 12.4% energy savings. 2010-01-01T08:00:00Z text https://scholarworks.umass.edu/dissertations/AAI3409856 Doctoral Dissertations Available from Proquest ENG ScholarWorks@UMass Amherst Computer Engineering
collection NDLTD
language ENG
sources NDLTD
topic Computer Engineering
spellingShingle Computer Engineering
Wang, Huaping
Managing resources for high performance and low energy in general-purpose processors
description Microarchitectural techniques, such as superscalar instruction issue, Out-Of-Order instruction execution (OOO), Simultaneous Multi-Threading (SMT) and Chip Multi-Processing (CMP), improve processor performance dramatically. However, as processor design becomes more and more complicated, how to manage the abundant processor resources to achieve optimal performance and power consumption of processors becomes increasingly more sophisticated. This dissertation investigates resource usage controlling techniques for general-purpose microprocessors (supporting both single hardware context and multiple hardware contexts) targeting both energy and performance. We address the power-inefficient resource usage issue in single-context processors and propose a Compiler-based Adaptive Fetch Throttling (CAFT) technique which combines the benefits of a hardware-based runtime throttling technique and a software-based static throttling technique providing good energy savings with a low performance loss. Our simulation results show that the proposed technique doubles the energy-delay product (EDP) savings compared to the fixed threshold throttling. We introduce the resource competing problem for SMT processors, which allow multiple threads to simultaneously share processor resources and improve the energy-efficiency indirectly by resource sharing. We present a novel Adaptive Resource Partitioning Algorithm (ARPA) to control the usage and sharing of processor resources in SMT processors. ARPA analyzes the resource usage efficiency of each thread in a time period and assigns more resources to threads which can use them in a more efficient way. Simulation results on a large set of 42 multiprogrammed workloads show that ARPA outperforms the currently best dynamic resource allocation technique, Hill-climbing, by 5.7% with regard to the overall instruction throughput. Considering fairness accorded to each thread, ARPA attains 9.2% improvements over Hill-climbing, using a commonly used fairness metric. We also propose resource adaptation approaches to adaptively control the number of powered-on ROB entries and partition shared resources among threads for both shared-ROB and divided-ROB structures, targeting both high performance and low energy. Our resource adaptation algorithms approaches consider not only the relative resource usage efficiency of each thread like ARPA, but also take into account the real resource usage of threads to identify cases of inefficient resource usage behavior and save energy. Our experimental results show that for an SMT processor with a shared-ROB structure, our resource adaptation approach achieves 16.7% energy savings over ARPA, while the performance loss is negligible across 42 sample workloads. For an SMT processor with a divided-ROB structure, our resource adaptation approach outperforms ARPA by 4.2% in addition to achieving 12.4% energy savings.
author Wang, Huaping
author_facet Wang, Huaping
author_sort Wang, Huaping
title Managing resources for high performance and low energy in general-purpose processors
title_short Managing resources for high performance and low energy in general-purpose processors
title_full Managing resources for high performance and low energy in general-purpose processors
title_fullStr Managing resources for high performance and low energy in general-purpose processors
title_full_unstemmed Managing resources for high performance and low energy in general-purpose processors
title_sort managing resources for high performance and low energy in general-purpose processors
publisher ScholarWorks@UMass Amherst
publishDate 2010
url https://scholarworks.umass.edu/dissertations/AAI3409856
work_keys_str_mv AT wanghuaping managingresourcesforhighperformanceandlowenergyingeneralpurposeprocessors
_version_ 1719365648174284800