Power, Performance, and Energy Management of Heterogeneous Architectures

abstract: Modern many-core multiprocessor systems-on-chip offer tremendous power and performance optimization opportunities by tuning thousands of potential voltage, frequency, and core configurations. Applications running on these architectures are becoming increasingly complex. As the basic bui...

Bibliographic Details
Other Authors: Patil, Chetan Arvind (Author)
Format: Dissertation
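Language: English
Published: 2019
Subjects: Computer engineering; Architectures; Energy Management; Heterogeneous; Performance Management; Power Management; Xeon Phi
Online Access: http://hdl.handle.net/2286/R.I.55475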
id ndltd-asu.edu-item-55475
record_format oai_dc
spelling ndltd-asu.edu-item-55475 2020-01-15T03:01:05Z Dissertation/Thesis Patil, Chetan Arvind (Author) Ogras, Umit Y (Advisor) Chakrabarti, Chaitali (Committee member) Shrivastava, Aviral (Committee member) Arizona State University (Publisher) eng 100 pages Masters Thesis Engineering 2019 http://hdl.handle.net/2286/R.I.55475 http://rightsstatements.org/vocab/InC/1.0/
collection NDLTD
language English
format Dissertation
sources NDLTD
topic Computer engineering
Architectures
Energy Management
Heterogeneous
Performance Management
Power Management
Xeon Phi
description abstract: Modern many-core multiprocessor systems-on-chip (SoCs) offer tremendous power and performance optimization opportunities by tuning thousands of potential voltage, frequency, and core configurations. Applications running on these architectures are becoming increasingly complex. As the basic building blocks that make up an application change at runtime, different configurations may become optimal with respect to power, performance, or other metrics. Identifying the optimal configuration at runtime is a daunting task because of the large number of workloads and configurations. Therefore, there is a strong need to evaluate the metrics of interest as a function of the supported configurations. This thesis focuses on two types of modern multiprocessor SoCs: mobile heterogeneous systems and the tile-based Intel Xeon Phi architecture. For mobile heterogeneous systems, this thesis presents a novel methodology that can accurately instrument different types of applications with specific performance monitoring calls. These calls provide a rich set of performance statistics at the basic-block level while the application runs on the target platform. The target architecture used for this work (Odroid XU3) is capable of running at 4940 different frequency and core combinations. With the help of the instrumented applications, a vast amount of characterization data is collected, providing details about performance, power, and CPU state at every instrumented basic block across 19 different types of applications. This data has enabled two runtime schemes. The first work provides a methodology for finding optimal configurations in a heterogeneous architecture using classifiers and demonstrates an average increase of 93%, 81%, and 6% in performance per watt (PPW) compared to the interactive, ondemand, and powersave governors, respectively. The second work, using the same data, presents a novel imitation learning framework for dynamically controlling the type, number, and frequencies of active cores to achieve an average PPW improvement of 109% compared to the default governors. This work also shows how to accurately profile the tile-based Intel Xeon Phi architecture while training different types of neural networks using an open image dataset on a deep learning framework. The data collected allows deep exploratory analysis and shows how different hardware parameters affect the performance of the Xeon Phi. === Dissertation/Thesis === Masters Thesis Engineering 2019
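The figure of 4940 frequency and core combinations quoted in the abstract can be reproduced with a short enumeration. The breakdown below is an assumption and is not stated in this record: it supposes 13 little-cluster frequency levels (200 to 1400 MHz in 100 MHz steps), 19 big-cluster levels (200 to 2000 MHz), 1 to 4 active little cores, and 0 to 4 active big cores. All names are hypothetical, and the thesis itself may partition the space differently.

```python
from itertools import product

# ASSUMPTION: these ranges are illustrative and are not taken from the record.
LITTLE_FREQS_MHZ = range(200, 1401, 100)  # 13 assumed little-cluster levels
BIG_FREQS_MHZ = range(200, 2001, 100)     # 19 assumed big-cluster levels
LITTLE_CORES = range(1, 5)                # assume at least one little core stays on
BIG_CORES = range(0, 5)                   # assume the big cluster may be fully off


def enumerate_configs():
    """Return every (little_mhz, big_mhz, n_little, n_big) combination."""
    return list(product(LITTLE_FREQS_MHZ, BIG_FREQS_MHZ, LITTLE_CORES, BIG_CORES))


configs = enumerate_configs()
print(len(configs))  # 13 * 19 * 4 * 5 = 4940
```

Under these assumptions the product matches the count in the abstract, which illustrates why exhaustive search over configurations at runtime is impractical.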
author2 Patil, Chetan Arvind (Author)
title Power, Performance, and Energy Management of Heterogeneous Architectures
publishDate 2019
url http://hdl.handle.net/2286/R.I.55475