Dynamic Binary Vectorization in Enhanced HQEMU
Master's === National Taiwan University === Graduate Institute of Networking and Multimedia === 107 === For decades, compilers have used auto-vectorization techniques to exploit data-level parallelism. However, because processor architectures keep adding features that improve vector/SIMD performance, legacy applica...
Main Authors: | Chih-Min Lin, 林致民 |
---|---|
Other Authors: | 徐慰中 |
Format: | Others |
Language: | en_US |
Published: | 2019 |
Online Access: | http://ndltd.ncl.edu.tw/handle/9mwuvj |
id |
ndltd-TW-107NTU05641027 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-107NTU05641027 2019-11-16T05:28:00Z http://ndltd.ncl.edu.tw/handle/9mwuvj Dynamic Binary Vectorization in Enhanced HQEMU 基於進階HQEMU之動態二進制碼向量化 Chih-Min Lin 林致民 Master's National Taiwan University Graduate Institute of Networking and Multimedia 107 For decades, compilers have used auto-vectorization techniques to exploit data-level parallelism. However, because processor architectures keep adding features that improve vector/SIMD performance, legacy application binaries cannot fully exploit the vector/SIMD capabilities of modern architectures. For example, legacy ARMv7 binaries cannot benefit from the double-precision SIMD capability of ARMv8, and legacy x86 binaries cannot take advantage of the AVX-512 extensions. In this thesis, we study the fundamental issues involved in using cross-ISA Dynamic Binary Translation (DBT) to convert non-vectorized loops into vector/SIMD form and thereby achieve the greater computational throughput available on newer processor architectures. The key idea is to recover critical loop information from the application binaries so that vectorization can be carried out at runtime. Experimental results show that our approach achieves an average speedup of 1.42x over native ARMv7 execution across various benchmarks in an ARMv7-to-ARMv8 dynamic binary translation system. 徐慰中 2019 degree thesis ; thesis 36 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Networking and Multimedia === 107 === For decades, compilers have used auto-vectorization techniques to exploit data-level parallelism. However, because processor architectures keep adding features that improve vector/SIMD performance, legacy application binaries cannot fully exploit the vector/SIMD capabilities of modern architectures. For example, legacy ARMv7 binaries cannot benefit from the double-precision SIMD capability of ARMv8, and legacy x86 binaries cannot take advantage of the AVX-512 extensions.
In this thesis, we study the fundamental issues involved in using cross-ISA Dynamic Binary Translation (DBT) to convert non-vectorized loops into vector/SIMD form and thereby achieve the greater computational throughput available on newer processor architectures. The key idea is to recover critical loop information from the application binaries so that vectorization can be carried out at runtime. Experimental results show that our approach achieves an average speedup of 1.42x over native ARMv7 execution across various benchmarks in an ARMv7-to-ARMv8 dynamic binary translation system.
|
author2 |
徐慰中 |
author_facet |
徐慰中 Chih-Min Lin 林致民 |
author |
Chih-Min Lin 林致民 |
spellingShingle |
Chih-Min Lin 林致民 Dynamic Binary Vectorization in Enhanced HQEMU |
author_sort |
Chih-Min Lin |
title |
Dynamic Binary Vectorization in Enhanced HQEMU |
title_short |
Dynamic Binary Vectorization in Enhanced HQEMU |
title_full |
Dynamic Binary Vectorization in Enhanced HQEMU |
title_fullStr |
Dynamic Binary Vectorization in Enhanced HQEMU |
title_full_unstemmed |
Dynamic Binary Vectorization in Enhanced HQEMU |
title_sort |
dynamic binary vectorization in enhanced hqemu |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/9mwuvj |
work_keys_str_mv |
AT chihminlin dynamicbinaryvectorizationinenhancedhqemu AT línzhìmín dynamicbinaryvectorizationinenhancedhqemu AT chihminlin jīyújìnjiēhqemuzhīdòngtàièrjìnzhìmǎxiàngliànghuà AT línzhìmín jīyújìnjiēhqemuzhīdòngtàièrjìnzhìmǎxiàngliànghuà |
_version_ |
1719292828698279936 |