An LLVM-based Binary Translator For A Heterogeneous System Architecture Simulator
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | en_US |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/11804950554889576672 |
Summary: | Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 101 === General-purpose graphics processing unit (GPGPU) computation can speed up highly parallel programs in a more power-efficient way. However, the programming model is not programmer-friendly. The memory model is heterogeneous, so programmers must explicitly control data transfers between system main memory and GPU device memory. Supporting infrastructure, such as debugging and code distribution, is also lacking. The Heterogeneous System Architecture (HSA) from AMD was introduced to address these issues and ease GPGPU software development. Its features include a shared memory model and a retargetable intermediate representation (IR) with more specific operation controls, such as cross-work-group control. In this paper, we present the HSA Translator, which enables fast simulation of HSAIL in the HSA Simulator, a functional-level, system-mode simulator of the HSA environment. It consists of a simulator based on PQEMU that simulates the processing units in the GPGPU environment. The HSA Translator is implemented inside the simulator to perform native code translation. It leverages the LLVM infrastructure to translate kernel source code from the Heterogeneous System Architecture Intermediate Language (HSAIL) into native relocatable code. The native binary is linked by the HSA Link-Loader, a link-loader we implemented within the simulator. To speed up simulation, the kernel processing device is simulated with host threads. We evaluate the simulator with HSAIL benchmarks that we translated ourselves, based on the Rodinia benchmark and the AMD OpenCL samples. |
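The summary says the kernel processing device is simulated with host threads. As a rough illustration only, the sketch below shows one way that idea could look: each work-group of a kernel dispatch is run on a host thread, and all threads operate on the same buffers, loosely mirroring a shared memory model. It is not the thesis implementation. The names `vector_add_kernel`, `dispatch`, and `max_host_threads` are hypothetical, and a plain Python function stands in for the native code that the HSA Translator would produce from HSAIL.

```python
from concurrent.futures import ThreadPoolExecutor


def vector_add_kernel(global_id, a, b, out):
    """Hypothetical kernel body: one work-item handles one element."""
    out[global_id] = a[global_id] + b[global_id]


def dispatch(kernel, grid_size, group_size, *args, max_host_threads=4):
    """Run `kernel` for every work-item, one work-group per host-thread task.

    Assumption: work-items inside a group run sequentially on one host
    thread, and separate groups may run concurrently on different threads.
    """
    num_groups = (grid_size + group_size - 1) // group_size

    def run_group(group_id):
        start = group_id * group_size
        end = min(start + group_size, grid_size)
        for global_id in range(start, end):
            kernel(global_id, *args)

    with ThreadPoolExecutor(max_workers=max_host_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(run_group, range(num_groups)))


if __name__ == "__main__":
    a = list(range(10))
    b = [10 * x for x in range(10)]
    out = [0] * len(a)
    dispatch(vector_add_kernel, len(a), 4, a, b, out)
    print(out)
```

Because the input and output buffers are ordinary objects shared by all host threads, no explicit copy between "host" and "device" memory appears in the sketch. Each work-item writes a distinct element, so the example needs no synchronization.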