TIME PREDICTABILITY OF GPU KERNEL ON AN HSA COMPLIANT PLATFORM


Bibliographic Details
Main Authors: Tsog, Nandinbaatar; Larsson, Marcus
Format: Others
Language: English
Published: Mälardalens högskola, Inbyggda system 2016
Subjects: HSA, GPU
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-31941
Description
Summary: In recent years, the importance of utilizing more computational power in smaller computer systems has increased. To deliver more computational power in smaller packages, combining more than one type of processor unit has become more popular in the industry. Combining processor types achieves better power efficiency as well as more computational power in a smaller area. However, heterogeneous programming has proved to be difficult, which discourages software developers from learning heterogeneous programming languages. This has motivated the HSA Foundation to develop a new hardware architecture, called Heterogeneous System Architecture (HSA). This architecture brings features that make heterogeneous program development more accessible, more efficient, and easier for software developers. The purpose of this thesis is to investigate this new architecture and to learn and observe the timing characteristics of a task running a parallel region (a kernel) on a GPU in an HSA-compliant system. To gain more knowledge, four test cases have been developed to collect timing data and to analyze the execution time of the code on the GPU. These are: comparison between CPU and GPU, timing predictability of parallel periodic tasks, schedulability in HSA, and memory copy. Based on the results of the analysis, it has been concluded that HSA has the potential to be very attractive for developing heterogeneous programs due to its more streamlined infrastructure. It is easier to adapt to, requires less knowledge of the underlying hardware, and lets software developers use their preferred programming languages instead of learning a new programming framework such as OpenCL. However, since the architecture is new, there are bugs, and some HSA features are yet to be incorporated into the drivers. Performance-wise, HSA is faster than legacy methods, but it does not provide consistent time predictability, which is important for real-time systems.