Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs
Master's === National Chiao Tung University === Department of Electronics Engineering, Institute of Electronics === 102 === Thread-Level-Parallelism (TLP) and cache utilization are two significant performance factors of modern throughput processors. The conflicting correlation between the two factors makes the design a non-trivial task. Increasing TLP would aggravate cache co...
Main Authors: | Lu, Chin-Fu 呂勁甫 |
---|---|
Other Authors: | Jou, Jing-Yang; Lai, Bo-Cheng |
Format: | Others |
Language: | en_US |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/99321023691038445807 |
id |
ndltd-TW-102NCTU5428111 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102NCTU54281112015-10-14T00:18:21Z http://ndltd.ncl.edu.tw/handle/99321023691038445807 Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs 考慮執行緒平行度且快取記憶體資源並應用於通用圖形處理器之執行緒排程演算法 Lu, Chin-Fu 呂勁甫 Master's National Chiao Tung University Department of Electronics Engineering, Institute of Electronics 102 Thread-Level-Parallelism (TLP) and cache utilization are two significant performance factors of modern throughput processors. The conflicting correlation between the two factors makes the design a non-trivial task: increasing TLP would aggravate cache contention, while avoiding cache contention could limit the TLP. The trade-off becomes even more intricate and sensitive when dealing with applications with irregular data access patterns. Many existing thread scheduling algorithms address only one of these factors at a time. This thesis demonstrates that there is a significant performance gain when the two factors are considered together and properly traded off. To conduct a comprehensive analysis of the performance impact of the two factors, this thesis formulates two thread scheduling problems to characterize the design concerns. A series of solutions is integrated to resolve the scheduling on a set of applications with irregular memory accesses. The experimental results on NVIDIA’s Fermi architecture show the performance differences of the proposed thread scheduling under various combinations of constraints. Compared to a widely used thread scheduling scheme, the average improvement in execution time can reach up to 51%. Jou, Jing-Yang Lai, Bo-Cheng 周景揚 賴伯承 2014 thesis 36 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chiao Tung University === Department of Electronics Engineering, Institute of Electronics === 102 === Thread-Level-Parallelism (TLP) and cache utilization are two significant performance factors of modern throughput processors. The conflicting correlation between the two factors makes the design a non-trivial task: increasing TLP would aggravate cache contention, while avoiding cache contention could limit the TLP. The trade-off becomes even more intricate and sensitive when dealing with applications with irregular data access patterns. Many existing thread scheduling algorithms address only one of these factors at a time. This thesis demonstrates that there is a significant performance gain when the two factors are considered together and properly traded off. To conduct a comprehensive analysis of the performance impact of the two factors, this thesis formulates two thread scheduling problems to characterize the design concerns. A series of solutions is integrated to resolve the scheduling on a set of applications with irregular memory accesses. The experimental results on NVIDIA’s Fermi architecture show the performance differences of the proposed thread scheduling under various combinations of constraints. Compared to a widely used thread scheduling scheme, the average improvement in execution time can reach up to 51%.
|
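The trade-off the abstract describes — more concurrently active warps raise parallelism but shrink each warp's effective cache share — can be sketched with a toy model. This is an illustrative sketch only, not the thesis's scheduling algorithm; the cache capacity, per-warp working set, and hit/miss costs below are invented numbers chosen to make the tension visible.

```python
# Toy model of the TLP vs. cache-contention trade-off: throughput grows
# with active warps until their combined working set overflows the cache,
# after which rising miss cost erodes the gain from extra parallelism.
# All constants are hypothetical, for illustration only.

CACHE_LINES = 512        # assumed L1 capacity, in cache lines
LINES_PER_WARP = 96      # assumed per-warp working set, in cache lines

def hit_rate(active_warps: int) -> float:
    """Fraction of accesses that hit when `active_warps` share the cache."""
    demand = active_warps * LINES_PER_WARP
    return min(1.0, CACHE_LINES / demand)

def throughput(active_warps: int, hit_cost=1.0, miss_cost=20.0) -> float:
    """Relative throughput: parallelism divided by average access cost."""
    h = hit_rate(active_warps)
    avg_cost = h * hit_cost + (1.0 - h) * miss_cost
    return active_warps / avg_cost

# Sweep feasible TLP levels and pick the best under this model.
best = max(range(1, 49), key=throughput)
print(best, round(throughput(best), 3))
```

Under these numbers the model peaks at the largest warp count whose combined working set still fits in the cache; a scheduler that only maximized TLP would overshoot that point, which is the co-optimization the thesis argues for.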
author2 |
Jou, Jing-Yang |
author_facet |
Jou, Jing-Yang Lu, Chin-Fu 呂勁甫 |
author |
Lu, Chin-Fu 呂勁甫 |
spellingShingle |
Lu, Chin-Fu 呂勁甫 Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs |
author_sort |
Lu, Chin-Fu |
title |
Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs |
title_short |
Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs |
title_full |
Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs |
title_fullStr |
Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs |
title_full_unstemmed |
Scheduling Algorithms of Co-optimizing Thread-Level-Parallelism and Cache Utilization for GPGPUs |
title_sort |
scheduling algorithms of co-optimizing thread-level-parallelism and cache utilization for gpgpus |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/99321023691038445807 |
work_keys_str_mv |
AT luchinfu schedulingalgorithmsofcooptimizingthreadlevelparallelismandcacheutilizationforgpgpusyán AT lǚjìnfǔ schedulingalgorithmsofcooptimizingthreadlevelparallelismandcacheutilizationforgpgpusyán AT luchinfu kǎolǜzhíxíngxùpíngxíngdùqiěkuàiqǔjìyìtǐzīyuánbìngyīngyòngyútōngyòngtúxíngchùlǐqìzhīzhíxíngxùpáichéngyǎnsuànfǎ AT lǚjìnfǔ kǎolǜzhíxíngxùpíngxíngdùqiěkuàiqǔjìyìtǐzīyuánbìngyīngyòngyútōngyòngtúxíngchùlǐqìzhīzhíxíngxùpáichéngyǎnsuànfǎ |
_version_ |
1718088741397987328 |