A Parallel Performance Analysis Framework for Cache Coherence Protocols
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 100 === Multi-core platforms offer large performance potential for parallel software, but developing such software is very challenging. The performance of the cache coherence protocol, which is driven by data sharing in multi-threaded applications, plays an important role that imp...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | en_US |
Published: |
2012
|
Online Access: | http://ndltd.ncl.edu.tw/handle/25583677656905128183 |
id |
ndltd-TW-100NTU05392096 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-100NTU05392096 2015-10-13T21:50:18Z http://ndltd.ncl.edu.tw/handle/25583677656905128183 A Parallel Performance Analysis Framework for Cache Coherence Protocols 快取記憶體一致性協定之平行化效能分析工具 Hui-Hsin Hsu 許匯鑫 Master's National Taiwan University Graduate Institute of Computer Science and Information Engineering 100 Multi-core platforms offer large performance potential for parallel software, but developing such software is very challenging. The performance of the cache coherence protocol, which is driven by data sharing in multi-threaded applications, plays an important role in scalability. To analyze cache performance in a multi-core system, detailed simulation can give accurate results, but it is too slow for complex systems because it serializes the simulation of many cores, so its performance is bounded by the computation power of a single core. In this thesis, we propose a novel multi-core cache performance analysis approach that combines simulation and analytic methods for fast performance estimation in parallel. The experimental results show that our approach performs about 13 times faster than the memory-access-based approach. We further integrate this parallel scheme into a parallel full-system emulator for system-wide performance analysis, covering not only user-space applications. To demonstrate the performance analysis framework, we show a case study that optimizes an OpenMP program; the maximum performance improvement of the application is about 100% when using 16 OpenMP threads on our 48-core host machine. Shih-Hao Hung 洪士灝 2012 學位論文 ; thesis 37 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 100 === Multi-core platforms offer large performance potential for parallel software, but developing such software is very challenging. The performance of the cache coherence protocol, which is driven by data sharing in multi-threaded applications, plays an important role in scalability. To analyze cache performance in a multi-core system, detailed simulation can give accurate results, but it is too slow for complex systems because it serializes the simulation of many cores, so its performance is bounded by the computation power of a single core.
In this thesis, we propose a novel multi-core cache performance analysis approach that combines simulation and analytic methods for fast performance estimation in parallel. The experimental results show that our approach performs about 13 times faster than the memory-access-based approach. We further integrate this parallel scheme into a parallel full-system emulator for system-wide performance analysis, covering not only user-space applications.
To demonstrate the performance analysis framework, we show a case study that optimizes an OpenMP program; the maximum performance improvement of the application is about 100% when using 16 OpenMP threads on our 48-core host machine.
|
author2 |
Shih-Hao Hung |
author_facet |
Shih-Hao Hung Hui-Hsin Hsu 許匯鑫 |
author |
Hui-Hsin Hsu 許匯鑫 |
spellingShingle |
Hui-Hsin Hsu 許匯鑫 A Parallel Performance Analysis Framework for Cache Coherence Protocols |
author_sort |
Hui-Hsin Hsu |
title |
A Parallel Performance Analysis Framework for Cache Coherence Protocols |
title_sort |
parallel performance analysis framework for cache coherence protocols |
publishDate |
2012 |
url |
http://ndltd.ncl.edu.tw/handle/25583677656905128183 |