A Study of the Limits of Parallelism Available in SIMD Processors Through Register Packing

Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 102 === This thesis designs an instruction-level-parallel processor for embedded systems that perform general-purpose computation. The hardware of such an embedded system is smaller in scale than that of currently popular CPUs or GPUs. We apply several techniques to enhance the instruct...


Bibliographic Details
Main Authors: Rou-Jia Chen, 陳柔佳
Other Authors: Steve W. Haga
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/61446172694694401702
id ndltd-TW-102NSYS5392024
record_format oai_dc
spelling ndltd-TW-102NSYS5392024 2017-04-23T04:27:01Z http://ndltd.ncl.edu.tw/handle/61446172694694401702 A Study of the Limits of Parallelism Available in SIMD Processors Through Register Packing 在SIMD處理器上透過包裝暫存器的方法研究可平行化特性上的限制 Rou-Jia Chen 陳柔佳 Master's National Sun Yat-sen University Department of Computer Science and Engineering 102 This thesis designs an instruction-level-parallel processor for embedded systems that perform general-purpose computation. The hardware of such an embedded system is smaller in scale than that of currently popular CPUs or GPUs. We apply several techniques to improve the instruction scheduling time of our SIMD processor. The branch-and-bound techniques, which modify the algorithm while maintaining optimality, include PRSR (pseudo random shift register), memoization, and register grouping. We also support heuristics, shortcuts that let the exhaustive search finish quickly and efficiently, such as unrolling optimization, instruction distribution, and sign constraint. Using register packing and loop unrolling, we evaluated our SIMD processor on MiBench and obtained performance comparable to that of a VLIW processor; moreover, our register packing allows a vector-wide load from the SRAM. Such a load is a natural fit for a SIMD processor and achieves significant speedups when our allocator is used. Steve W. Haga 希家史提夫 2014 thesis 92 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 102 === This thesis designs an instruction-level-parallel processor for embedded systems that perform general-purpose computation. The hardware of such an embedded system is smaller in scale than that of currently popular CPUs or GPUs. We apply several techniques to improve the instruction scheduling time of our SIMD processor. The branch-and-bound techniques, which modify the algorithm while maintaining optimality, include PRSR (pseudo random shift register), memoization, and register grouping. We also support heuristics, shortcuts that let the exhaustive search finish quickly and efficiently, such as unrolling optimization, instruction distribution, and sign constraint. Using register packing and loop unrolling, we evaluated our SIMD processor on MiBench and obtained performance comparable to that of a VLIW processor; moreover, our register packing allows a vector-wide load from the SRAM. Such a load is a natural fit for a SIMD processor and achieves significant speedups when our allocator is used.