A Study of the Limits of Parallelism Available in SIMD Processors Through Register Packing

Master's thesis === National Sun Yat-sen University === Department of Computer Science and Engineering === 102 === This thesis designs an instruction-level-parallelism processor for embedded systems that run general-purpose computations. The hardware of such an embedded system is smaller in scale than that of currently popular CPUs or GPUs. We apply several techniques to improve the instruct...

Bibliographic Details
Main Authors:	Rou-Jia Chen, 陳柔佳
Other Authors:	Steve W. Haga
Format:	Others
Language:	en_US
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/61446172694694401702

id	ndltd-TW-102NSYS5392024
record_format	oai_dc
spelling	ndltd-TW-102NSYS5392024 2017-04-23T04:27:01Z http://ndltd.ncl.edu.tw/handle/61446172694694401702 A Study of the Limits of Parallelism Available in SIMD Processors Through Register Packing 在SIMD處理器上透過包裝暫存器的方法研究可平行化特性上的限制 Rou-Jia Chen 陳柔佳 Master's thesis, National Sun Yat-sen University, Department of Computer Science and Engineering, 102 Steve W. Haga 希家史提夫 2014 學位論文 ; thesis 92 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's thesis === National Sun Yat-sen University === Department of Computer Science and Engineering === 102 === This thesis designs an instruction-level-parallelism processor for embedded systems that run general-purpose computations. The hardware of such an embedded system is smaller in scale than that of currently popular CPUs or GPUs. We apply several techniques to improve the instruction scheduling time of our SIMD processor. The branch-and-bound techniques, which modify the algorithm while maintaining optimality, include PRSR (pseudo random shift register), memoization, and register grouping. We also support heuristics, shortcuts that let the exhaustive search be solved quickly and efficiently, such as unrolling optimization, instruction distribution, and sign constraint. Using register packing and loop unrolling, we evaluated our SIMD processor on MiBench and obtained performance comparable to that of a VLIW processor; moreover, our register packing allows for a vector-wide load from the SRAM. Such a load is a natural fit for a SIMD processor and achieves significant speedups when our allocator is used.
author2	Steve W. Haga
author_facet	Steve W. Haga Rou-Jia Chen 陳柔佳
author	Rou-Jia Chen 陳柔佳
author_sort	Rou-Jia Chen
title	A Study of the Limits of Parallelism Available in SIMD Processors Through Register Packing
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/61446172694694401702
