Design of the Optimized Group Management Unit by Detecting Thread Parallelism on the Hyperscalar Architecture

Bibliographic Details
Main Authors: Yin-jou Huang, 黃尹柔
Other Authors: Jih-ching Chiu
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/64535275185786278655
Description
Summary: Master's === National Sun Yat-sen University === Department of Electrical Engineering (graduate program) === 101 === Current trends in processor design have migrated toward chip multiprocessors (CMPs). CMPs are designed to exploit both instruction-level parallelism (ILP) within processors and thread-level parallelism (TLP) within and across processors. However, conventional CMP designs are forced to choose between high single-thread performance and high peak throughput. This inability to adjust to varying levels of ILP and TLP makes the processor inefficient. This thesis therefore builds on the hyperscalar architecture, a chip multiprocessor design. The hyperscalar concept enables a multi-core architecture to dynamically group several scalar in-order cores into a superscalar processor to accelerate a sequential thread. This reconfigurability gives the hyperscalar architecture the flexibility to adapt to different types of applications, providing high single-thread performance when TLP is low and high throughput when TLP is high. To increase processor efficiency, the system dynamically detects the ILP of each thread and groups or releases cores according to the ILP it observes. On top of the hyperscalar architecture, this thesis adds a mechanism that detects the ILP of a thread, together with two new instructions, CRM (Core Register Move) and RelC (Release Core), that release cores from a group. To keep the data within the group correct after a core is released, the CRM instruction moves the information held by the core being released to another core in the same group; the RelC instruction marks the core for release. When RelC executes in the write-back (WB) stage, it sends a release signal to the Group Management Unit (GMU) to indicate that the data has been completely transferred and the core is empty. After the GMU dispatches these two instructions, the system releases or groups cores according to the ILP.
Simulation results show that the proposed architecture increases processor utilization and improves work efficiency.
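
The group-and-release flow described in the summary can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the thesis's design: the class and method names, the dictionary standing in for a register file, and the "one core per unit of detected ILP" sizing policy are all hypothetical. It shows only the ordering the summary describes: CRM drains the core being released into another core of the group, and RelC then tells the GMU that the core is empty.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    # Hypothetical stand-in for the live register state held by this core.
    registers: dict = field(default_factory=dict)


class GroupManagementUnit:
    """Toy GMU: tracks which cores are grouped for one thread and which are free."""

    def __init__(self, num_cores):
        self.cores = [Core(i) for i in range(num_cores)]
        self.free = set(range(num_cores))
        self.group = []  # ids of cores currently grouped for the thread

    def target_size(self, ilp):
        # Assumed policy (not from the thesis): one core per whole unit of
        # detected ILP, clamped to between 1 and the number of cores.
        return max(1, min(len(self.cores), int(ilp)))

    def crm(self, src_id, dst_id):
        # CRM (Core Register Move): move state off the core being released.
        src, dst = self.cores[src_id], self.cores[dst_id]
        dst.registers.update(src.registers)
        src.registers.clear()

    def relc(self, core_id):
        # RelC (Release Core): the release signal means the core is empty.
        if self.cores[core_id].registers:
            raise RuntimeError("CRM must drain the core before RelC")
        self.group.remove(core_id)
        self.free.add(core_id)

    def adjust(self, ilp):
        """Group or release cores so the group size matches the detected ILP."""
        want = self.target_size(ilp)
        while len(self.group) < want and self.free:
            core_id = min(self.free)
            self.free.remove(core_id)
            self.group.append(core_id)
        while len(self.group) > want:
            victim = self.group[-1]
            self.crm(victim, self.group[0])  # transfer data first
            self.relc(victim)                # then signal the release
        return list(self.group)


gmu = GroupManagementUnit(4)
grown = gmu.adjust(3.2)                 # high ILP: group three cores
gmu.cores[2].registers["r5"] = 7        # state living on a core to be released
shrunk = gmu.adjust(1.0)                # low ILP: release down to one core
```

In this toy run the value written to core 2 ends up on core 0 after the release, and cores 1 to 3 return to the free pool. A real design would also have to handle register naming conflicts and in-flight instructions, which the sketch ignores.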