Exploiting X86 Front-end Parallelism with Program Trace Support

Ph.D. === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === The front-end units, the instruction stream buffer and the fetcher, are the key elements for achieving high instruction bandwidth. However, in x86 superscalar processors, the variable-length instructions and the complex addressing system make fetching mu...

Bibliographic Details
Main Authors:	Jih-ching Chiu, 邱日清
Other Authors:	Chung-Ping Chung
Format:	Others
Language:	zh-TW
Published:	2002
Online Access:	http://ndltd.ncl.edu.tw/handle/29519259549200384120

id	ndltd-TW-090NCTU0392007
record_format	oai_dc
spelling	ndltd-TW-090NCTU0392007 2016-06-27T16:08:59Z http://ndltd.ncl.edu.tw/handle/29519259549200384120 Exploiting X86 Front-end Parallelism with Program Trace Support 以程式軌跡支援開發X86指令集處理器前端並行方式 Jih-ching Chiu 邱日清 Chung-Ping Chung 鍾崇斌 2002 學位論文 ; thesis 140 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Ph.D. === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === The front-end units, namely the instruction stream buffer and the fetcher, are the key elements for achieving high instruction bandwidth. However, in x86 superscalar processors, variable-length instructions and the complex addressing system make it difficult to fetch multiple instructions in a cycle. To approach high instruction fetch bandwidth, this dissertation examines how to keep the instruction stream smooth and how to increase the x86 instruction fetch degree, both in relation to the program-execution trace. To build a front-end with a high superscalar degree, four topics are studied: 1. Increasing fetch bandwidth at the front-end entry point; 2. Identifying multiple instructions in one clock cycle; 3. Fetching super basic block instructions; 4. Storing each instruction address to maintain processor state in x86 processors with a high instruction fetch degree. For the first topic, we develop a new instruction prefetching method in which the prefetch is directed by branch prediction, called branch instruction based (BIB) prefetching. Simulation results show that this design outperforms traditional sequential prefetching by 7% and other prediction-table-based prefetching methods by 17% on average with the same BTB size. For the second topic, we propose an Instruction Identifier that predicts instruction lengths and stores the instruction pointers as superscalar instruction group indicators. Simulation results suggest that an Instruction Identifier with a 64-entry table is a good performance/cost choice.
For the third topic, we propose a design that improves instruction stream buffer performance by coupling the buffer with the Branch Target Buffer (BTB) to support trace prediction. Compared with other existing designs, this instruction stream buffer can improve performance by 90% on average over the instruction fetch rate of current x86 processors. For the fourth topic, we propose an instruction PC Offset Queue. The design accounts for two CISC hazards in the x86 architecture and reduces storage space by one third for a degree-5 superscalar x86 processor, with even smaller access latency. With these critical topics addressed, an efficient front-end for an x86 micro-architecture with a high superscalar degree becomes practical.
author2	Chung-Ping Chung
author	Jih-ching Chiu 邱日清
title	Exploiting X86 Front-end Parallelism with Program Trace Support
publishDate	2002
url	http://ndltd.ncl.edu.tw/handle/29519259549200384120

Similar Items