Exploiting X86 Front-end Parallelism with Program Trace Support

Doctoral dissertation === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === The front-end units, the instruction stream buffer and the fetcher, are the key elements for achieving high instruction bandwidth. However, in x86 superscalar processors, the variable-length instructions and the complex addressing system make fetching mu...

Bibliographic Details
Main Authors: Jih-ching Chiu, 邱日清
Other Authors: Chung-Ping Chung
Format: Others
Language: zh-TW
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/29519259549200384120
id ndltd-TW-090NCTU0392007
record_format oai_dc
spelling ndltd-TW-090NCTU0392007 2016-06-27T16:08:59Z http://ndltd.ncl.edu.tw/handle/29519259549200384120 Exploiting X86 Front-end Parallelism with Program Trace Support 以程式軌跡支援開發X86指令集處理器前端並行方式 Jih-ching Chiu 邱日清 Doctoral dissertation, National Chiao Tung University, Department of Computer Science and Information Engineering, 90 Chung-Ping Chung 鍾崇斌 2002 thesis 140 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Doctoral dissertation === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === The front-end units, namely the instruction stream buffer and the fetcher, are the key elements for achieving high instruction bandwidth. However, in x86 superscalar processors, variable-length instructions and the complex addressing system make it difficult to fetch multiple instructions in one cycle. To approach high instruction fetch bandwidth, this dissertation studies how to keep the instruction stream smooth and how to raise the x86 instruction fetch degree, both in relation to the program execution trace. Building a front-end with a high superscalar degree involves four topics: 1. increasing fetch bandwidth at the front-end entry; 2. identifying multiple instructions in one clock cycle; 3. fetching super basic block instructions; 4. storing each instruction address to keep processor state in x86 processors with a high instruction fetch degree. For the first topic, we develop a new instruction prefetching method, called branch instruction based (BIB) prefetching, in which prefetching is directed by branch prediction. Simulation results show that this design outperforms traditional sequential prefetching by 7% and other prediction-table-based prefetching methods by 17% on average with the same Branch Target Buffer (BTB) size. For the second topic, we propose an Instruction Identifier that predicts instruction lengths and stores the instruction pointers as superscalar instruction group indicators. Simulation results suggest that an Instruction Identifier with a 64-entry table is a good performance/cost choice. For the third topic, we propose a design that improves instruction stream buffer performance by coupling the buffer with the BTB to support trace prediction.
Compared with other existing designs, this instruction stream buffer improves on the instruction fetch rate of current x86 processors by 90% on average. For the fourth topic, we propose an instruction PC Offset Queue. The design takes two CISC hazards of the x86 architecture into account and reduces storage space by one third for a degree-5 superscalar x86 processor, with even lower access latency. With these critical topics addressed, an efficient front-end for an x86 microarchitecture with a high superscalar degree becomes practical.
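The abstract does not describe how the PC Offset Queue works internally. As a rough illustration of the general idea of recording per-instruction offsets in place of full instruction addresses, the following Python sketch keeps one base address plus the byte length of each in-flight instruction. The class name `PCOffsetQueue`, its methods, and the restriction to straight-line code (no taken branches) are assumptions made for this example and are not taken from the dissertation.

```python
from collections import deque
from itertools import islice

# Architectural upper bound on the length of one x86 instruction, in bytes.
MAX_X86_INSN_LEN = 15


class PCOffsetQueue:
    """Hypothetical model: one base PC plus a queue of instruction lengths.

    Assumes sequential (straight-line) code; a taken branch would need a
    new base address, which this sketch does not model.
    """

    def __init__(self, base_pc):
        self.base_pc = base_pc      # address of the oldest in-flight instruction
        self.lengths = deque()      # byte length of each in-flight instruction

    def push(self, length):
        """Record a newly fetched instruction by its length only."""
        if not 1 <= length <= MAX_X86_INSN_LEN:
            raise ValueError("x86 instruction length must be 1..15 bytes")
        self.lengths.append(length)

    def address_of(self, index):
        """Rebuild the full address of the index-th in-flight instruction."""
        if not 0 <= index < len(self.lengths):
            raise IndexError("no such in-flight instruction")
        # Base address plus the lengths of all older instructions.
        return self.base_pc + sum(islice(self.lengths, index))

    def retire(self):
        """Retire the oldest instruction and advance the base address."""
        self.base_pc += self.lengths.popleft()
        return self.base_pc


if __name__ == "__main__":
    q = PCOffsetQueue(0x1000)
    for n in (1, 5, 3):             # three instructions of 1, 5 and 3 bytes
        q.push(n)
    print([hex(q.address_of(i)) for i in range(3)])
```

In this model each queue entry needs only enough bits to hold a length of up to 15 bytes, whereas a queue of full addresses needs a full address per entry. The sketch is not meant to reproduce the storage savings reported in the abstract.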
author2 Chung-Ping Chung
author_facet Chung-Ping Chung
Jih-ching Chiu
邱日清
author Jih-ching Chiu
邱日清
title Exploiting X86 Front-end Parallelism with Program Trace Support
publishDate 2002
url http://ndltd.ncl.edu.tw/handle/29519259549200384120