Exploiting X86 Front-end Parallelism with Program Trace Support
Doctoral === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === The front-end units, the instruction stream buffer and the fetcher, are the key elements for achieving high instruction bandwidth. However, in x86 superscalar processors, the variable-length instructions and the complex addressing system make fetching mu...
Main Authors: | Jih-ching Chiu 邱日清 |
---|---|
Other Authors: | Chung-Ping Chung 鍾崇斌 |
Format: | Others |
Language: | zh-TW |
Published: | 2002 |
Online Access: | http://ndltd.ncl.edu.tw/handle/29519259549200384120 |
id |
ndltd-TW-090NCTU0392007 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-090NCTU03920072016-06-27T16:08:59Z http://ndltd.ncl.edu.tw/handle/29519259549200384120 Exploiting X86 Front-end Parallelism with Program Trace Support 以程式軌跡支援開發X86指令集處理器前端並行方式 Jih-ching Chiu 邱日清 Doctoral National Chiao Tung University Department of Computer Science and Information Engineering 90 Chung-Ping Chung 鍾崇斌 2002 學位論文 ; thesis 140 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Doctoral === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === The front-end units, namely the instruction stream buffer and the fetcher, are the key elements for achieving high instruction bandwidth. In x86 superscalar processors, however, variable-length instructions and a complex addressing system make it difficult to fetch multiple instructions in one cycle. To approach high instruction fetch bandwidth, this dissertation uses program-execution trace information to keep the instruction stream smooth and to expand the x86 instruction fetch degree. To build a front-end with a high superscalar degree, it studies four topics:
1. Increasing fetch bandwidth at the front-end entry;
2. Identifying multiple instructions in one clock cycle;
3. Fetching super basic block instructions;
4. Storing each instruction address to keep processor state in x86 processors with a high instruction fetch degree.
For the first topic, we develop a new instruction prefetching method, branch instruction based (BIB) prefetching, in which prefetching is directed by branch prediction. Simulation results show that, with the same Branch Target Buffer (BTB) size, this design outperforms traditional sequential prefetching by 7% and other prediction-table-based prefetching methods by 17% on average.
For the second topic, we propose an Instruction Identifier that predicts instruction lengths and stores the instruction pointers as superscalar instruction group indicators. Simulation results suggest that an Instruction Identifier with a 64-entry table is a good performance/cost trade-off.
For the third topic, we propose a design that improves instruction stream buffer performance by coupling the buffer with the BTB to support trace prediction. Compared with other existing designs, this instruction stream buffer can improve performance by 90% on average over the instruction fetch rate of current x86 processors.
For the fourth topic, we propose an instruction PC Offset Queue. The design takes two CISC hazards of the x86 architecture into account and reduces storage space by one third for a degree-5 superscalar x86 processor, with even lower access latency.
With these critical topics addressed, an efficient front-end for an x86 microarchitecture with a high superscalar degree becomes practical.
|
author2 |
Chung-Ping Chung |
author_facet |
Chung-Ping Chung Jih-ching Chiu 邱日清 |
author |
Jih-ching Chiu 邱日清 |
spellingShingle |
Jih-ching Chiu 邱日清 Exploiting X86 Front-end Parallelism with Program Trace Support |
author_sort |
Jih-ching Chiu |
title |
Exploiting X86 Front-end Parallelism with Program Trace Support |
title_short |
Exploiting X86 Front-end Parallelism with Program Trace Support |
title_full |
Exploiting X86 Front-end Parallelism with Program Trace Support |
title_fullStr |
Exploiting X86 Front-end Parallelism with Program Trace Support |
title_full_unstemmed |
Exploiting X86 Front-end Parallelism with Program Trace Support |
title_sort |
exploiting x86 front-end parallelism with program trace support |
publishDate |
2002 |
url |
http://ndltd.ncl.edu.tw/handle/29519259549200384120 |
work_keys_str_mv |
AT jihchingchiu exploitingx86frontendparallelismwithprogramtracesupport AT qiūrìqīng exploitingx86frontendparallelismwithprogramtracesupport AT jihchingchiu yǐchéngshìguǐjīzhīyuánkāifāx86zhǐlìngjíchùlǐqìqiánduānbìngxíngfāngshì AT qiūrìqīng yǐchéngshìguǐjīzhīyuánkāifāx86zhǐlìngjíchùlǐqìqiánduānbìngxíngfāngshì |
_version_ |
1718324428425658368 |