Forwarding Path Limitation and Instruction Allocation for In-Order Processor with ALU Cascading

Bibliographic Details
Main Authors:	Ryotaro Kobayashi, Anri Suzuki, Hajime Shimada
Format:	Article
Language:	English
Published:	MDPI AG 2017-12-01
Series:	Journal of Low Power Electronics and Applications
Subjects:	ALU cascading; in-order execution; area/energy efficiency
Online Access:	https://www.mdpi.com/2079-9268/7/4/32

Description
Summary:	Much research focuses on many-core processors, which possess a vast number of cores. Their area, energy consumption, and performance tend to be proportional to the number of cores, so in-order (IO) execution is preferable for area/energy efficiency. However, expanding two-way IO to three-way IO offers very little improvement, since data dependencies limit its effectiveness. In addition, if the core is changed from IO to out-of-order (OoO) execution to improve Instructions Per Cycle (IPC), area and energy consumption increase significantly. Combining IO execution with Arithmetic Logic Unit (ALU) cascading is an effective way to alleviate this problem. However, ALU cascading is implemented with complex bypass circuits because it requires a connection between all outputs and all inputs of all ALUs. The hardware complexity of the bypass circuits increases area, energy consumption, and delay. In this study, we propose a mechanism that limits the number of forwarding paths and allocates instructions to ALUs in accordance with the limited paths. This mechanism scales down the bypass circuits to reduce hardware complexity. Our evaluation results show that, compared with the conventional mechanism, the proposed mechanism reduces area by 38.7%, energy by 41.1%, and delay by 23.2% on average, with very little IPC loss.
ISSN:	2079-9268

Similar Items