Cluster assignment and instruction scheduling for partitioned register-set machines

Bibliographic Details
Main Author: He, Jingsong
Other Authors: Cooper, Keith D.
Format: Others
Language: English
Published: 2009
Subjects:
Online Access: http://hdl.handle.net/1911/17340
Description
Summary: For half a century, computer architects have been striving to improve uniprocessor performance. Many of their successful designs, such as VLIW and superscalar machines, use multiple functional units to exploit instruction-level parallelism in programs. As the number of functional units rises, another hardware constraint enters the picture: the number of register-file ports needed grows directly with the number of functional units. At some point, the multiplexing logic on the register ports can come to dominate the processor's cycle time. A reasonable solution is to partition the register file into independent sets and associate each functional unit with a specific register set. Such partitioned register sets have appeared in a number of commercial machines, such as the Texas Instruments TMS320C6xxx DSP chips. Partitioned register-set architectures present a new set of challenges to compiler designers: the compiler must assign each operation to a specific cluster and coordinate data movement between clusters. In this thesis, we investigate five instruction scheduling methods with different scopes to find a suitable one for partitioned register-set architectures. Next, we examine previous algorithms for the combined cluster assignment and scheduling problem and propose two new algorithms that improve upon the prior art. Then we study the difficulties introduced by a limited number of registers and provide an approach to handle them. Finally, we take several other measurements of partitioned register-set architectures that may shed light on some of the architectural decisions.