Shared instruction-set extensions for soft multiprocessor systems implemented on field-programmable gate arrays

Soft-core embedded systems implemented on FPGAs offer a high level of flexibility. Application-specific customizations can be added in the form of extensions to the processor’s regular instruction set. These custom instructions benefit run-time performance, but come at the cost of increased resource...

Bibliographic Details
Main Author: Johnston, Erin
Language: English
Published: University of British Columbia 2012
Online Access: http://hdl.handle.net/2429/43695
id ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-43695
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-43695 2014-03-26T03:39:11Z 2012-12-13T21:18:58Z 2012-12-13T21:18:58Z 2012 2012-12-13 2013-05 Electronic Thesis or Dissertation http://hdl.handle.net/2429/43695 eng University of British Columbia
collection NDLTD
language English
sources NDLTD
description Soft-core embedded systems implemented on FPGAs offer a high level of flexibility. Application-specific customizations can be added in the form of extensions to the processor’s regular instruction set. These custom instructions benefit run-time performance, but come at the cost of increased resource usage. Reducing the overall FPGA area required to implement a system will decrease static power consumption and allow a smaller, cheaper device to be used. There is a constant effort to reduce area and power consumption while maintaining the performance benefits attained through customizations. This thesis presents a new architecture to share custom instruction units among multiple processors in a system. This implementation allows run-time performance benefits to be maintained while decreasing the overall resource usage. The shared architecture is implemented using an arbitrator to determine processor access to each custom instruction in a set. Custom instruction inputs and outputs are controlled using additional multiplexors and selection hardware. Results for a sample system using fine-grained custom instructions show that sharing can reduce the implementation area by up to 24% with minimal impact on the critical path delay. This reduction remains high at 19% for a coarse-grained case study of an encryption algorithm called SHA. The custom instruction configuration depends on the application being performed. A benchmark generator and simulator are also developed to evaluate candidates for custom instruction implementation and efficiently explore the design space. The overall run-time performance of the candidate systems can also be evaluated using these tools. The simulator can also be used with an input trace to determine cycle-accurate run-time performance for a real application, without requiring the entire system to be designed and implemented in hardware. The simulator shows up to 53% run-time improvement for a shared fine-grained system over a system with no custom instructions. Hardware results for the coarse-grained case study show a run-time improvement of up to 13.5% over a system with no custom instructions.
author Johnston, Erin
title Shared instruction-set extensions for soft multiprocessor systems implemented on field-programmable gate arrays
publisher University of British Columbia
publishDate 2012
url http://hdl.handle.net/2429/43695