Shared instruction-set extensions for soft multiprocessor systems implemented on field-programmable gate arrays

Soft-core embedded systems implemented on FPGAs offer a high level of flexibility. Application-specific customizations can be added in the form of extensions to the processor’s regular instruction set. These custom instructions benefit run-time performance, but come at the cost of increased resource...

Bibliographic Details
Main Author:	Johnston, Erin
Language:	English
Published:	University of British Columbia 2012
Online Access:	http://hdl.handle.net/2429/43695

id	ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-43695
record_format	oai_dc
spelling	ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-43695 2014-03-26T03:39:11Z Shared instruction-set extensions for soft multiprocessor systems implemented on field-programmable gate arrays Johnston, Erin 2012-12-13T21:18:58Z 2012-12-13T21:18:58Z 2012 2012-12-13 2013-05 Electronic Thesis or Dissertation http://hdl.handle.net/2429/43695 eng University of British Columbia
collection	NDLTD
language	English
sources	NDLTD
description	Soft-core embedded systems implemented on FPGAs offer a high level of flexibility. Application-specific customizations can be added in the form of extensions to the processor’s regular instruction set. These custom instructions benefit run-time performance, but come at the cost of increased resource usage. Reducing the overall FPGA area required to implement a system will decrease static power consumption and allow a smaller, cheaper device to be used. There is a constant effort to reduce area and power consumption while maintaining the performance benefits attained through customizations. This thesis presents a new architecture to share custom instruction units among multiple processors in a system. This implementation allows run-time performance benefits to be maintained while decreasing the overall resource usage. The shared architecture is implemented using an arbitrator to determine processor access to each custom instruction in a set. Custom instruction inputs and outputs are controlled using additional multiplexors and selection hardware. Results for a sample system using fine-grained custom instructions show that sharing can reduce the implementation area by up to 24% with minimal impact on the critical path delay. This reduction remains high at 19% for a coarse-grained case study of an encryption algorithm called SHA. The custom instruction configuration depends on the application being performed. A benchmark generator and simulator are also developed to evaluate candidates for custom instruction implementation and efficiently explore the design space. The overall run-time performance of the candidate systems can also be evaluated using these tools. The simulator can also be used with an input trace to determine cycle-accurate run-time performance for a real application, without requiring the entire system to be designed and implemented in hardware. The simulator shows up to 53% run-time improvement for a shared fine-grained system over a system with no custom instructions. Hardware results for the coarse-grained case study show a run-time improvement of up to 13.5% over a system with no custom instructions.
author	Johnston, Erin
title	Shared instruction-set extensions for soft multiprocessor systems implemented on field-programmable gate arrays
publisher	University of British Columbia
publishDate	2012
url	http://hdl.handle.net/2429/43695

Similar Items