Accelerator for Flexible QR Decomposition and Back Substitution

abstract: QR decomposition (QRD) of a matrix is one of the most common linear algebra operations used for the decomposition of a square/non-square matrix. It has a wide range of applications, especially in Multiple Input-Multiple Output (MIMO) communication systems. Unfortunately it has high computati...

Bibliographic Details
Other Authors:	Kanagala, Srimayee (Author)
Format:	Dissertation
Language:	English
Published:	2020
Subjects:	Computer engineering; Back Substitution; Flexible Architecture; Folded Architecture; FPGA; Hardware Accelerator; QRD
Online Access:	http://hdl.handle.net/2286/R.I.63084

id	ndltd-asu.edu-item-63084
record_format	oai_dc
spelling	ndltd-asu.edu-item-63084 2021-01-15T05:01:21Z Accelerator for Flexible QR Decomposition and Back Substitution abstract: QR decomposition (QRD) of a matrix is one of the most common linear algebra operations used for the decomposition of a square/non-square matrix. It has a wide range of applications, especially in Multiple Input-Multiple Output (MIMO) communication systems. Unfortunately it has high computation complexity: for a matrix of size n x n, QRD has O(n^3) complexity and back substitution, which is used to solve a system of linear equations, has O(n^2) complexity. Thus, as the matrix size increases, the hardware resource requirement for QRD and back substitution increases significantly. This thesis presents the design and implementation of a flexible QRD and back substitution accelerator using a folded architecture. It can support matrix sizes of 4x4, 8x8, 12x12, 16x16, and 20x20 with low hardware resource requirement. The proposed architecture is based on the systolic array implementation of the Givens algorithm for QRD. It is built with three different types of computation blocks which are connected in a 2-D array structure. These blocks are controlled by a scheduler which facilitates reusability of the blocks to perform computation for any input matrix size which is a multiple of 4. These blocks are designed using two basic programming elements which support both the forward and backward paths to compute matrix R in QRD and column-matrix X in back substitution computation. The proposed architecture has been mapped to Xilinx Zynq Ultrascale+ FPGA (Field Programmable Gate Array), ZCU102. All inputs are complex with precision of 40 bits (38 fractional bits and 1 signed bit). The architecture can be clocked at 50 MHz. The synthesis results of the folded architecture for different matrix sizes are presented. The results show that the folded architecture can support QRD and back substitution for inputs of large sizes which otherwise cannot fit on an FPGA when implemented using a flat architecture. The memory sizes required for different matrix sizes are also presented. Dissertation/Thesis Kanagala, Srimayee (Author) Chakrabarti, Chaitali (Advisor) Bliss, Daniel (Committee member) Cao, Yu (Kevin) (Committee member) Arizona State University (Publisher) Computer engineering Back Substitution Flexible Architecture Folded Architecture FPGA Hardware Accelerator QRD eng 68 pages Masters Thesis Electrical Engineering 2020 Masters Thesis http://hdl.handle.net/2286/R.I.63084 http://rightsstatements.org/vocab/InC/1.0/ 2020
collection	NDLTD
language	English
format	Dissertation
sources	NDLTD
topic	Computer engineering Back Substitution Flexible Architecture Folded Architecture FPGA Hardware Accelerator QRD
spellingShingle	Computer engineering Back Substitution Flexible Architecture Folded Architecture FPGA Hardware Accelerator QRD Accelerator for Flexible QR Decomposition and Back Substitution
description	abstract: QR decomposition (QRD) of a matrix is one of the most common linear algebra operations used for the decomposition of a square/non-square matrix. It has a wide range of applications, especially in Multiple Input-Multiple Output (MIMO) communication systems. Unfortunately it has high computation complexity: for a matrix of size n x n, QRD has O(n^3) complexity and back substitution, which is used to solve a system of linear equations, has O(n^2) complexity. Thus, as the matrix size increases, the hardware resource requirement for QRD and back substitution increases significantly. This thesis presents the design and implementation of a flexible QRD and back substitution accelerator using a folded architecture. It can support matrix sizes of 4x4, 8x8, 12x12, 16x16, and 20x20 with low hardware resource requirement. The proposed architecture is based on the systolic array implementation of the Givens algorithm for QRD. It is built with three different types of computation blocks which are connected in a 2-D array structure. These blocks are controlled by a scheduler which facilitates reusability of the blocks to perform computation for any input matrix size which is a multiple of 4. These blocks are designed using two basic programming elements which support both the forward and backward paths to compute matrix R in QRD and column-matrix X in back substitution computation. The proposed architecture has been mapped to Xilinx Zynq Ultrascale+ FPGA (Field Programmable Gate Array), ZCU102. All inputs are complex with precision of 40 bits (38 fractional bits and 1 signed bit). The architecture can be clocked at 50 MHz. The synthesis results of the folded architecture for different matrix sizes are presented. The results show that the folded architecture can support QRD and back substitution for inputs of large sizes which otherwise cannot fit on an FPGA when implemented using a flat architecture. The memory sizes required for different matrix sizes are also presented.
=== Dissertation/Thesis === Masters Thesis Electrical Engineering 2020
author2	Kanagala, Srimayee (Author)
author_facet	Kanagala, Srimayee (Author)
title	Accelerator for Flexible QR Decomposition and Back Substitution
title_short	Accelerator for Flexible QR Decomposition and Back Substitution
title_full	Accelerator for Flexible QR Decomposition and Back Substitution
title_fullStr	Accelerator for Flexible QR Decomposition and Back Substitution
title_full_unstemmed	Accelerator for Flexible QR Decomposition and Back Substitution
title_sort	accelerator for flexible qr decomposition and back substitution
publishDate	2020
url	http://hdl.handle.net/2286/R.I.63084
_version_	1719373039484796928
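
Illustrative note: the abstract describes Givens-based QRD followed by back substitution to solve a system of linear equations. The following is a minimal software sketch of that two-stage flow, not the thesis's hardware design. The function name `givens_qr_solve` is hypothetical, and the sketch assumes a square, nonsingular complex matrix and ordinary floating-point arithmetic rather than the 40-bit fixed-point format mentioned in the record.

```python
import math


def givens_qr_solve(A, b):
    """Solve A x = b for a square, nonsingular complex matrix A.

    Forward pass: Givens rotations reduce the augmented matrix [A | b]
    to upper-triangular form [R | Q^H b] (roughly O(n^3) work).
    Backward pass: back substitution recovers x (roughly O(n^2) work).
    """
    n = len(A)
    # Build the augmented matrix [A | b] with complex entries.
    M = [[complex(v) for v in row] + [complex(bi)] for row, bi in zip(A, b)]

    # Forward pass: zero every entry below the diagonal, column by column.
    for j in range(n):
        for i in range(j + 1, n):
            a, c = M[j][j], M[i][j]
            r = math.hypot(abs(a), abs(c))
            if r == 0.0:
                continue  # both entries already zero, nothing to rotate
            # Apply the unitary rotation (1/r) * [[conj(a), conj(c)], [-c, a]]
            # to rows j and i; it maps the pair (a, c) to (r, 0).
            for k in range(j, n + 1):
                top, bot = M[j][k], M[i][k]
                M[j][k] = (a.conjugate() * top + c.conjugate() * bot) / r
                M[i][k] = (-c * top + a * bot) / r
            M[i][j] = 0j  # clear rounding residue in the eliminated entry

    # Backward pass: solve R x = Q^H b from the last row upward.
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


if __name__ == "__main__":
    # Small real-valued example: 2x + y = 3, x + 3y = 5.
    print(givens_qr_solve([[2, 1], [1, 3]], [3, 5]))
```

For the small example in the `__main__` block, the returned values should be close to 0.8 and 1.4 (as complex numbers with near-zero imaginary parts). The two loop nests mirror the forward (QRD) and backward (back substitution) paths named in the abstract, but the scheduling, folding, and block reuse of the actual accelerator are not modeled here.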

Similar Items