Accelerator for Flexible QR Decomposition and Back Substitution
abstract: QR decomposition (QRD) of a matrix is one of the most common linear algebra operations used for the decomposition of a square/non-square matrix. It has a wide range of applications, especially in Multiple Input-Multiple Output (MIMO) communication systems. Unfortunately, it has high computati...
Other Authors: | |
---|---|
Format: | Dissertation |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | http://hdl.handle.net/2286/R.I.63084 |
id |
ndltd-asu.edu-item-63084 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-asu.edu-item-63084 2021-01-15T05:01:21Z Accelerator for Flexible QR Decomposition and Back Substitution abstract: QR decomposition (QRD) of a matrix is one of the most common linear algebra operations used for the decomposition of a square/non-square matrix. It has a wide range of applications, especially in Multiple Input-Multiple Output (MIMO) communication systems. Unfortunately, it has high computational complexity: for a matrix of size n×n, QRD has O(n^3) complexity, and back substitution, which is used to solve a system of linear equations, has O(n^2) complexity. Thus, as the matrix size increases, the hardware resource requirement for QRD and back substitution increases significantly. This thesis presents the design and implementation of a flexible QRD and back substitution accelerator using a folded architecture. It can support matrix sizes of 4×4, 8×8, 12×12, 16×16, and 20×20 with low hardware resource requirements. The proposed architecture is based on the systolic array implementation of the Givens algorithm for QRD. It is built with three different types of computation blocks, which are connected in a 2-D array structure. These blocks are controlled by a scheduler, which facilitates reuse of the blocks to perform computation for any input matrix size that is a multiple of 4. These blocks are designed using two basic programming elements, which support both the forward and backward paths to compute matrix R in QRD and column-matrix X in back substitution. The proposed architecture has been mapped to a Xilinx Zynq UltraScale+ FPGA (Field Programmable Gate Array), the ZCU102. All inputs are complex with a precision of 40 bits (38 fractional bits and 1 sign bit). The architecture can be clocked at 50 MHz. The synthesis results of the folded architecture for different matrix sizes are presented.
The results show that the folded architecture can support QRD and back substitution for inputs of large sizes which otherwise cannot fit on an FPGA when implemented using a flat architecture. The memory sizes required for different matrix sizes are also presented. Dissertation/Thesis Kanagala, Srimayee (Author) Chakrabarti, Chaitali (Advisor) Bliss, Daniel (Committee member) Cao, Yu (Kevin) (Committee member) Arizona State University (Publisher) Computer engineering Back Substitution Flexible Architecture Folded Architecture FPGA Hardware Accelerator QRD eng 68 pages Masters Thesis Electrical Engineering 2020 Masters Thesis http://hdl.handle.net/2286/R.I.63084 http://rightsstatements.org/vocab/InC/1.0/ 2020 |
collection |
NDLTD |
language |
English |
format |
Dissertation |
sources |
NDLTD |
topic |
Computer engineering Back Substitution Flexible Architecture Folded Architecture FPGA Hardware Accelerator QRD |
spellingShingle |
Computer engineering Back Substitution Flexible Architecture Folded Architecture FPGA Hardware Accelerator QRD Accelerator for Flexible QR Decomposition and Back Substitution |
description |
abstract: QR decomposition (QRD) of a matrix is one of the most common linear algebra operations used for the decomposition of a square/non-square matrix. It has a wide range
of applications, especially in Multiple Input-Multiple Output (MIMO) communication
systems. Unfortunately, it has high computational complexity: for a matrix of size n×n,
QRD has O(n^3) complexity, and back substitution, which is used to solve a system
of linear equations, has O(n^2) complexity. Thus, as the matrix size increases, the
hardware resource requirement for QRD and back substitution increases significantly.
This thesis presents the design and implementation of a flexible QRD and back
substitution accelerator using a folded architecture. It can support matrix sizes of
4×4, 8×8, 12×12, 16×16, and 20×20 with low hardware resource requirements.
The proposed architecture is based on the systolic array implementation of the
Givens algorithm for QRD. It is built with three different types of computation blocks,
which are connected in a 2-D array structure. These blocks are controlled by a
scheduler, which facilitates reuse of the blocks to perform computation for any
input matrix size that is a multiple of 4. These blocks are designed using two
basic programming elements, which support both the forward and backward paths to
compute matrix R in QRD and column-matrix X in back substitution.
The proposed architecture has been mapped to a Xilinx Zynq UltraScale+ FPGA
(Field Programmable Gate Array), the ZCU102. All inputs are complex with a precision
of 40 bits (38 fractional bits and 1 sign bit). The architecture can be clocked at
50 MHz. The synthesis results of the folded architecture for different matrix sizes
are presented. The results show that the folded architecture can support QRD and
back substitution for inputs of large sizes which otherwise cannot fit on an FPGA
when implemented using a flat architecture. The memory sizes required for different
matrix sizes are also presented. === Dissertation/Thesis === Masters Thesis Electrical Engineering 2020 |
author2 |
Kanagala, Srimayee (Author) |
author_facet |
Kanagala, Srimayee (Author) |
title |
Accelerator for Flexible QR Decomposition and Back Substitution |
title_short |
Accelerator for Flexible QR Decomposition and Back Substitution |
title_full |
Accelerator for Flexible QR Decomposition and Back Substitution |
title_fullStr |
Accelerator for Flexible QR Decomposition and Back Substitution |
title_full_unstemmed |
Accelerator for Flexible QR Decomposition and Back Substitution |
title_sort |
accelerator for flexible qr decomposition and back substitution |
publishDate |
2020 |
url |
http://hdl.handle.net/2286/R.I.63084 |
_version_ |
1719373039484796928 |
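
The abstract describes computing R by Givens rotations and then obtaining the column vector X by back substitution. As a rough software illustration of those two steps only, here is a minimal sketch in plain Python. It is not the thesis's hardware design: it uses floating-point rather than 40-bit fixed-point arithmetic, it has no folding or scheduler, and the function name `givens_qr_solve` and its argument layout are assumptions made for this example.

```python
import math


def givens_qr_solve(a, b):
    """Solve a @ x = b for a square complex matrix `a` (list of rows).

    Hypothetical helper for illustration: rotations are applied to the
    augmented matrix [A | b], which leaves [R | Q^H b]; x then follows
    from back substitution on the upper-triangular R.
    """
    n = len(a)
    # Working copy of the augmented matrix [A | b] in complex arithmetic.
    m = [[complex(v) for v in row] + [complex(b[i])] for i, row in enumerate(a)]

    # Forward path: zero every sub-diagonal entry with a Givens rotation.
    # Roughly n^2/2 rotations, each touching O(n) entries: O(n^3) overall.
    for col in range(n):
        for row in range(col + 1, n):
            p, q = m[col][col], m[row][col]
            r = math.hypot(abs(p), abs(q))
            if r == 0.0:
                continue  # both entries already zero, nothing to rotate
            c, s = p / r, q / r
            for k in range(col, n + 1):
                u, v = m[col][k], m[row][k]
                # Unitary rotation [[conj(c), conj(s)], [-s, c]].
                m[col][k] = c.conjugate() * u + s.conjugate() * v
                m[row][k] = -s * u + c * v

    # Backward path: back substitution on R x = Q^H b, O(n^2) operations.
    # A singular input leaves a zero on the diagonal and raises
    # ZeroDivisionError here.
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        acc = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / m[i][i]
    return x
```

For example, `givens_qr_solve([[2, 1], [1, 3]], [3, 5])` returns values close to `0.8` and `1.4`, up to floating-point rounding. The two loops mirror the complexity figures quoted in the abstract: the rotation phase is cubic in n, and the substitution phase is quadratic.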