Image Processing using Approximate Data-path Units
abstract: In this work, we present approximate adders and multipliers to reduce data-path complexity of specialized hardware for various image processing systems. These approximate circuits have a lower area, latency and power consumption compared to their accurate counterparts and produce fairly ac...
Other Authors: | |
---|---|
Format: | Dissertation |
Language: | English |
Published: | 2013 |
Subjects: | |
Online Access: | http://hdl.handle.net/2286/R.I.20990 |
id |
ndltd-asu.edu-item-20990 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-asu.edu-item-20990 2018-06-22T03:04:42Z Image Processing using Approximate Data-path Units Dissertation/Thesis Vasudevan, Madhu (Author) Chakrabarti, Chaitali (Advisor) Frakes, David (Committee member) Gupta, Sandeep (Committee member) Arizona State University (Publisher) Electrical engineering approximate image processing low power eng 66 pages M.S. Computer Science 2013 Masters Thesis http://hdl.handle.net/2286/R.I.20990 http://rightsstatements.org/vocab/InC/1.0/ All Rights Reserved 2013 |
collection |
NDLTD |
language |
English |
format |
Dissertation |
sources |
NDLTD |
topic |
Electrical engineering approximate image processing low power |
description |
abstract: In this work, we present approximate adders and multipliers that reduce the data-path complexity of specialized hardware for image processing systems. These approximate circuits have lower area, latency, and power consumption than their accurate counterparts while still producing fairly accurate results. We build upon the work on approximate adders and multipliers presented in [23] and [24]. First, we show how the choice of algorithm and parallel adder design can be used to implement the 2D Discrete Cosine Transform (DCT) with good performance and low area. Our implementation of the 2D DCT achieves PSNR comparable to that of the algorithm presented in [23] with a ~35-50% reduction in area. Next, we use the approximate 2x2 multiplier presented in [24] to build parallel approximate multipliers. We demonstrate that if some of the 2x2 multipliers in the parallel multiplier are accurate, the accuracy of the multiplier improves significantly, especially when two large numbers are multiplied. We use Gaussian FIR filter and Fast Fourier Transform (FFT) algorithms to illustrate the efficacy of the proposed approximate multiplier. Using the proposed approximate multiplier improves the PSNR of a 32x32 FFT implementation by 4.7 dB compared to an implementation using the approximate multiplier described in [24]. We also implement a state-of-the-art image enlargement algorithm, Segment Adaptive Gradient Angle (SAGA) [29], in hardware. The algorithm is mapped to pipelined hardware blocks, and the design is synthesized in 90 nm technology. A 64x64 image can be processed in 496.48 µs at a clock frequency of 100 MHz. The average PSNR of our implementation, evaluated against the original image, is 31.33 dB with accurate parallel adders and multipliers and 30.86 dB with approximate ones. The PSNR of both designs is comparable to that of the double-precision floating-point MATLAB implementation of the algorithm. === Dissertation/Thesis === M.S. Computer Science 2013 |
author2 |
Vasudevan, Madhu (Author) |
title |
Image Processing using Approximate Data-path Units |
publishDate |
2013 |
url |
http://hdl.handle.net/2286/R.I.20990 |
_version_ |
1718700295127564288 |
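
The abstract's point about mixing accurate and approximate 2x2 blocks inside a larger multiplier can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis design: the record does not say how the 2x2 multiplier of [24] behaves, so the block below assumes a commonly described underdesigned cell that is exact except that 3 x 3 yields 7 instead of 9. The rule for choosing which blocks are accurate (those feeding the most significant partial products) and the names `approx_mul2x2`, `mul_from_2x2`, and `accurate_from` are likewise hypothetical.

```python
def approx_mul2x2(a: int, b: int) -> int:
    """Assumed approximate 2x2 cell: exact except 3 * 3 -> 7 (fits in 3 bits)."""
    return 7 if (a == 3 and b == 3) else a * b


def mul_from_2x2(a: int, b: int, width: int = 8, accurate_from=None) -> int:
    """Multiply two unsigned `width`-bit numbers using 2x2 partial products.

    Each operand is split into 2-bit digits. The partial product of digit i
    of `a` and digit j of `b` has weight 4 ** (i + j). Partial products with
    i + j >= accurate_from use an exact 2x2 multiply (hypothetical selection
    rule); all others use the approximate cell. accurate_from=None means
    every block is approximate.
    """
    digits = width // 2
    total = 0
    for i in range(digits):
        a_i = (a >> (2 * i)) & 0b11
        for j in range(digits):
            b_j = (b >> (2 * j)) & 0b11
            exact = accurate_from is not None and (i + j) >= accurate_from
            partial = a_i * b_j if exact else approx_mul2x2(a_i, b_j)
            total += partial << (2 * (i + j))
    return total


if __name__ == "__main__":
    a = b = 255  # worst case: every 2-bit digit is 3
    print("exact product       :", a * b)
    print("all approximate     :", mul_from_2x2(a, b))
    print("upper blocks exact  :", mul_from_2x2(a, b, accurate_from=4))
```

In this model, 255 x 255 evaluates to 50575 with every block approximate and to 64399 when the blocks with i + j >= 4 are exact, against a true product of 65025. The error shrinks most for large operands because their high-order digits are the ones most likely to hit the 3 x 3 case, and those partial products carry the largest weights.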