A Memory-Efficient and High-Performance Architecture for the 5/3 and 9/7 Discrete Wavelet Transform in JPEG2000 Application

Bibliographic Details
Main Authors: Yui-Chih Shih, 施友植
Other Authors: 賴永康
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/38491192942729335475
Description
Summary: Master's thesis === National Chung Hsing University (國立中興大學) === Department of Electrical Engineering (電機工程學系所) === 94 === A memory-efficient, high-performance architecture that performs the two-dimensional forward and inverse discrete wavelet transform (DWT) for the JPEG-2000 filter set is proposed, based on line memory and a modified lifting scheme. The architecture consists of three main components: the column processor, the row processor, and the transposing buffers. The column processor contains two multipliers, four adders, nine registers, and a 2N-length tile line memory. The transposing buffers use four registers in place of memory. The row processor contains two multipliers, four adders, and eleven registers. With the same arithmetic resources as the original algorithm requires, the number of pipeline registers is reduced from 32 to 20. The proposed architecture reduces the critical path to only one multiplier delay. The precision of the multipliers and adders was determined through extensive simulation. Based on the simulation results, a fixed-point representation with 11 integer bits and 5 fraction bits is a reasonable choice that avoids overflow. Because registers take the place of memory, the memory requirement is reduced. The whole architecture, pipelined around the modified lifting scheme, runs faster and achieves higher hardware utilization. Two samples per clock cycle can be encoded at 100 MHz. The architecture can be used as a compact, independent IP core for JPEG-2000 and various real-time image/video applications.
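
To make the lifting scheme and the fixed-point format in the summary concrete, the following is a minimal software sketch, not the thesis's hardware design. It shows one level of the standard 1-D reversible 5/3 lifting transform used in JPEG-2000 (predict step, then update step, with symmetric boundary extension) and a helper that quantizes a value to 5 fraction bits. The function names, the even-length input restriction, and the reading of "11 integer bits and 5 fraction bits" as a signed 16-bit word with saturation are all assumptions made for illustration; the record does not specify them.

```python
FRAC_BITS = 5    # fraction bits, from the summary
WORD_BITS = 16   # assumption: 11 integer bits (sign included) + 5 fraction bits


def to_fixed(value):
    """Quantize to a signed fixed-point integer, saturating on overflow (assumed)."""
    raw = int(round(value * (1 << FRAC_BITS)))
    lo = -(1 << (WORD_BITS - 1))
    hi = (1 << (WORD_BITS - 1)) - 1
    return max(lo, min(hi, raw))


def forward_53(x):
    """One level of the 1-D reversible 5/3 lifting transform (even-length input)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    d = []
    for k in range(half):
        # Symmetric extension at the right edge: x[n] mirrors to x[n - 2].
        right = x[2 * k + 2] if 2 * k + 2 < n else x[n - 2]
        # Predict step: detail (high-pass) coefficient.
        d.append(x[2 * k + 1] - ((x[2 * k] + right) >> 1))
    s = []
    for k in range(half):
        # Symmetric extension at the left edge gives d[-1] == d[0].
        left = d[k - 1] if k > 0 else d[0]
        # Update step: smooth (low-pass) coefficient.
        s.append(x[2 * k] + ((left + d[k] + 2) >> 2))
    return s, d


def inverse_53(s, d):
    """Undo forward_53 by running the lifting steps in reverse order."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for k in range(half):
        left = d[k - 1] if k > 0 else d[0]
        x[2 * k] = s[k] - ((left + d[k] + 2) >> 2)
    for k in range(half):
        right = x[2 * k + 2] if 2 * k + 2 < n else x[n - 2]
        x[2 * k + 1] = d[k] + ((x[2 * k] + right) >> 1)
    return x


samples = [10, 12, 14, 13, 9, 7, 8, 20]
low, high = forward_53(samples)
restored = inverse_53(low, high)
```

Because each lifting step is undone exactly by subtracting what was added, `restored` equals `samples` regardless of rounding. A 2-D transform applies the same 1-D steps along columns and then rows, which is the role the summary assigns to the column processor, transposing buffers, and row processor.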