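Summary: | Doctoral === National Taiwan University === Department of Electrical Engineering === 85 === This dissertation presents
three efficient motion-estimation architectures for video compression
systems. First, a data-interlacing architecture with two-dimensional
data reuse is proposed for the full-search algorithm. Built on a
one-dimensional processing element array and shift-register arrays, it
reduces the number of external data accesses and the pin count while
achieving high throughput. In addition, identical chips can be cascaded
for different block sizes and search ranges while still fully exploiting
data reuse. Next, a new 9-cell array architecture with data rings is
proposed for the three-step hierarchical search algorithm. Thanks to the
efficient data rings and memory organization, a regular raster-scan data
flow and a comparator-tree structure can be used to simplify the internal
input/output control structure and to reduce latency. Techniques that
reduce external memory accesses and the interconnections between memory
modules and processing elements are also applied. Furthermore, we extend
the data-ring concept to the full-search algorithm and propose a scalable
architecture, in which the number of processing elements can be chosen
according to the performance required by different video applications and
to algorithm parameters such as block size, search range, and frame size,
in order to reduce cost. Our results show that these motion-estimation
architectures offer low latency, low memory bandwidth, low cost, and high
performance for video applications.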
In this dissertation, three efficient architectures are
presented for motion estimation in video compression systems.
First, a data-interlacing architecture with two-dimensional
(2-D) data reuse is proposed for the full-search block-matching
algorithm. Based on a one-dimensional processing element (PE)
array and two data-interlacing shift-register arrays, the
proposed architecture can efficiently reuse data to
decrease external memory accesses and reduce the pin count. It
also achieves 100% hardware utilization and a high throughput
rate. In addition, the same chips can be cascaded for different
block sizes, search ranges, and pixel rates. Second, we propose
an efficient 9-cell array architecture with data rings for the
3-step hierarchical search (3SHS) block-matching algorithm. With the
efficient data rings and memory organization, the regular
raster-scanned data flow and the comparator-tree structure can be used
to simplify the control scheme and reduce latency, respectively. In
addition, we utilize a three-half-search-area scheme to reduce
external memory accesses and interconnections. The architecture also
provides a high normalized throughput solution for the 3SHS. Finally, a
high-throughput scalable architecture for the full-search
block-matching algorithm (FSBMA) is proposed. The number of processing
elements (PEs) is scalable according to the variable algorithm
parameters and the performance required for different video
compression applications. The efficient PE rings and
the intelligent memory-interleaving organization increase the
efficiency of the architecture. Techniques for reducing
interconnections and external memory accesses are also
presented. Our results demonstrate that these architectures are
flexible, high-performance solutions for video applications.
|
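For readers unfamiliar with the two block-matching algorithms named in the abstract, the following is a minimal software sketch of what each one computes. It is not the dissertation's hardware architecture and does not model PE arrays, data rings, or memory interleaving. The function names, the sum-of-absolute-differences (SAD) cost, the 16x16 block size, the +/-7 search range, and the step sizes 4, 2, 1 are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def candidate_cost(cur, ref, y, x, dy, dx, n):
    """SAD for displacement (dy, dx); None if the candidate leaves the frame."""
    ry, rx = y + dy, x + dx
    if ry < 0 or rx < 0 or ry + n > ref.shape[0] or rx + n > ref.shape[1]:
        return None
    return sad(cur[y:y + n, x:x + n], ref[ry:ry + n, rx:rx + n])

def full_search(cur, ref, y, x, n=16, p=7):
    """Full search: evaluate every displacement in [-p, p] x [-p, p].

    Returns (cost, dy, dx) for the lowest-cost candidate.
    """
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            c = candidate_cost(cur, ref, y, x, dy, dx, n)
            if c is not None and (best is None or c < best[0]):
                best = (c, dy, dx)
    return best

def three_step_search(cur, ref, y, x, n=16, steps=(4, 2, 1)):
    """Three-step search: test 9 points per step, then recentre and halve the step.

    Assumes the block at (y, x) lies inside the frame. Checks far fewer
    candidates than full search, so it may stop at a local minimum.
    """
    cy, cx = 0, 0
    best_cost = candidate_cost(cur, ref, y, x, 0, 0, n)
    for s in steps:
        best = (best_cost, cy, cx)
        for dy in (-s, 0, s):
            for dx in (-s, 0, s):
                c = candidate_cost(cur, ref, y, x, cy + dy, cx + dx, n)
                if c is not None and c < best[0]:
                    best = (c, cy + dy, cx + dx)
        best_cost, cy, cx = best
    return best_cost, cy, cx
```

Full search evaluates (2p+1)^2 candidates per block (225 for p = 7), whereas the three-step pattern evaluates at most 25 distinct candidates, which is the trade-off between regular data reuse and reduced computation that motivates separate architectures for the two algorithms.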