Study of Architecture Design for Motion Estimation Algorithms

Bibliographic Details
Main Authors:	Lai, Yeong-Kang, 賴永康
Other Authors:	Liang-Gee Chen
Format:	Others
Language:	zh-TW
Published:	1997
Online Access:	http://ndltd.ncl.edu.tw/handle/72697216647614769854

Description
Summary:	Doctoral dissertation === National Taiwan University === Department of Electrical Engineering === 85 === In this dissertation, three efficient architectures are presented for motion estimation in video compression systems. First, a data-interlacing architecture with two-dimensional (2-D) data reuse is proposed for the full-search block-matching algorithm. Based on a one-dimensional processing element (PE) array and two data-interlacing shift-register arrays, the proposed architecture reuses data efficiently to decrease external memory accesses and reduce the pin count. It also achieves 100% hardware utilization and a high throughput rate. In addition, identical chips can be cascaded to support different block sizes, search ranges, and pixel rates while still fully exploiting data reuse. Second, we propose an efficient 9-cell array architecture with data rings for the 3-step hierarchical search (3SHS) block-matching algorithm. With the efficient data rings and memory organization, the regular raster-scanned data flow and the comparator-tree structure can be used to simplify the control scheme and reduce latency, respectively. In addition, we utilize a three-half-search-area scheme to reduce external memory accesses and interconnections. The architecture also provides a high normalized throughput for the 3SHS. Finally, a high-throughput scalable architecture for the full-search block-matching algorithm (FSBMA) is proposed, extending the data-ring concept to full search. The number of PEs scales with the algorithm parameters (such as block size, search range, and frame size) and with the performance required by different video compression applications, which reduces cost. The efficient PE rings and the intelligent memory-interleaving organization increase the efficiency of the architecture. Techniques for reducing interconnections and external memory accesses are also presented. Our results demonstrate that these architectures are flexible, high-performance solutions for video applications, with low latency, low memory bandwidth, and low cost.
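For readers unfamiliar with the full-search block-matching algorithm that the first and third architectures accelerate, the following is a minimal software sketch of the algorithm itself, not of the dissertation's hardware. The function names (`sad`, `full_search`), the use of the sum of absolute differences as the matching cost, and the small synthetic frames are all assumptions made for illustration; the record does not state which cost function the dissertation uses.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Test every displacement (dx, dy) in [-p, p] x [-p, p] that keeps
    the candidate block inside `ref`; return (dx, dy, cost) of the first
    candidate with the lowest cost."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Hypothetical 16 x 16 test frames: every reference pixel is unique, and the
# current frame is the reference shifted by 2 pixels in x and 1 pixel in y.
SIZE = 16
ref = [[SIZE * y + x for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[min(y + 1, SIZE - 1)][min(x + 2, SIZE - 1)] for x in range(SIZE)]
       for y in range(SIZE)]

# 4 x 4 block at (4, 4), search range +/-2 pixels.
print(full_search(cur, ref, 4, 4, 4, 2))  # (2, 1, 0)
```

With search range p, each block compares up to (2p + 1)^2 candidates, and neighbouring candidates overlap in most of their pixels. That overlap is the data reuse the abstract refers to when it describes reducing external memory accesses.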
