Multiview Encoder Parallelized Fast Search Realization on NVIDIA CUDA


Bibliographic Details
Main Authors: Chih-Te Lu, 盧志德
Other Authors: 楊士萱
Format: Others
Language: en_US
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/vdd5k4
Description
Summary: Master's thesis === National Taipei University of Technology === Graduate Institute of Computer Science and Information Engineering === 98 === Owing to the rapid growth in graphics processing unit (GPU) processing capability, GPUs are increasingly used for non-graphics computation. NVIDIA announced a powerful GPU architecture called Compute Unified Device Architecture (CUDA) in 2007, which provides massive data parallelism under the SIMD architecture constraint. We use an NVIDIA GTX-280 GPU system, which has 240 computing cores, as the platform to implement a very complicated video coding scheme. The Multiview Video Coding (MVC) scheme, an extension of H.264/MPEG-4 Part 10 (AVC), is being developed by the joint standardization team formed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC JTC 1 Moving Picture Experts Group (MPEG). It is an efficient video compression scheme; however, its computational complexity is very high. Two of its most time-consuming components are motion estimation (ME) and disparity estimation (DE). In this thesis, we propose a fast search algorithm called multithreaded one-dimensional search (MODS), which can be used for both the ME and the DE operations. We implement the integer-pel ME and DE processes with MODS on the GTX-280 platform. The implementation runs up to 89 times faster than the CPU-only configuration. Even when the fast search algorithm of the original JMVC is turned on, the MODS version on CUDA is still up to 21 times faster.
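
For readers unfamiliar with integer-pel ME and DE, the following is a minimal sketch of the exhaustive block-matching search that such estimators perform, using the sum of absolute differences (SAD) as the matching cost. It is not the MODS algorithm, whose details the summary does not give, and it is plain Python rather than CUDA. The function names `sad` and `full_search`, the frame layout (lists of rows), and the synthetic frames are all assumptions made for illustration. Each candidate displacement is evaluated independently of the others, which is the property that makes this kind of search a natural fit for a data-parallel GPU.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n-by-n block of `cur` at (bx, by)
    and the block of `ref` displaced from it by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return (best_sad, (dx, dy)) over all integer displacements in the window.
    Candidates whose block would fall outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Synthetic 8x8 frames (hypothetical data): `cur` equals `ref` shifted by (2, 1).
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(8)] for y in range(8)]
result = full_search(cur, ref, bx=2, by=2, n=2, search_range=2)
```

On these synthetic frames the search recovers the displacement (2, 1) with a SAD of 0. For ME the reference would be a frame from another time instant of the same view; for DE it would be a frame from a neighbouring view at the same instant. The search itself is identical, which is consistent with the summary's statement that one algorithm serves both operations.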