Content-based retrieval of arbitrarily shaped video objects in the uncompressed and compressed domains

Advancements in video object segmentation technology and the availability of efficient object-based video representations, such as MPEG-4 [1], have resulted in the increased availability of arbitrarily shaped digital video content. While this enables many exciting applications, the process of loc...

Full description

Bibliographic Details
Main Author: Erol, Berna
Format: Others
Language:English
Published: 2009
Online Access:http://hdl.handle.net/2429/12773
Description
Summary:Advancements in video object segmentation technology and the availability of efficient object-based video representations, such as MPEG-4 [1], have resulted in the increased availability of arbitrarily shaped digital video content. While this enables many exciting applications, the process of locating and accessing a desired video sequence can still be challenging because of the large volume of data associated with even compressed video. This dissertation proposes generic methods for the retrieval of arbitrarily shaped video objects in the MPEG-4 compressed domain, using their shape, local motion, and color content. Considering that a one-minute long video sequence may contain more than 1,500 frames, summarization of video content is necessary as a first step to efficiently retrieve video. Therefore, we first suggest a method for the summarization of arbitrarily shaped video objects. This is achieved by selecting the temporal instants of video objects -based on their compressed domain shape information- that efficiently represent the objects' salient content. Next, we propose to extend some well-proven still shape retrieval techniques to retrieve video objects in the compressed domain. We compute the Fourier and ART (Angular Radial Transform) descriptors on the shape approximations obtained from the MPEG-4 shape coding modes. We also present a method to compute the shape distances between two video objects based on these still shape features. Unlike in the case of still objects, one of the key features that describe a video object is motion. Classification of video objects by their local motion is addressed in this thesis by presenting three new motion descriptors. These descriptors are computed based on the shape deformations of arbitrarily shaped video, and assume no prior knowledge about the video content. Color is one of the most widely used low level features in content-based retrieval. In this thesis, we also study efficient color content matching of arbitrarily shaped video, and in particular, color histogram computation in the MPEG-4 compressed domain. Our experimental results demonstrate that our techniques enable effective and low complexity content-based retrieval. Employing MPEG-4 compressed domain information not only obviates the need for full decompression of the bit stream, hence yielding substantial computational savings, but also allows our techniques to be more robust to segmentation errors. === Applied Science, Faculty of === Electrical and Computer Engineering, Department of === Graduate