Key-Frame Extraction for Video Summarization and Shot-Based Video Retrieval



Bibliographic Details
Main Authors: Ho, Yu-Hsuan, 何宥萱
Other Authors: Lin, Chia-Wen
Format: Others
Language: zh-TW
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/73943136238044640470
Description
Summary: Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 92 === In this paper, we present an adaptive rate-constrained key-frame selection scheme for channel-aware real-time video streaming and shot-based video retrieval. First, the streaming server dynamically determines the target number of key-frames by estimating the channel conditions from the feedback information. Under the constraint of this target number, a two-step sequential key-frame selection scheme is adopted: it first finds the optimal key-frame allocation among the video shots in a clip, and then selects the most representative key-frames in each shot according to that allocation to guide the temporal-downscaling transcoding. After extracting the key-frames, we propose a multi-pass video retrieval method using spatio-temporal statistics. In the first pass, the probability distributions of object motion are extracted for each shot of the query video clip and compared with those of the shots in the database using the Bhattacharyya distance. In the second pass, two consecutive shots are matched jointly to introduce a "causality" effect. Finally, in the refinement pass, we extract one key-frame from each shot using our key-frame selection method and compute the color histogram of each key-frame. We then use the Bhattacharyya distance to compare the similarity of the two key-frames' color histograms and accumulate the second-stage distances to obtain the similarity of two video shots. For both the two-step key-frame selection and the multi-pass video retrieval, our experimental results show that the proposed methods are efficient and satisfactory.
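The histogram-matching step in the retrieval passes relies on the Bhattacharyya distance between two normalized distributions. A minimal sketch of that measure, assuming simple list-based histograms (the function name, bin count, and sample values here are illustrative, not taken from the thesis):

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two normalized histograms.

    p, q: sequences of non-negative bin values that each sum to 1.
    Returns 0.0 for identical distributions; larger values indicate
    less overlap (infinity when the histograms share no mass).
    """
    # Bhattacharyya coefficient: measures the overlap of the two
    # distributions, 1.0 for identical, 0.0 for disjoint support.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(bc) if bc > 0 else math.inf

# Illustrative 4-bin histograms (e.g. coarse color or motion bins).
h1 = [0.25, 0.25, 0.25, 0.25]
h2 = [0.70, 0.10, 0.10, 0.10]

d_same = bhattacharyya_distance(h1, h1)  # near zero: identical histograms
d_diff = bhattacharyya_distance(h1, h2)  # positive: distributions differ
```

In the retrieval scheme described above, such per-shot distances would be accumulated across passes to score the similarity of two video shots.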