Summary: | Master's === National Taiwan University === Graduate Institute of Communication Engineering === 93 === With the rapid growth of multimedia and communication technology, we have entered a new era in which we are surrounded by vast amounts of multimedia data. This volume exceeds our capacity to absorb it, so we need mechanisms that find the important content and discard the less important content. In this thesis, we study the analysis and summarization
of videos. This thesis is composed of five chapters. In Chapter 1, we discuss the emerging issues in the digital media world and introduce our goals. In Chapter 2, we review our previous work on tennis and baseball scene detection and classification, and extend it by considering additional information sources. We use audio and text to meet higher-level demands: capturing audience cheering on the tennis court and the game status of a baseball game. These capabilities enable us to define rules that evaluate the importance of each moment and to generate meaningful summaries. In Chapter 3, we present a scheme to detect slow-motion replay segments in the compressed domain. Slow-motion replay exhibits particular motion-vector patterns and large variation in the frame difference. We use the statistics of motion vectors and the relations among DCT blocks to rapidly determine where slow-motion replay segments occur. As slow-motion replays appear in almost all kinds of sports, they are highly helpful in summarization. In Chapter 4, we explore the possibilities of video analysis outside the sports domain. We present a framework to detect the major faces in a video, including tracking, chopping, and clustering, and use the major faces to analyze the video in depth. This framework enables character-customized summarization. We also propose a method to filter out undesired commercial segments based on visually perceptual differences. In Chapter 5, we conclude this thesis and discuss possible future directions.
|