Dense Correspondence Annotation of Video Data Using Non-Rigid Registration with Salient Feature Correspondence Constraints


Bibliographic Details
Main Authors: Yen-Ting Chen, 陳彥廷
Other Authors: Chieh-Chih Wang
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/68845600289225045103
Summary: Master's thesis === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 103 === A few existing annotation systems aim to provide a platform for video annotation. Most focus on activity annotation, while others concentrate on labeling individual objects. The latter, however, label objects only with bounding boxes or rely solely on interpolation techniques to assist the user, and only one of them attempts to find dense correspondences inside the object contour. The problem of annotating dense correspondences across video frames is thus not yet well addressed. Motivated by this, this work proposes a video annotation system that focuses on annotating dense correspondences inside the object contour.

Since labeling detailed object contours and dense correspondences across a whole video is a daunting task, user effort is minimized by an interactive segmentation and tracking algorithm that uses information from optical flow and edges, making it easier for the user to observe salient feature correspondences between two video frames. Edges help the user identify the detailed contour or local patterns of the object. The user checks and, where necessary, modifies the salient feature correspondences obtained by the algorithm. A non-rigid registration algorithm then extracts dense correspondences in textureless regions from the user-verified salient feature correspondences. The user only needs to label the first frame of the video and correct minor errors in subsequent frames to annotate the whole video. The results show that the proposed framework is well suited to labeling non-rigid objects.
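The abstract does not specify how the salient feature correspondences are proposed, so the following is only a minimal sketch of the general idea, assuming OpenCV: Shi-Tomasi corners in one frame tracked into the next with pyramidal Lucas-Kanade optical flow, with Canny edge maps computed to support the user's visual verification. The function name and all parameter values are illustrative assumptions, not the thesis's actual method.

```python
# Sketch: propose salient feature correspondences between two frames
# using optical flow, plus edge maps for user verification (OpenCV).
import cv2
import numpy as np

def propose_correspondences(frame_a, frame_b, max_features=200):
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)

    # Salient features: Shi-Tomasi corners detected in the first frame.
    pts_a = cv2.goodFeaturesToTrack(gray_a, maxCorners=max_features,
                                    qualityLevel=0.01, minDistance=7)

    # Track them into the second frame with pyramidal Lucas-Kanade flow.
    pts_b, status, _ = cv2.calcOpticalFlowPyrLK(gray_a, gray_b, pts_a, None)

    # Keep only successfully tracked points as (x_a, y_a, x_b, y_b) rows.
    ok = status.ravel() == 1
    matches = np.hstack([pts_a.reshape(-1, 2)[ok], pts_b.reshape(-1, 2)[ok]])

    # Edge maps help the user inspect detailed contours and local patterns
    # while checking and modifying the proposed correspondences.
    edges_a = cv2.Canny(gray_a, 100, 200)
    edges_b = cv2.Canny(gray_b, 100, 200)
    return matches, edges_a, edges_b
```

In this scheme the returned matches are only candidates; per the abstract, the user verifies or corrects them before any dense propagation takes place.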
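Similarly, the abstract does not name the specific non-rigid registration algorithm used to densify the verified matches. A thin-plate-spline warp fitted to the sparse correspondences is one standard way to propagate them into textureless regions; the sketch below assumes SciPy's RBFInterpolator and a hypothetical densify helper.

```python
# Sketch: densify user-verified sparse correspondences into a per-pixel
# map with a thin-plate-spline warp. TPS is an assumed stand-in for the
# thesis's (unspecified) non-rigid registration algorithm.
import numpy as np
from scipy.interpolate import RBFInterpolator

def densify(matches, height, width):
    """matches: (N, 4) array of verified (x_a, y_a, x_b, y_b) pairs."""
    src = matches[:, :2]                 # points in frame A
    dst = matches[:, 2:]                 # their verified positions in frame B
    warp = RBFInterpolator(src, dst, kernel='thin_plate_spline')

    # Evaluate the warp at every pixel of frame A to obtain a dense
    # correspondence map, including pixels in textureless regions.
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    dense = warp(grid).reshape(height, width, 2)   # (x, y) in frame B
    return dense
```

Because the spline interpolates exactly at the verified matches, correcting a few salient correspondences per frame is enough to update the dense map, which is consistent with the abstract's claim that the user only labels the first frame and fixes minor errors afterwards.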