Summary: | Master's === National Tsing Hua University === Department of Computer Science === 105 === Computer vision is important for autonomous cars, which must detect and track nearby objects such as people, vehicles, and animals. However, visual tracking faces many problems, including illumination variation, deformation, and occlusion. To deal with these problems, an appearance model is used to describe the target, and a discriminative model is adopted to classify each candidate as either the target or the background.
In this thesis, we propose a tracking method based on Edge-boxes, which generates a small but high-quality set of proposals from edges. A pre-trained Convolutional Neural Network (CNN) extracts features from each image patch, and cost functions for appearance, motion, and size are then computed. Given these costs, an online Support Vector Machine (SVM) serves as the classifier, instead of a simple sum of the costs. Finally, we maintain the tracker by updating the templates, the predicted state, and the SVM.
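The cost-then-classify step described above can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the cost definitions, so the cosine-distance appearance cost, the center-distance motion cost, the log-area-ratio size cost, the `(x, y, w, h)` box format, and the hinge-loss update in `OnlineLinearSVM` are all hypothetical choices made for this example.

```python
import math

# Boxes are (x, y, w, h); feature vectors are plain lists of floats.
# All cost definitions below are assumptions made for illustration.

def appearance_cost(feat, template):
    """Cosine distance between a candidate feature and a template feature."""
    dot = sum(a * b for a, b in zip(feat, template))
    norm = math.sqrt(sum(a * a for a in feat)) * math.sqrt(sum(b * b for b in template))
    return 1.0 - dot / norm if norm > 0 else 1.0

def motion_cost(box, predicted):
    """Center distance, normalized by the diagonal of the predicted box."""
    cx, cy = box[0] + box[2] / 2, box[1] + box[3] / 2
    px, py = predicted[0] + predicted[2] / 2, predicted[1] + predicted[3] / 2
    return math.hypot(cx - px, cy - py) / math.hypot(predicted[2], predicted[3])

def size_cost(box, predicted):
    """Absolute log ratio of the box areas; 0 when the sizes match."""
    return abs(math.log((box[2] * box[3]) / (predicted[2] * predicted[3])))

class OnlineLinearSVM:
    """Linear SVM updated one sample at a time by hinge-loss subgradient steps."""

    def __init__(self, dim, lr=0.1, reg=0.01):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr
        self.reg = reg

    def score(self, x):
        return sum(w * v for w, v in zip(self.w, x)) + self.b

    def update(self, x, y):
        """y is +1 for the target and -1 for the background."""
        margin = y * self.score(x)
        # L2 regularization shrinks the weights on every step.
        self.w = [w * (1.0 - self.lr * self.reg) for w in self.w]
        if margin < 1.0:
            self.w = [w + self.lr * y * v for w, v in zip(self.w, x)]
            self.b += self.lr * y

# Usage: train on one low-cost (target) and one high-cost (background) sample.
svm = OnlineLinearSVM(dim=3)
target_costs = [0.1, 0.1, 0.0]
background_costs = [0.9, 1.5, 0.8]
for _ in range(20):
    svm.update(target_costs, +1)
    svm.update(background_costs, -1)
```

A learned classifier of this kind weights the three costs by how well each one separates target from background, whereas a plain sum treats them as equally informative.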
The experimental results demonstrate that the proposed method performs well on videos that present many challenges. Because the pre-trained CNN extracts general features of targets, illumination variation and deformation are handled well, and the success rate reaches up to 96% at an overlap threshold of 0.5. Tracking failures caused by occlusion can also be avoided, thanks to the high-quality proposals generated by Edge-boxes and the templates stored from previous frames, and the Multiple Object Tracking Accuracy (MOTA) reaches up to 95.238%.
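For reference, the two reported metrics can be computed roughly as follows. This is a minimal sketch, assuming boxes in `(x, y, w, h)` form, a frame counted as a success when its overlap (intersection over union) is at least the threshold, and the standard CLEAR MOT definition of MOTA; the function names and the sample counts are hypothetical and are not taken from the thesis.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_rate(predicted, truth, threshold=0.5):
    """Fraction of frames whose predicted box overlaps the ground truth enough."""
    hits = sum(1 for p, g in zip(predicted, truth) if iou(p, g) >= threshold)
    return hits / len(truth)

def mota(misses, false_positives, id_switches, num_ground_truth):
    """MOTA = 1 - (misses + false positives + identity switches) / ground-truth objects,
    with every count summed over all frames."""
    return 1.0 - (misses + false_positives + id_switches) / num_ground_truth

# Usage with made-up boxes and counts (illustrative only).
truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
predicted = [(0, 0, 10, 10), (5, 0, 10, 10)]  # second box is shifted by half its width
rate = success_rate(predicted, truth)
accuracy = mota(misses=2, false_positives=1, id_switches=0, num_ground_truth=60)
```

Note that the success rate is a per-frame, single-target measure, while MOTA aggregates error counts over all frames, so the two figures are not directly comparable.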
|