Summary: | 碩士 === 元智大學 === 資訊工程學系 === 99 === In terms of streaming media and internet are used more and more frequently, the era of e-Learning emerges. In the e-learning system, learners can access lecture videos no matter when and where. Thus, it is imperative to provide an effective method to retrieve lecture videos conveniently and friendly. In the thesis, text extraction for lecture videos with complicated background is proposed to facilitate lecture video retrieval using textual keywords.
Since background of lecture videos may be rather complicated and fancy, in particular may have textual characteristics, foreground segmentation method is designed to extract texts region. On the other hand, since the resolution of lecture videos is generally low, how to enhance the quality of texts to facilitate the consequent text recognition is the other issue in this thesis.
First, temporal analysis of lecture videos is performed to detect slide transitions. The frames corresponding to those frames between slide transitions are then merged into a key frame to represent the slide. The consequent process can then be applied to the key frame only so as to reduce computing time. Second, local features are extracted from block partition of slide-like key frame, based on which background model are generated followed by foreground extraction. Finally, for each text region extracted from foregrounds, quality improvement and adaptive binarization are employed to facilitate consequent optical character recognition. The recognition accuracy rate is used to evaluate the performance of the proposed method and to compare with existing methods. Various experiments prove that the effectiveness and feasibility of our method.
|