Summary: | Doctoral === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 104 === Scene text detectors usually consist of four units: preprocessing, feature collection, candidate text block generation, and text block detection. In this dissertation, we propose a robust text detector, with a new design for each of the four units, for detecting text strings in scene images. First, in preprocessing, resizing and low-complexity processing are applied so that a general range of text sizes can be predefined easily and background interference is coarsely reduced, which markedly improves performance. Second, we propose three novel methods, saliency feature extraction (SFE), a simple text edge detector (STED), and a simple pruning approach (SPA), to extract text candidates (edges and points) while suppressing background interference and enhancing text contours. We also propose three important principles of text feature extraction to support further scene-text analysis. Third, candidate text block generation, which comprises initial clustering, congregation analysis, and a similarity-based congregation method, aims to obtain all possible text strings. Initial clusters, which correspond to separate characters, are first formed from contiguous text candidates and are then merged into strings within rectangular blocks according to congregation conditions. These conditions are intervals that the congregation analysis defines in the training stage from the density distributions of color and minimum distance. Fourth, in text block detection, we extract a feature vector of eight elements to describe each block and define a confidence interval for each element as a text feature. A novel method of discrete fuzzy linguistic intervals is proposed to define these intervals in the training stage. Finally, the blocks that contain real text strings are identified by a fuzzy weighted mean operation on their extracted feature vectors.
The experimental results demonstrate that our proposed detector achieves high performance in precisely detecting scene text strings.
|
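The final text block detection step described in the summary (an eight-element feature vector, one trained confidence interval per element, and a fuzzy weighted mean to decide whether a block is text) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the dissertation: the names `FeatureInterval`, `fuzzy_weighted_mean`, and `is_text_block`, the crisp 0/1 membership that stands in for the discrete fuzzy linguistic intervals, the per-element weights, and the decision threshold.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FeatureInterval:
    """Hypothetical confidence interval for one feature element.

    low, high: interval bounds, assumed to come from a training stage.
    weight: assumed importance of this element in the final score.
    """
    low: float
    high: float
    weight: float = 1.0

    def membership(self, value: float) -> float:
        # Simplification: crisp membership (1 inside the interval, 0 outside).
        # The dissertation's discrete fuzzy linguistic intervals are not
        # reproduced here.
        return 1.0 if self.low <= value <= self.high else 0.0


def fuzzy_weighted_mean(features: Sequence[float],
                        intervals: Sequence[FeatureInterval]) -> float:
    """Weighted mean of per-element memberships, in the range [0, 1]."""
    if len(features) != len(intervals):
        raise ValueError("features and intervals must have the same length")
    total_weight = sum(iv.weight for iv in intervals)
    if total_weight <= 0:
        raise ValueError("the sum of weights must be positive")
    weighted = sum(iv.weight * iv.membership(x)
                   for x, iv in zip(features, intervals))
    return weighted / total_weight


def is_text_block(features: Sequence[float],
                  intervals: Sequence[FeatureInterval],
                  threshold: float = 0.5) -> bool:
    """Accept a candidate block when its score reaches the assumed threshold."""
    return fuzzy_weighted_mean(features, intervals) >= threshold
```

With eight equally weighted intervals, a candidate block whose feature vector falls inside six of them scores 6/8 under this sketch, so it is accepted at the assumed default threshold of 0.5 and rejected at a stricter threshold such as 0.8.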