Hybrid Text Segmentation and Recognition for Cloud Notes (雲端筆記之混合式文字切割與辨識)

Bibliographic Details
Main Authors: 王冠智, Wang, Guan Jhih
Language: Chinese
Published: National Chengchi University (國立政治大學)
Subjects:
Online Access: http://thesis.lib.nccu.edu.tw/cgi-bin/cdrfb3/gsweb.cgi?o=dstdcdr&i=sid=%22G0099753003%22.
Description
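Summary: Text recognition is one of the common applications of computer vision, and as its accuracy has steadily improved, many new services have appeared. This thesis improves text segmentation, the main problem for note management software, and proposes two new methods for classifying Chinese characters as machine-printed or handwritten. After filtering out the highlight marks commonly found in note documents, we apply a stroke filter with new kernels to extract the text blocks; the new kernels greatly reduce computation time compared with the original kernels. The thesis also uses the stroke filter response as a feature for distinguishing printed from handwritten text: because the stroke filter returns an energy response that depends on stroke structure, the more regular printed characters can be told apart from handwritten ones. In addition, Sobel edge detection combined with different angle ranges is used for font-style recognition. Experimental results confirm that the proposed text segmentation and font-style classification methods are effective for processing note documents. === Character recognition is an important and practical application of computer vision. As the technology has advanced, more and more services with embedded text recognition have become available. However, segmentation remains the central issue in many situations. In this thesis, we tackle the character segmentation problem in note-taking and note-management applications, and we propose novel methods for discriminating between handwritten and machine-printed Chinese characters. First, we remove noise using heuristics and apply a stroke filter with modified kernels to efficiently compute the bounding box of the text area. The responses of the stroke filter also serve as cues for differentiating machine-printed from handwritten text. These cues are further enhanced by an SVM-based classifier that takes aggregated directional responses of edge detectors as input. Experimental results validate the efficacy of the proposed approaches for text localization and style recognition.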
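The summary mentions Sobel responses grouped by angle range and "aggregated directional responses of edge detectors" as classifier input, but gives no parameters. The following is only a minimal sketch of that general idea, not the thesis's implementation: the function names, the four-bin layout, and the folding of orientations into 0 to 180 degrees are all assumptions made for illustration.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (X) and vertical (Y) intensity changes.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradients(image):
    """Yield (gx, gy) for every interior pixel of a 2-D grayscale image
    given as a list of equal-length rows."""
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            yield gx, gy


def directional_histogram(image, num_bins=4):
    """Sum gradient magnitude per orientation bin, normalized to sum to 1.

    Assumption (not from the thesis): orientations are folded into
    [0, 180) degrees and bins are centered on 0, 45, 90 and 135 degrees
    when num_bins is 4.
    """
    bin_width = 180.0 / num_bins
    histogram = [0.0] * num_bins
    for gx, gy in sobel_gradients(image):
        magnitude = math.hypot(gx, gy)
        if magnitude == 0:
            continue  # flat region: no orientation to record
        angle = math.degrees(math.atan2(gy, gx)) % 180.0
        # Shift by half a bin so each bin is centered on its nominal angle.
        index = int((angle + bin_width / 2) // bin_width) % num_bins
        histogram[index] += magnitude
    total = sum(histogram)
    return [value / total for value in histogram] if total else histogram
```

A feature vector of this kind, computed per text block, could then be passed to an SVM or another classifier. Which angle ranges and classifier settings the thesis actually uses is not stated in this record.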