Detection and Orientation of Italic Text in Chinese OCR System
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: | 1999 |
Online Access: | http://ndltd.ncl.edu.tw/handle/80925495397278572508 |
Summary: | Master's thesis === National Chiao Tung University === Department of Computer Science and Information Engineering === 87 === Chinese OCR (Chinese Optical Character Recognition) is a well-studied subject. Research institutes and manufacturers have been able to provide software with recognition rates of 95% or better.
Still, some issues must be solved to make a Chinese OCR system more satisfactory. Italic text handicaps almost all Chinese OCR software. It does not appear in traditional Chinese books, but it can be seen in most scientific articles, name cards, etc. This thesis studies this problem. We do not try to "recognize" the characters, because that subject has been studied extensively and is already well solved. We try to "find the italic text" and subsequently "reorient" it (i.e., transform it back to non-italic).
We obtain good results. However, when the last character of an italic string touches the next non-italic character (which is not rare), we are usually unable to separate them, and the result is poor.
|
---|
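The summary describes two steps, finding italic text and transforming it back to upright, but does not say how the thesis performs them. The sketch below is not the thesis's method. It illustrates one common way to do both on a binary image (a list of rows of 0/1 pixels): try several candidate slants, undo each with a horizontal shear, and keep the slant whose column projection is most concentrated. The function names `shear_rows`, `profile_score`, and `estimate_slant`, and the candidate slant values, are assumptions made for illustration.

```python
def shear_rows(img, slant):
    """Undo a right-leaning slant by shifting each row horizontally.

    `slant` is the horizontal offset per row, measured upward from the
    bottom row. A negative value leans the image to the right instead.
    """
    h, w = len(img), len(img[0])
    # The bottom row stays in place; higher rows are shifted further left.
    shifts = [round(slant * (h - 1 - y)) for y in range(h)]
    pad = max(abs(s) for s in shifts)  # margin so no pixel falls outside
    out = [[0] * (w + 2 * pad) for _ in range(h)]
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v:
                out[y][x + pad - shifts[y]] = 1
    return out


def profile_score(img):
    """Sum of squared column ink counts; larger when strokes are vertical."""
    return sum(sum(col) ** 2 for col in zip(*img))


def estimate_slant(img, candidates):
    """Return the candidate slant whose correction gives the sharpest profile."""
    return max(candidates, key=lambda s: profile_score(shear_rows(img, s)))


if __name__ == "__main__":
    upright = [[0, 0, 1, 0, 0] for _ in range(8)]  # one vertical stroke
    italic = shear_rows(upright, -0.5)             # synthetic italic version
    slant = estimate_slant(italic, [0.0, 0.25, 0.5, 0.75])
    restored = shear_rows(italic, slant)
    print(slant, profile_score(italic), profile_score(restored))
```

Under this sketch, a text run whose estimated slant is clearly nonzero would be a candidate italic run, and `shear_rows` with that slant is the reorientation. The touching-character case mentioned in the summary is not addressed here: the sketch assumes the italic run has already been isolated from its upright neighbours.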