Detection and Orientation of Italic Text in Chinese OCR System

Master's thesis === National Chiao Tung University (國立交通大學) === Department of Computer Science and Information Engineering (資訊工程系) === 87 === Chinese OCR (Chinese Optical Character Recognition) is a well-studied subject. Research institutes and manufacturers already provide software with recognition rates of 95% or better. Still, there are some issues to solve to make a Chi...


Bibliographic Details
Main Authors: Shih-Jin Huang, 黃士晉
Other Authors: Jenn-Hann Liou
Format: Others
Language: zh-TW
Published: 1999
Online Access: http://ndltd.ncl.edu.tw/handle/80925495397278572508
id ndltd-TW-087NCTU0392084
record_format oai_dc
spelling ndltd-TW-087NCTU0392084 2016-07-11T04:13:35Z
http://ndltd.ncl.edu.tw/handle/80925495397278572508
Detection and Orientation of Italic Text in Chinese OCR System
在中文OCR系統中偵測並且調整斜體文字 (Detecting and Adjusting Italic Text in a Chinese OCR System)
Shih-Jin Huang 黃士晉
Master's thesis, National Chiao Tung University (國立交通大學), Department of Computer Science and Information Engineering (資訊工程系), 87
Jenn-Hann Liou 劉振漢
1999
學位論文 ; thesis
42
zh-TW
collection NDLTD
sources NDLTD
description Master's thesis === National Chiao Tung University (國立交通大學) === Department of Computer Science and Information Engineering (資訊工程系) === 87 === Chinese OCR (Chinese Optical Character Recognition) is a well-studied subject. Research institutes and manufacturers already provide software with recognition rates of 95% or better. Still, some issues must be solved to make a Chinese OCR system more satisfactory. Italic text handicaps almost all Chinese OCR software. It does not appear in traditional Chinese books, but it can be seen in most scientific articles, name cards, etc. This thesis studies this problem. We do not try to "recognize" the characters, because that subject has been studied extensively and is very well solved. Instead, we try to "find the italic text" and subsequently "reorient" it (i.e. transform it back to non-italic). We obtain good results. However, when the last character of an italic string touches the next non-italic character (which is not rare), we are usually unable to separate them, and the result is poor.
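The abstract speaks of transforming italic text "back to non-italic" but this record does not describe how the thesis does it. As an illustration only, the sketch below assumes the slant can be modeled as a horizontal shear and uses a generic projection-based heuristic (not the thesis's method): try several candidate shears and keep the one whose column projection is most sharply peaked. The function names `shear`, `projection_score`, and `estimate_slant`, and the binary-image representation as a list of 0/1 rows, are hypothetical choices made for this example.

```python
def shear(img, slant):
    """Undo a rightward slant of `slant` pixels per row.

    `img` is a list of rows of 0/1 values, row 0 at the top.
    Each row is shifted left by slant * (distance from the bottom row),
    rounded to whole pixels. The output is padded so no pixel is lost.
    """
    h, w = len(img), len(img[0])
    pad = int(abs(slant) * (h - 1)) + 1
    out = [[0] * (w + 2 * pad) for _ in range(h)]
    for y, row in enumerate(img):
        dx = round(slant * (h - 1 - y))
        for x, v in enumerate(row):
            if v:
                out[y][x + pad - dx] = 1
    return out


def projection_score(img):
    """Sum of squared column counts; larger when strokes line up vertically."""
    cols = [sum(col) for col in zip(*img)]
    return sum(c * c for c in cols)


def estimate_slant(img, candidates):
    """Return the candidate slant whose corrected image has the peakiest projection."""
    return max(candidates, key=lambda s: projection_score(shear(img, s)))


# A single stroke leaning right by one pixel per row (top row furthest right).
italic_stroke = [[1 if x == 4 - y else 0 for x in range(5)] for y in range(5)]
best = estimate_slant(italic_stroke, [-1.0, -0.5, 0.0, 0.5, 1.0])
upright = shear(italic_stroke, best)
```

On real scanned text the candidate range, the rounding, and the scoring function would all need tuning, and a touching italic/non-italic character pair of the kind the abstract mentions would not be separated by a shear alone.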