Detection and Orientation of Italic Text in Chinese OCR System
Master's thesis === National Chiao Tung University === Department of Computer Science and Information Engineering === 87 === Chinese OCR (Chinese Optical Character Recognition) is a well-studied subject. Research institutes and manufacturers have been able to provide software with recognition rates of 95% or better. Still, some issues remain to be solved to make a Chi...
Main Authors: | Shih-Jin Huang, 黃士晉 |
---|---|
Other Authors: | Jenn-Hann Liou |
Format: | Others |
Language: | zh-TW |
Published: | 1999 |
Online Access: | http://ndltd.ncl.edu.tw/handle/80925495397278572508 |
id |
ndltd-TW-087NCTU0392084 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-087NCTU0392084 2016-07-11T04:13:35Z http://ndltd.ncl.edu.tw/handle/80925495397278572508 Detection and Orientation of Italic Text in Chinese OCR System 在中文OCR系統中偵測並且調整斜體文字 Shih-Jin Huang 黃士晉 Master's thesis National Chiao Tung University Department of Computer Science and Information Engineering 87 Chinese OCR (Chinese Optical Character Recognition) is a well-studied subject. Research institutes and manufacturers have been able to provide software with recognition rates of 95% or better. Still, some issues remain to be solved before a Chinese OCR system can be considered fully satisfactory. Italic text handicaps almost all Chinese OCR software. It does not appear in traditional Chinese books, but it can be seen in most scientific articles, name cards, etc. This thesis studies this problem. We do not try to "recognize" the characters, because that subject has been studied extensively and is very well solved. Instead, we try to "find the italic text" and subsequently "reorient" it (i.e., transform it back to non-italic). We obtain good results. However, when the last character of an italic string touches the next non-italic character (which is not rare), we are usually unable to separate them and the result is poor. Jenn-Hann Liou 劉振漢 1999 學位論文 ; thesis 42 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Chiao Tung University === Department of Computer Science and Information Engineering === 87 === Chinese OCR (Chinese Optical Character Recognition) is a well-studied subject. Research institutes and manufacturers have been able to provide software with recognition rates of 95% or better.
Still, some issues remain to be solved before a Chinese OCR system can be considered fully satisfactory. Italic text handicaps almost all Chinese OCR software. It does not appear in traditional Chinese books, but it can be seen in most scientific articles, name cards, etc. This thesis studies this problem. We do not try to "recognize" the characters, because that subject has been studied extensively and is very well solved. Instead, we try to "find the italic text" and subsequently "reorient" it (i.e., transform it back to non-italic).
We obtain good results. However, when the last character of an italic string touches the next non-italic character (which is not rare), we are usually unable to separate them and the result is poor.
|
author2 |
Jenn-Hann Liou |
author_facet |
Jenn-Hann Liou Shih-Jin Huang 黃士晉 |
author |
Shih-Jin Huang 黃士晉 |
spellingShingle |
Shih-Jin Huang 黃士晉 Detection and Orientation of Italic Text in Chinese OCR System |
author_sort |
Shih-Jin Huang |
title |
Detection and Orientation of Italic Text in Chinese OCR System |
title_short |
Detection and Orientation of Italic Text in Chinese OCR System |
title_full |
Detection and Orientation of Italic Text in Chinese OCR System |
title_fullStr |
Detection and Orientation of Italic Text in Chinese OCR System |
title_full_unstemmed |
Detection and Orientation of Italic Text in Chinese OCR System |
title_sort |
detection and orientation of italic text in chinese ocr system |
publishDate |
1999 |
url |
http://ndltd.ncl.edu.tw/handle/80925495397278572508 |
work_keys_str_mv |
AT shihjinhuang detectionandorientationofitalictextinchineseocrsystem AT huángshìjìn detectionandorientationofitalictextinchineseocrsystem AT shihjinhuang zàizhōngwénocrxìtǒngzhōngzhēncèbìngqiědiàozhěngxiétǐwénzì AT huángshìjìn zàizhōngwénocrxìtǒngzhōngzhēncèbìngqiědiàozhěngxiétǐwénzì |
_version_ |
1718343393174618112 |