Comparing Representations for Chinese Text Categorization

Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 89 === In this thesis, we study the effects of various representations of Chinese documents on automatic text categorization. We compare word-based and n-gram-based representations when they are combined with weighting factors, such as term frequency (TF),...
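
The full abstract in this record names three weighting factors: term frequency (TF), inverse document frequency (IDF), and inverse class frequency (ICF), applied to word-based or n-gram-based terms. The record does not give the thesis's formulas, so the following is only a minimal sketch under assumed textbook definitions: IDF as log(N / df) and ICF as log(C / cf), where df is the number of documents and cf the number of classes containing a term. The function names `char_bigrams` and `tf_idf_icf` and the toy corpus are hypothetical illustrations, not taken from the thesis or from the CNA collection.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Return overlapping character bigrams of a string, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return [a + b for a, b in zip(chars, chars[1:])]


def tf_idf_icf(docs, labels):
    """Weight each term of each document by TF * IDF * ICF.

    docs   -- list of term lists (for example, the output of char_bigrams)
    labels -- one class label per document
    Assumed definitions (not confirmed by the source):
        IDF = log(N / df), ICF = log(C / cf)
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    n_classes = len(set(labels))
    doc_freq = Counter()   # number of documents containing each term
    term_classes = {}      # set of classes in which each term occurs
    for terms, label in zip(docs, labels):
        for term in set(terms):
            doc_freq[term] += 1
            term_classes.setdefault(term, set()).add(label)

    weighted = []
    for terms in docs:
        tf = Counter(terms)
        weighted.append({
            term: count
            * math.log(n_docs / doc_freq[term])
            * math.log(n_classes / len(term_classes[term]))
            for term, count in tf.items()
        })
    return weighted


# Hypothetical toy corpus: two finance snippets and one sports snippet.
toy_docs = [char_bigrams(s) for s in ("股票上漲", "股票下跌", "球隊獲勝")]
toy_labels = ["finance", "finance", "sports"]
toy_weights = tf_idf_icf(toy_docs, toy_labels)
```

Under these assumed definitions, a bigram that occurs in every class gets an ICF of zero, and a bigram confined to one document of one class gets the largest weight. Practical systems often smooth both logarithms to avoid zero weights.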

Bibliographic Details
Main Authors: Sheng-Bin Chiu, 邱聖斌
Other Authors: Jyh-Jong Tsay
Format: Others
Language: en_US
Published: 2001
Online Access: http://ndltd.ncl.edu.tw/handle/54063113676904681277
id ndltd-TW-089CCU00392021
record_format oai_dc
spelling ndltd-TW-089CCU00392021 2016-07-06T04:09:53Z http://ndltd.ncl.edu.tw/handle/54063113676904681277 Comparing Representations for Chinese Text Categorization 中文文件表示法在文件分類中之比較 Sheng-Bin Chiu 邱聖斌 Master's National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 89 In this thesis, we study the effects of various representations of Chinese documents on automatic text categorization. We compare word-based and n-gram-based representations when they are combined with weighting factors, such as term frequency (TF), inverse document frequency (IDF), and inverse class frequency (ICF). An experiment on the CNA news collection shows that bigrams achieve performance close to that of statistical word-based representations, and that weighting methods combining TF, IDF, and ICF achieve the best performance. Jyh-Jong Tsay 蔡志忠 2001 thesis 38 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 89 === In this thesis, we study the effects of various representations of Chinese documents on automatic text categorization. We compare word-based and n-gram-based representations when they are combined with weighting factors, such as term frequency (TF), inverse document frequency (IDF), and inverse class frequency (ICF). An experiment on the CNA news collection shows that bigrams achieve performance close to that of statistical word-based representations, and that weighting methods combining TF, IDF, and ICF achieve the best performance.
author2 Jyh-Jong Tsay
author_facet Jyh-Jong Tsay
Sheng-Bin Chiu
邱聖斌
author Sheng-Bin Chiu
邱聖斌
spellingShingle Sheng-Bin Chiu
邱聖斌
Comparing Representations for Chinese Text Categorization
author_sort Sheng-Bin Chiu
title Comparing Representations for Chinese Text Categorization
title_short Comparing Representations for Chinese Text Categorization
title_full Comparing Representations for Chinese Text Categorization
title_fullStr Comparing Representations for Chinese Text Categorization
title_full_unstemmed Comparing Representations for Chinese Text Categorization
title_sort comparing representations for chinese text categorization
publishDate 2001
url http://ndltd.ncl.edu.tw/handle/54063113676904681277
work_keys_str_mv AT shengbinchiu comparingrepresentationsforchinesetextcategorization
AT qiūshèngbīn comparingrepresentationsforchinesetextcategorization
AT shengbinchiu zhōngwénwénjiànbiǎoshìfǎzàiwénjiànfēnlèizhōngzhībǐjiào
AT qiūshèngbīn zhōngwénwénjiànbiǎoshìfǎzàiwénjiànfēnlèizhōngzhībǐjiào
_version_ 1718336431424798720