Comparing Representations for Chinese Text Categorization
Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 89 === In this thesis, we study how various representations of Chinese documents affect automatic text categorization. We compare word-based and n-gram-based representations when they are combined with weighting factors, such as term frequency (TF),...
Main Authors: | Sheng-Bin Chiu, 邱聖斌 |
---|---|
Other Authors: | Jyh-Jong Tsay |
Format: | Others |
Language: | en_US |
Published: | 2001 |
Online Access: | http://ndltd.ncl.edu.tw/handle/54063113676904681277 |
id |
ndltd-TW-089CCU00392021 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-089CCU00392021 2016-07-06T04:09:53Z http://ndltd.ncl.edu.tw/handle/54063113676904681277 Comparing Representations for Chinese Text Categorization 中文文件表示法在文件分類中之比較 Sheng-Bin Chiu 邱聖斌 Master's thesis National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 89 In this thesis, we study how various representations of Chinese documents affect automatic text categorization. We compare word-based and n-gram-based representations when they are combined with weighting factors such as term frequency (TF), inverse document frequency (IDF) and inverse class frequency (ICF). An experiment on the CNA news collection shows that the bigram representation achieves performance close to that of statistical word-based representations, and that weighting methods combining TF, IDF and ICF achieve the best performance. Jyh-Jong Tsay 蔡志忠 2001 thesis 38 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 89 === In this thesis, we study how various representations of Chinese documents affect automatic text categorization. We compare word-based and n-gram-based representations when they are combined with weighting factors such as term frequency (TF), inverse document frequency (IDF) and inverse class frequency (ICF). An experiment on the CNA news collection shows that the bigram representation achieves performance close to that of statistical word-based representations, and that weighting methods combining TF, IDF and ICF achieve the best performance. |
author2 |
Jyh-Jong Tsay |
author_facet |
Jyh-Jong Tsay Sheng-Bin Chiu 邱聖斌 |
author |
Sheng-Bin Chiu 邱聖斌 |
title |
Comparing Representations for Chinese Text Categorization |
publishDate |
2001 |
url |
http://ndltd.ncl.edu.tw/handle/54063113676904681277 |
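
The abstract names a character-bigram representation and a weighting that combines TF, IDF and ICF, but this record does not give the formulas. The sketch below is one plausible reading and not the thesis's own method: it assumes IDF = log(N / df) over documents and ICF = log(C / cf) over classes, multiplied with the raw term frequency. The function names `char_bigrams` and `tf_idf_icf` and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_bigrams(text):
    """Return overlapping character bigrams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def tf_idf_icf(docs):
    """Weight bigrams by TF * IDF * ICF (assumed formulas, see lead-in).

    docs is a list of (class_label, text) pairs; the result is one
    {bigram: weight} dict per document, in the same order.
    """
    term_counts = [Counter(char_bigrams(text)) for _, text in docs]

    doc_freq = Counter()              # number of documents containing a term
    term_classes = defaultdict(set)   # classes in which a term occurs
    for (label, _), counts in zip(docs, term_counts):
        for term in counts:
            doc_freq[term] += 1
            term_classes[term].add(label)

    n_docs = len(docs)
    n_classes = len({label for label, _ in docs})

    weighted = []
    for counts in term_counts:
        weighted.append({
            term: tf
            * math.log(n_docs / doc_freq[term])                 # assumed IDF
            * math.log(n_classes / len(term_classes[term]))     # assumed ICF
            for term, tf in counts.items()
        })
    return weighted


# Toy corpus (hypothetical): two sports documents, one finance document.
docs = [("sports", "籃球比賽"), ("sports", "足球比賽"), ("finance", "股票市場")]
weights = tf_idf_icf(docs)
```

With these assumed formulas, a bigram that occurs in every document or in every class receives weight zero, so only bigrams that discriminate between classes contribute. A smoothed variant (for example adding 1 inside the logarithm) would avoid the zeros; which form the thesis used is not stated in this record.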