Two Stages Text Classification via Smooth Support Vector Machine

Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 93 === Text classification has become a popular research topic in recent years. The technique applies to many tasks, such as information retrieval, email filtering, topic-based retrieval, and personalization of search engines. In this thesis, w...

Bibliographic Details
Main Authors: Chih-Hung Tsai, 蔡至泓
Other Authors: Yuh-Jye Lee
Format: Others
Language: zh-TW
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/72724320427534565170
id ndltd-TW-093NTUST392034
record_format oai_dc
spelling ndltd-TW-093NTUST392034 2016-06-08T04:13:17Z http://ndltd.ncl.edu.tw/handle/72724320427534565170 Two Stages Text Classification via Smooth Support Vector Machine 利用平滑支撐向量法架構階層式文件分類器 Chih-Hung Tsai 蔡至泓 Master's thesis National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 93 Text classification has become a popular research topic in recent years. The technique applies to many tasks, such as information retrieval, email filtering, topic-based retrieval, and personalization of search engines. In this thesis, we propose a two-stage linear classifier framework for text classification based on the smooth support vector machine (SSVM). The framework contains two classifiers. The first is generated by SSVM together with a b-tuning procedure, and its main purpose is to increase recall. The second tries to improve on the performance of the first. We also try several feature selection methods for choosing terms as features, namely mutual information (MI), the χ² test, and DF-Entropy, and compare their classification results. DF-Entropy performs best among these methods. In addition, we apply the linear reduced support vector machine (RSVM) to text classification. This approach replaces the nonlinear kernel in RSVM with a linear kernel. Its advantages are that it requires no feature selection and reduces computation time. In our comparison, the two-stage linear classifier performs best, and linear RSVM also performs well. Yuh-Jye Lee 李育杰 2005 學位論文 ; thesis 57 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 93 === Text classification has become a popular research topic in recent years. The technique applies to many tasks, such as information retrieval, email filtering, topic-based retrieval, and personalization of search engines. In this thesis, we propose a two-stage linear classifier framework for text classification based on the smooth support vector machine (SSVM). The framework contains two classifiers. The first is generated by SSVM together with a b-tuning procedure, and its main purpose is to increase recall. The second tries to improve on the performance of the first. We also try several feature selection methods for choosing terms as features, namely mutual information (MI), the χ² test, and DF-Entropy, and compare their classification results. DF-Entropy performs best among these methods. In addition, we apply the linear reduced support vector machine (RSVM) to text classification. This approach replaces the nonlinear kernel in RSVM with a linear kernel. Its advantages are that it requires no feature selection and reduces computation time. In our comparison, the two-stage linear classifier performs best, and linear RSVM also performs well.
author2 Yuh-Jye Lee
author_facet Yuh-Jye Lee
Chih-Hung Tsai
蔡至泓
author Chih-Hung Tsai
蔡至泓
spellingShingle Chih-Hung Tsai
蔡至泓
Two Stages Text Classification via Smooth Support Vector Machine
author_sort Chih-Hung Tsai
title Two Stages Text Classification via Smooth Support Vector Machine
title_short Two Stages Text Classification via Smooth Support Vector Machine
title_full Two Stages Text Classification via Smooth Support Vector Machine
title_fullStr Two Stages Text Classification via Smooth Support Vector Machine
title_full_unstemmed Two Stages Text Classification via Smooth Support Vector Machine
title_sort two stages text classification via smooth support vector machine
publishDate 2005
url http://ndltd.ncl.edu.tw/handle/72724320427534565170
work_keys_str_mv AT chihhungtsai twostagestextclassificationviasmoothsupportvectormachine
AT càizhìhóng twostagestextclassificationviasmoothsupportvectormachine
AT chihhungtsai lìyòngpínghuázhīchēngxiàngliàngfǎjiàgòujiēcéngshìwénjiànfēnlèiqì
AT càizhìhóng lìyòngpínghuázhīchēngxiàngliàngfǎjiàgòujiēcéngshìwénjiànfēnlèiqì
_version_ 1718296967459635200
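
Illustrative note on the abstract: the record does not define the b-tuning procedure or the second-stage classifier, so the following is only a minimal sketch of one plausible reading. It assumes a linear decision rule of the form w·x >= b, treats b-tuning as lowering the threshold b until a target recall is reached on labeled data, and lets a second linear classifier re-check only the documents that the first stage accepts. The function names (`tune_bias`, `two_stage_predict`), the weights, and the toy scores are hypothetical and do not come from the thesis; no SSVM training is performed here.

```python
import math
from typing import Sequence


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Inner product of a weight vector and a feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def tune_bias(scores: Sequence[float], labels: Sequence[int],
              target_recall: float) -> float:
    """Assumed form of b-tuning: return the largest threshold b such that
    the rule `score >= b` recovers at least `target_recall` of the positives."""
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sorted((s for s, y in zip(scores, labels) if y == 1),
                       reverse=True)
    if not positives:
        raise ValueError("need at least one positive example")
    # Number of positives that must score at or above the threshold.
    k = max(1, math.ceil(target_recall * len(positives)))
    return positives[k - 1]


def two_stage_predict(x: Sequence[float],
                      w1: Sequence[float], b1: float,
                      w2: Sequence[float], b2: float) -> int:
    """Stage 1 (high recall) filters; stage 2 re-checks what stage 1 accepts."""
    if dot(w1, x) < b1:
        return 0  # rejected by the first classifier
    return 1 if dot(w2, x) >= b2 else 0


if __name__ == "__main__":
    # Toy stage-1 scores and labels (made up for illustration).
    scores = [2.0, 0.5, -0.3, -1.0, 0.1]
    labels = [1, 1, 1, 0, 0]
    b1 = tune_bias(scores, labels, target_recall=1.0)
    print("tuned stage-1 threshold:", b1)
    print(two_stage_predict([0.1, 0.9], [1.0, 0.0], b1, [0.0, 1.0], 0.5))
```

Lowering the first threshold admits more false positives in exchange for recall, which is why this sketch hands the accepted documents to a second classifier; how the thesis actually trains and combines the two stages is not stated in this record.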