A Study of Multiple Classifier Systems in Automated Text Categorization
Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === Automatic text categorization, which is defined as the task of assigning predefined class (category) labels to text documents, is one of the main techniques that are useful both in organizing and in locating information in huge text collections from, f...
Main Authors: | Yuan-Gu Wei, 魏源谷 |
---|---|
Other Authors: | Jyh-Jong Tsay |
Format: | Others |
Language: | en_US |
Published: | 2002 |
Online Access: | http://ndltd.ncl.edu.tw/handle/58157330409643777309 |
id |
ndltd-TW-090CCU00392006 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-090CCU00392006 2015-10-13T17:34:57Z http://ndltd.ncl.edu.tw/handle/58157330409643777309 A Study of Multiple Classifier Systems in Automated Text Categorization 多分類器系統在自動化文件分類之研究 Yuan-Gu Wei 魏源谷 Master's National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 90 Automatic text categorization, defined as the task of assigning predefined class (category) labels to text documents, is one of the main techniques for organizing and locating information in huge text collections such as those on the Internet. Many approaches, such as linear classifiers, decision trees, Bayesian methods, neural networks and support vector machines, have been extensively studied and used to implement classifier systems for text categorization as well as for web page classification. Although much effort has been spent on each of these methods, further performance improvement is approaching its limit. Multiple classifier systems, which aim to combine the strengths of individual classifiers to improve overall performance, have been widely studied recently. In this thesis, we study the development of multiple classifier systems for automated text categorization. We investigate and propose various approaches to fundamental issues such as classifier combination, classifier subset selection, and static and dynamic classifier selection. We use these ideas to develop efficient combination-based as well as selection-based multiple classifier systems. Experiments show that our approaches significantly improve classification accuracy over that of individual classifiers for web page collections from web portals. In addition, we propose a cascaded class reduction method in which a sequence of classifiers is cascaded to successively reduce the set of possible classes. We show that by cascading Naive Bayes and SVMs, we can improve the classification accuracy of SVMs while reducing their running time. Jyh-Jong Tsay 蔡志忠 2002 thesis 108 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === Automatic text categorization, defined as the task of assigning predefined class (category) labels to text documents, is one of the main techniques for organizing and locating information in huge text collections such as those on the Internet. Many approaches, such as linear classifiers, decision trees, Bayesian methods, neural networks and support vector machines, have been extensively studied and used to implement classifier systems for text categorization as well as for web page classification. Although much effort has been spent on each of these methods, further performance improvement is approaching its limit. Multiple classifier systems, which aim to combine the strengths of individual classifiers to improve overall performance, have been widely studied recently.
In this thesis, we study the development of multiple classifier
systems for automated text categorization. We investigate and
propose various approaches to fundamental issues such as
classifier combination, classifier subset selection, and static
and dynamic classifier selection. We use these ideas to develop
efficient combination-based as well as selection-based multiple
classifier systems. Experiments show that our approaches
significantly improve classification accuracy over that of
individual classifiers for web page collections from web portals.
In addition, we propose a cascaded class reduction method in
which a sequence of classifiers is cascaded to successively
reduce the set of possible classes. We show that by cascading
Naive Bayes and SVMs, we can improve the classification accuracy
of SVMs while reducing their running time.
|
author2 |
Jyh-Jong Tsay |
author_facet |
Jyh-Jong Tsay Yuan-Gu Wei 魏源谷 |
author |
Yuan-Gu Wei 魏源谷 |
spellingShingle |
Yuan-Gu Wei 魏源谷 A Study of Multiple Classifier Systems in Automated Text Categorization |
author_sort |
Yuan-Gu Wei |
title |
A Study of Multiple Classifier Systems in Automated Text Categorization |
title_sort |
study of multiple classifier systems in automated text categorization |
publishDate |
2002 |
url |
http://ndltd.ncl.edu.tw/handle/58157330409643777309 |
_version_ |
1717782067383631872 |
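The cascaded class reduction idea in the abstract (a cheap classifier narrows the set of candidate classes, and a later classifier decides only among the survivors) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `train_naive_bayes`, `nb_log_scores` and `cascade_predict`, the `keep` parameter, and the use of an arbitrary callable as the second stage (where the thesis uses SVMs) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs.

    Returns (documents per class, token counts per class, vocabulary).
    """
    class_docs = Counter(label for _, label in docs)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, token_counts, vocab


def nb_log_scores(model, tokens):
    """Return a log-score per class using add-one (Laplace) smoothing."""
    class_docs, token_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        total_tokens = sum(token_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            count = token_counts[label][tok]
            score += math.log((count + 1) / (total_tokens + len(vocab)))
        scores[label] = score
    return scores


def cascade_predict(model, second_stage, tokens, keep=2):
    """Two-stage cascade (hypothetical interface).

    Stage 1: Naive Bayes ranks all classes and keeps the top `keep`.
    Stage 2: `second_stage(tokens, candidates)` picks the final label,
    so the costlier classifier only considers the reduced class set.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    scores = nb_log_scores(model, tokens)
    candidates = sorted(scores, key=scores.get, reverse=True)[:keep]
    return second_stage(tokens, candidates), candidates
```

To use it, train the first stage on labeled token lists, then pass any callable that chooses among the surviving candidates as `second_stage`. In a setting closer to the abstract, that callable would wrap SVMs trained only for the remaining classes, which is where the running-time saving would come from.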