A Study of Multiple Classifier Systems in Automated Text Categorization

Bibliographic Details
Main Authors: Yuan-Gu Wei, 魏源谷
Other Authors: Jyh-Jong Tsay
Format: Others
Language: en_US
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/58157330409643777309
id ndltd-TW-090CCU00392006
record_format oai_dc
spelling ndltd-TW-090CCU00392006 2015-10-13T17:34:57Z http://ndltd.ncl.edu.tw/handle/58157330409643777309 A Study of Multiple Classifier Systems in Automated Text Categorization 多分類器系統在自動化文件分類之研究 Yuan-Gu Wei 魏源谷 Master's thesis National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 90 Jyh-Jong Tsay 蔡志忠 2002 thesis 108 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === Automatic text categorization, the task of assigning predefined class (category) labels to text documents, is one of the main techniques for organizing and locating information in huge text collections from, for example, the Internet. Many approaches, such as linear classifiers, decision trees, Bayesian methods, neural networks, and support vector machines (SVMs), have been extensively studied and used to implement classifier systems for text categorization as well as for web page classification. Although much effort has been spent on each of these methods, we are reaching the limit of further performance improvement. Multiple classifier systems, which aim to combine the strengths of individual classifiers to improve overall performance, have been widely studied recently. In this thesis, we study the development of multiple classifier systems for automated text categorization. We investigate and propose various approaches to fundamental issues such as classifier combination, classifier subset selection, and static and dynamic classifier selection. We use our ideas to develop efficient combination-based as well as selection-based multiple classifier systems. Experiments show that our approaches significantly improve on the classification accuracy of individual classifiers for web page collections from web portals. In addition, we propose a cascaded class reduction method in which a sequence of classifiers is cascaded to successively reduce the set of possible classes. We show that by cascading Naive Bayes and SVMs, we can improve the classification accuracy of SVMs while reducing their running time.
author2 Jyh-Jong Tsay
author_facet Jyh-Jong Tsay
Yuan-Gu Wei
魏源谷
author Yuan-Gu Wei
魏源谷
title A Study of Multiple Classifier Systems in Automated Text Categorization
publishDate 2002
url http://ndltd.ncl.edu.tw/handle/58157330409643777309