A Study of Multiple Classifier Systems in Automated Text Categorization

Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === Automatic text categorization, which is defined as the task of assigning predefined class (category) labels to text documents, is one of the main techniques that are useful both in organizing and in locating information in huge text collections from, f...


Bibliographic Details
Main Authors:	Yuan-Gu Wei, 魏源谷
Other Authors:	Jyh-Jong Tsay
Format:	Others
Language:	en_US
Published:	2002
Online Access:	http://ndltd.ncl.edu.tw/handle/58157330409643777309

id	ndltd-TW-090CCU00392006
record_format	oai_dc
spelling	ndltd-TW-090CCU00392006 2015-10-13T17:34:57Z http://ndltd.ncl.edu.tw/handle/58157330409643777309 A Study of Multiple Classifier Systems in Automated Text Categorization 多分類器系統在自動化文件分類之研究 Yuan-Gu Wei 魏源谷 Master's, National Chung Cheng University, Graduate Institute of Computer Science and Information Engineering, 90. Automatic text categorization, which is defined as the task of assigning predefined class (category) labels to text documents, is one of the main techniques that are useful both in organizing and in locating information in huge text collections from, for example, the Internet. Many approaches, such as linear classifiers, decision trees, Bayesian methods, neural networks, and support vector machines (SVMs), have been extensively studied and used to implement classifier systems for text categorization as well as for web page classification. Although much effort has been spent on each of these methods, we are reaching the limit of further performance improvement. Multiple classifier systems, which aim to combine the strengths of individual classifiers to improve overall performance, have been widely studied recently. In this thesis, we study the development of multiple classifier systems for automated text categorization. We investigate and propose various approaches to fundamental issues such as classifier combination, classifier subset selection, and static and dynamic classifier selection. We use these ideas to develop efficient combination-based as well as selection-based multiple classifier systems. Experiments show that our approaches significantly improve on the classification accuracy of individual classifiers for web page collections from web portals. In addition, we propose a cascaded class reduction method in which a sequence of classifiers is cascaded to successively reduce the set of possible classes. We show that by cascading Naive Bayes and SVMs, we can improve the classification accuracy of SVMs while reducing their running time. Jyh-Jong Tsay 蔡志忠 2002 學位論文 ; thesis 108 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === Automatic text categorization, which is defined as the task of assigning predefined class (category) labels to text documents, is one of the main techniques that are useful both in organizing and in locating information in huge text collections from, for example, the Internet. Many approaches, such as linear classifiers, decision trees, Bayesian methods, neural networks, and support vector machines (SVMs), have been extensively studied and used to implement classifier systems for text categorization as well as for web page classification. Although much effort has been spent on each of these methods, we are reaching the limit of further performance improvement. Multiple classifier systems, which aim to combine the strengths of individual classifiers to improve overall performance, have been widely studied recently. In this thesis, we study the development of multiple classifier systems for automated text categorization. We investigate and propose various approaches to fundamental issues such as classifier combination, classifier subset selection, and static and dynamic classifier selection. We use these ideas to develop efficient combination-based as well as selection-based multiple classifier systems. Experiments show that our approaches significantly improve on the classification accuracy of individual classifiers for web page collections from web portals. In addition, we propose a cascaded class reduction method in which a sequence of classifiers is cascaded to successively reduce the set of possible classes. We show that by cascading Naive Bayes and SVMs, we can improve the classification accuracy of SVMs while reducing their running time.
author2	Jyh-Jong Tsay
author_facet	Jyh-Jong Tsay Yuan-Gu Wei 魏源谷
author	Yuan-Gu Wei 魏源谷
spellingShingle	Yuan-Gu Wei 魏源谷 A Study of Multiple Classifier Systems in Automated Text Categorization
author_sort	Yuan-Gu Wei
title	A Study of Multiple Classifier Systems in Automated Text Categorization
title_short	A Study of Multiple Classifier Systems in Automated Text Categorization
title_full	A Study of Multiple Classifier Systems in Automated Text Categorization
title_fullStr	A Study of Multiple Classifier Systems in Automated Text Categorization
title_full_unstemmed	A Study of Multiple Classifier Systems in Automated Text Categorization
title_sort	study of multiple classifier systems in automated text categorization
publishDate	2002
url	http://ndltd.ncl.edu.tw/handle/58157330409643777309

