Data Exploration by Using the Monotonicity Property

Dealing with different misclassification costs has long been a major problem in classification. Some algorithms, such as most rule induction methods, predict quite accurately when the misclassification costs for each class are assumed to be equal. However, when the misclassification costs change, which is a...

Bibliographic Details
Main Author:	Chen, Hongyi
Other Authors:	Warren Liao
Format:	Others
Language:	en
Published:	LSU 2008
Subjects:	Computer Science
Online Access:	http://etd.lsu.edu/docs/available/etd-05292008-103454/

id	ndltd-LSU-oai-etd.lsu.edu-etd-05292008-103454
record_format	oai_dc
spelling	ndltd-LSU-oai-etd.lsu.edu-etd-05292008-103454 2013-01-07T22:51:47Z Data Exploration by Using the Monotonicity Property Chen, Hongyi Computer Science Warren Liao Jianhua Chen Evangelos Triantaphyllou LSU 2008-06-10 text application/pdf http://etd.lsu.edu/docs/available/etd-05292008-103454/ en unrestricted
collection	NDLTD
language	en
format	Others
sources	NDLTD
topic	Computer Science
spellingShingle	Computer Science Chen, Hongyi Data Exploration by Using the Monotonicity Property
description	Dealing with different misclassification costs has long been a major problem in classification. Some algorithms, such as most rule induction methods, predict quite accurately when the misclassification costs for each class are assumed to be equal. However, when the misclassification costs change, which is common in practice, these algorithms cannot adjust their results. Other algorithms, such as Bayesian methods, can yield the probabilities that an unclassified example belongs to each of the given classes, which helps in modifying the results according to different misclassification costs. The shortcoming of such algorithms is that, when the misclassification costs for each class are equal, they do not generate the most accurate results. This thesis attempts to combine the merits of both kinds of algorithms, that is, to develop a new algorithm that predicts relatively accurately and can adapt to changes in misclassification costs. The strategy of the new algorithm is to create a weighted voting system. Such a system evaluates the evidence that a new example belongs to each class, calculates probability assessments for the example, and assigns the example to a class according to those probabilities and the misclassification costs. The main problem in creating a weighted voting system is deciding the optimal weights of the individual votes. To solve this problem, the thesis relies mainly on the monotonicity property. The monotonicity property has been found to exist not only in pure monotone systems but also in non-monotone systems. Since the study of the monotonicity property has been highly successful on monotone systems, it is natural to apply it to non-monotone systems as well. This thesis deals only with binary systems. Though such systems hardly exist in practice, this treatment provides concrete ideas for the development of general solution algorithms. After the final algorithm was formulated, it was tested on a wide range of randomly generated synthetic datasets and compared with other existing classifiers. The results indicate that the algorithm performs both effectively and efficiently.
author2	Warren Liao
author_facet	Warren Liao Chen, Hongyi
author	Chen, Hongyi
author_sort	Chen, Hongyi
title	Data Exploration by Using the Monotonicity Property
title_short	Data Exploration by Using the Monotonicity Property
title_full	Data Exploration by Using the Monotonicity Property
title_fullStr	Data Exploration by Using the Monotonicity Property
title_full_unstemmed	Data Exploration by Using the Monotonicity Property
title_sort	data exploration by using the monotonicity property
publisher	LSU
publishDate	2008
url	http://etd.lsu.edu/docs/available/etd-05292008-103454/
work_keys_str_mv	AT chenhongyi dataexplorationbyusingthemonotonicityproperty
_version_	1716477289414262784
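
The abstract in this record describes a weighted voting system that turns votes into a probability assessment and then assigns a class using both the probabilities and the misclassification costs. The following is only a minimal sketch of that general idea for the binary case, not the thesis's algorithm: the function names, the fixed example weights, and the expected-cost decision rule are all assumptions made for illustration, and the sketch does not show how the thesis derives weights from the monotonicity property.

```python
from typing import Sequence


def vote_probability(votes: Sequence[int], weights: Sequence[float]) -> float:
    """Estimate P(class 1) as the weighted share of votes cast for class 1.

    votes[i] is 0 or 1; weights[i] is the non-negative weight of voter i.
    The weights here are assumed to be given; they are not learned.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w for v, w in zip(votes, weights) if v == 1) / total


def classify(p1: float, cost_fp: float, cost_fn: float) -> int:
    """Pick the class with the lower expected misclassification cost.

    cost_fp: cost of predicting 1 when the true class is 0.
    cost_fn: cost of predicting 0 when the true class is 1.
    Ties go to class 0 (an arbitrary choice for this sketch).
    """
    expected_cost_predict_1 = (1.0 - p1) * cost_fp
    expected_cost_predict_0 = p1 * cost_fn
    return 1 if expected_cost_predict_1 < expected_cost_predict_0 else 0


# Hypothetical example: three voters, one of which votes for class 1.
p1 = vote_probability([1, 0, 0], [1.0, 1.0, 2.0])
label_equal_costs = classify(p1, cost_fp=1.0, cost_fn=1.0)
label_costly_miss = classify(p1, cost_fp=1.0, cost_fn=5.0)
```

With equal costs the example is assigned to class 0, but when missing a class 1 example is made five times as costly, the same votes lead to class 1. This illustrates why a classifier that outputs probability assessments can adapt to changing misclassification costs without being retrained.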
