Data Exploration by Using the Monotonicity Property
Handling unequal misclassification costs is a long-standing problem in classification. Some algorithms, including most rule induction methods, predict quite accurately when the misclassification costs of all classes are assumed equal. However, when the misclassification costs change, which is a...
Main Author: | Chen, Hongyi |
---|---|
Other Authors: | Warren Liao |
Format: | Others |
Language: | en |
Published: | LSU 2008 |
Subjects: | Computer Science |
Online Access: | http://etd.lsu.edu/docs/available/etd-05292008-103454/ |
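The summary above turns on how a classifier's decision should change when misclassification costs change. As a minimal illustration (not the thesis's algorithm), the sketch below picks the class with the lowest expected cost from a set of class probabilities. The function name `min_expected_cost_class`, the probabilities, and both cost matrices are hypothetical values chosen for illustration.

```python
from typing import Sequence


def min_expected_cost_class(
    probs: Sequence[float], cost: Sequence[Sequence[float]]
) -> int:
    """Return the index of the class with the lowest expected cost.

    probs[j]   : estimated probability that the example belongs to class j
    cost[j][i] : cost of predicting class i when the true class is j
    (Both are illustrative inputs, not taken from the thesis.)
    """
    n = len(probs)
    # Expected cost of predicting class i, averaged over the possible true classes.
    expected = [sum(probs[j] * cost[j][i] for j in range(n)) for i in range(n)]
    return min(range(n), key=expected.__getitem__)


probs = [0.7, 0.3]                # hypothetical class probabilities
equal_costs = [[0, 1], [1, 0]]    # every error costs the same
skewed_costs = [[0, 1], [5, 0]]   # missing class 1 costs five times as much

print(min_expected_cost_class(probs, equal_costs))   # most probable class wins
print(min_expected_cost_class(probs, skewed_costs))  # the costlier error is avoided
```

With equal costs the rule reduces to choosing the most probable class; with the skewed matrix the same probabilities lead to the other class, which is the kind of adjustment a classifier that outputs only a label cannot make.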
id |
ndltd-LSU-oai-etd.lsu.edu-etd-05292008-103454 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-LSU-oai-etd.lsu.edu-etd-05292008-103454; 2013-01-07T22:51:47Z; Data Exploration by Using the Monotonicity Property; Chen, Hongyi; Computer Science; Warren Liao; Jianhua Chen; Evangelos Triantaphyllou; LSU; 2008-06-10; text; application/pdf; http://etd.lsu.edu/docs/available/etd-05292008-103454/; en; unrestricted. I hereby certify that, if appropriate, I have obtained and attached herein a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to LSU or its agents the non-exclusive license to archive and make accessible, under the conditions specified below and in appropriate University policies, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. |
collection |
NDLTD |
language |
en |
format |
Others |
sources |
NDLTD |
topic |
Computer Science |
description |
Handling unequal misclassification costs is a long-standing problem in classification. Some algorithms, including most rule induction methods, predict quite accurately when the misclassification costs of all classes are assumed equal. When the costs change, however, as they commonly do in practice, these algorithms cannot adjust their results. Other algorithms, such as Bayesian methods, estimate the probability that an unclassified example belongs to each class, which makes it possible to adjust the results to different misclassification costs. Their shortcoming is that, when the misclassification costs of all classes are equal, they do not produce the most accurate results.
This thesis attempts to combine the merits of both kinds of algorithms, that is, to develop a new algorithm that predicts relatively accurately and also adapts to changes in misclassification costs.
The strategy of the new algorithm is to create a weighted voting system. Such a system evaluates the evidence that a new example belongs to each class, estimates the corresponding probabilities, and assigns the example to a class according to both the probabilities and the misclassification costs.
The main problem in creating a weighted voting system is choosing the optimal weights for the individual votes. To solve it, we rely mainly on the monotonicity property. The monotonicity property has been found to exist not only in purely monotone systems but also in non-monotone systems. Since the study of this property has been highly successful for monotone systems, it is natural to apply it to non-monotone systems as well.
This thesis deals only with binary systems. Although such systems rarely occur in practice, this treatment provides concrete ideas for developing general solution algorithms.
The final algorithm was tested on a wide range of randomly generated synthetic datasets and compared with other existing classifiers. The results indicate that it performs both effectively and efficiently.
|
author2 |
Warren Liao |
author_facet |
Warren Liao Chen, Hongyi |
author |
Chen, Hongyi |
author_sort |
Chen, Hongyi |
title |
Data Exploration by Using the Monotonicity Property |
publisher |
LSU |
publishDate |
2008 |
url |
http://etd.lsu.edu/docs/available/etd-05292008-103454/ |
work_keys_str_mv |
AT chenhongyi dataexplorationbyusingthemonotonicityproperty |
_version_ |
1716477289414262784 |
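The abstract in this record relies on the monotonicity property in binary systems without defining it. Under the standard definition (assumed here, since the record does not give the thesis's own formulation), a binary function f is monotone if x ≤ y componentwise implies f(x) ≤ f(y). The sketch below lists the pairs of labeled binary examples that violate this condition. The names `dominates` and `monotonicity_violations` and the sample data are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# A labeled example: (binary attribute vector, binary class label).
Example = Tuple[Tuple[int, ...], int]


def dominates(x: Sequence[int], y: Sequence[int]) -> bool:
    """True if x >= y in every component."""
    return all(a >= b for a, b in zip(x, y))


def monotonicity_violations(data: Sequence[Example]) -> List[Tuple[Example, Example]]:
    """Return pairs where the dominating vector has the smaller label."""
    violations = []
    for (x, fx), (y, fy) in combinations(data, 2):
        if dominates(x, y) and fx < fy:
            violations.append(((x, fx), (y, fy)))
        elif dominates(y, x) and fy < fx:
            violations.append(((x, fx), (y, fy)))
    return violations


# Hypothetical data: (1, 1, 0) dominates (0, 1, 0) but carries the lower label.
data: List[Example] = [
    ((0, 0, 1), 0),
    ((0, 1, 1), 1),
    ((1, 1, 1), 1),
    ((1, 1, 0), 0),
    ((0, 1, 0), 1),
]

violations = monotonicity_violations(data)
for pair in violations:
    print(pair)
```

A dataset with no violating pairs is consistent with some monotone function. A dataset with only a few violations is non-monotone overall but still largely ordered, which is one way to read the abstract's remark that the property also appears in non-monotone systems.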