A combined data mining approach using rough set theory and case-based reasoning in medical datasets

Bibliographic Details
Main Authors: Mohammad Taghi Rezvan, Ali Zeinal Hamadani, Babak Saffari, Ali Shalbafzadeh
Format: Article
Language: English
Published: Growing Science 2014-06-01
Series: Decision Science Letters
Online Access: http://www.growingscience.com/dsl/Vol3/dsl_2014_16.pdf
Description
Summary: Case-based reasoning (CBR) is the process of solving new cases by retrieving the most relevant ones from an existing knowledge base. Because irrelevant or redundant features remarkably increase both memory requirements and the time complexity of case retrieval, reducing the number of dimensions is an issue worth considering. This paper uses rough set theory (RST) to reduce the number of dimensions in a CBR classifier, with the aim of increasing accuracy and efficiency. The CBR classifier measures the similarity of cases with a co-occurrence-based distance for categorical data. This distance is based on the proportional distribution of the different categorical values of the features. The weight used for a feature is the average of the co-occurrence values of the features. The combination of RST and CBR has been applied to three real categorical datasets: Wisconsin Breast Cancer, Lymphography, and Primary cancer. Five-fold cross-validation is used to evaluate the performance of the proposed approach. The results show that this combined approach lowers computational costs and improves performance metrics, including accuracy and interpretability, compared with other approaches developed in the literature.
ISSN: 1929-5804, 1929-5812
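
Illustrative sketch (not taken from the article): the summary describes a case distance built from how categorical values co-occur, but this record does not give the formula. The Python below shows one plausible reading, under stated assumptions: the distance between two values of a feature is taken as the total variation distance between the conditional distributions they induce on each other feature, averaged over those features. Feature weights and the RST reduction step are omitted, and all function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def conditional_dist(cases, i, j):
    """Estimate {a: {v: P(feature j = v | feature i = a)}} from the case base."""
    counts = defaultdict(Counter)
    for case in cases:
        counts[case[i]][case[j]] += 1
    return {
        a: {v: n / sum(c.values()) for v, n in c.items()}
        for a, c in counts.items()
    }


def value_distances(cases, i):
    """Distance between each pair of values of feature i.

    Assumption: total variation distance between the conditional
    distributions over every other feature, averaged over those features.
    """
    others = [j for j in range(len(cases[0])) if j != i]
    conds = {j: conditional_dist(cases, i, j) for j in others}
    values = sorted({case[i] for case in cases})
    dist = {(a, a): 0.0 for a in values}
    for a, b in combinations(values, 2):
        total = 0.0
        for j in others:
            pa, pb = conds[j][a], conds[j][b]
            support = set(pa) | set(pb)
            total += 0.5 * sum(abs(pa.get(v, 0.0) - pb.get(v, 0.0)) for v in support)
        d = total / len(others) if others else 0.0
        dist[(a, b)] = dist[(b, a)] = d
    return dist


def case_distance(tables, x, y):
    """Unweighted sum of per-feature value distances.

    Values never seen in the case base are treated as maximally distant (1.0).
    """
    return sum(
        0.0 if a == b else t.get((a, b), 1.0)
        for t, a, b in zip(tables, x, y)
    )


def retrieve(cases, labels, query):
    """Return the label of the stored case nearest to the query (1-NN retrieval)."""
    tables = [value_distances(cases, i) for i in range(len(cases[0]))]
    best = min(range(len(cases)), key=lambda k: case_distance(tables, cases[k], query))
    return labels[best]


# Hypothetical toy case base with two categorical features.
cases = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
labels = ["benign", "benign", "malignant", "malignant"]
print(retrieve(cases, labels, ("b", "y")))
```

In a pipeline like the one the summary outlines, the columns of `cases` would be restricted to the features kept by the RST reduction before the distance tables are built.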