DIVERSITY-BASED ATTRIBUTE WEIGHTING FOR K-MODES CLUSTERING


Bibliographic Details
Main Authors: Muhammad Misbachul Huda, Dian Rahma Hayun, Annisaa Sri Indarwanti
Format: Article
Language: English
Published: Universitas Indonesia 2014-08-01
Series: Jurnal Ilmu Komputer dan Informasi
Online Access: http://jiki.cs.ui.ac.id/index.php/jiki/article/view/258
Description
Summary: Categorical data is a kind of data used in computation in computer science. Extracting information from categorical input requires a clustering algorithm, and researchers have proposed many of them. One clustering algorithm for categorical data is k-modes. K-modes uses a simple matching approach based on similarity values: two objects that match have a similarity value of 1, and 0 otherwise. In practice, however, each attribute takes several different values, and each value occurs a different number of times. Similarity values of 0 and 1 alone are not enough to represent the real semantic distance between a data object and a cluster. In this paper, we therefore generalize the k-modes algorithm for categorical data by adding a weight and a diversity value for each attribute value, in order to optimize categorical data clustering.
ISSN: 2088-7051, 2502-9274
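
The simple matching approach described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not reproduce their weighting or diversity formulas, which the summary does not state. The function names (`simple_matching_similarity`, `mismatch_count`, `cluster_mode`) and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def simple_matching_similarity(a, b):
    """Per-attribute similarity: 1 where the two values are equal, 0 otherwise."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of attributes")
    return [1 if x == y else 0 for x, y in zip(a, b)]


def mismatch_count(a, b):
    """Number of attributes on which the two objects differ."""
    return len(a) - sum(simple_matching_similarity(a, b))


def cluster_mode(objects):
    """Most frequent value of each attribute across the objects of a cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


# Hypothetical toy cluster with three categorical attributes.
cluster = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
mode = cluster_mode(cluster)

# "square" already occurs in the cluster, "green" never does,
# yet plain 0/1 matching scores both candidates identically.
seen_value = ("red", "small", "square")
unseen_value = ("green", "small", "round")
print(mode, mismatch_count(seen_value, mode), mismatch_count(unseen_value, mode))
```

Both candidates differ from the mode on exactly one attribute, so 0/1 matching cannot tell them apart even though one mismatching value appears in the cluster and the other does not. This is the kind of limitation that motivates weighting attribute values by how they are distributed.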