A rough set based subspace clustering technique for high dimensional data

Subspace clustering aims at identifying subspaces for cluster formation so that the data is categorized in different perspectives. The conventional subspace clustering algorithms explore dense clusters in all the possible subspaces. These algorithms suffer from the curse of dimensionality. That is,...

Bibliographic Details
Main Authors:	B. Jaya Lakshmi, M. Shashi, K.B. Madhuri
Format:	Article
Language:	English
Published:	Elsevier 2020-03-01
Series:	Journal of King Saud University: Computer and Information Sciences
Online Access:	http://www.sciencedirect.com/science/article/pii/S1319157817300654

id	doaj-8ea51fd2af654b8290fa80ee1cc076a2
record_format	Article
spelling	doaj-8ea51fd2af654b8290fa80ee1cc076a2 | 2020-11-25T03:03:25Z | eng | Elsevier | Journal of King Saud University: Computer and Information Sciences | 1319-1578 | 2020-03-01 | 32 | 3 | 329 | 334 | A rough set based subspace clustering technique for high dimensional data | B. Jaya Lakshmi (0), M. Shashi (1), K.B. Madhuri (2) | (0) Department of Information Technology, GVP College of Engineering(A), India; Corresponding author. | (1) Department of Computer Science and Systems Engineering, Andhra University, India | (2) Department of Information Technology, GVP College of Engineering(A), India | Subspace clustering aims at identifying subspaces for cluster formation so that the data is categorized in different perspectives. The conventional subspace clustering algorithms explore dense clusters in all the possible subspaces. These algorithms suffer from the curse of dimensionality. That is, with the increase in the number of dimensions, the possible number of subspaces to be explored as well as the number of subspace clusters increase exponentially. This makes analysis of clustering result difficult due to high probability of redundant clustering information presented in various subspaces. To handle this problem, a new algorithm called Interesting Subspace Clustering (ISC) is proposed which makes use of attribute dependency measure, γ from Rough Set theory, to identify interesting subspaces. Anti-monotonicity based on Apriori property is used to efficiently prune the subspaces in the process of identifying interesting subspaces. A density based clustering method is used so as to mine arbitrary shaped dense regions as clusters in interesting subspaces. The proposed algorithm identifies non-redundant and interesting subspace clusters of better quality. The size of the clustering result is reduced as well as the mean dimensionality needed to describe the clustering solution compared to existing algorithms, SUBCLU and SCHISM on different datasets. | Keywords: Subspace clustering, Density based subspace clustering, Interesting subspace, Attribute dependency measure, Apriori property | http://www.sciencedirect.com/science/article/pii/S1319157817300654
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	B. Jaya Lakshmi M. Shashi K.B. Madhuri
spellingShingle	B. Jaya Lakshmi M. Shashi K.B. Madhuri A rough set based subspace clustering technique for high dimensional data Journal of King Saud University: Computer and Information Sciences
author_facet	B. Jaya Lakshmi M. Shashi K.B. Madhuri
author_sort	B. Jaya Lakshmi
title	A rough set based subspace clustering technique for high dimensional data
title_short	A rough set based subspace clustering technique for high dimensional data
title_full	A rough set based subspace clustering technique for high dimensional data
title_fullStr	A rough set based subspace clustering technique for high dimensional data
title_full_unstemmed	A rough set based subspace clustering technique for high dimensional data
title_sort	rough set based subspace clustering technique for high dimensional data
publisher	Elsevier
series	Journal of King Saud University: Computer and Information Sciences
issn	1319-1578
publishDate	2020-03-01
description	Subspace clustering aims at identifying subspaces for cluster formation so that the data is categorized in different perspectives. The conventional subspace clustering algorithms explore dense clusters in all the possible subspaces. These algorithms suffer from the curse of dimensionality. That is, with the increase in the number of dimensions, the possible number of subspaces to be explored as well as the number of subspace clusters increase exponentially. This makes analysis of clustering result difficult due to high probability of redundant clustering information presented in various subspaces. To handle this problem, a new algorithm called Interesting Subspace Clustering (ISC) is proposed which makes use of attribute dependency measure, γ from Rough Set theory, to identify interesting subspaces. Anti-monotonicity based on Apriori property is used to efficiently prune the subspaces in the process of identifying interesting subspaces. A density based clustering method is used so as to mine arbitrary shaped dense regions as clusters in interesting subspaces. The proposed algorithm identifies non-redundant and interesting subspace clusters of better quality. The size of the clustering result is reduced as well as the mean dimensionality needed to describe the clustering solution compared to existing algorithms, SUBCLU and SCHISM on different datasets. Keywords: Subspace clustering, Density based subspace clustering, Interesting subspace, Attribute dependency measure, Apriori property
url	http://www.sciencedirect.com/science/article/pii/S1319157817300654
work_keys_str_mv	AT bjayalakshmi aroughsetbasedsubspaceclusteringtechniqueforhighdimensionaldata AT mshashi aroughsetbasedsubspaceclusteringtechniqueforhighdimensionaldata AT kbmadhuri aroughsetbasedsubspaceclusteringtechniqueforhighdimensionaldata AT bjayalakshmi roughsetbasedsubspaceclusteringtechniqueforhighdimensionaldata AT mshashi roughsetbasedsubspaceclusteringtechniqueforhighdimensionaldata AT kbmadhuri roughsetbasedsubspaceclusteringtechniqueforhighdimensionaldata
_version_	1724685768551563264
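
Note on the attribute dependency measure γ: the abstract above names γ from Rough Set theory but this record does not say how ISC applies it. As a minimal sketch only, the following Python computes the standard rough-set dependency degree, γ_C(D) = |POS_C(D)| / |U|, for discrete-valued attributes. The function name `dependency_degree`, the row layout, and the toy data are illustrative assumptions, not taken from the article.

```python
from collections import defaultdict


def dependency_degree(rows, cond, dec):
    """Standard rough-set dependency degree gamma_cond(dec).

    rows: list of tuples of discrete attribute values (the universe U).
    cond, dec: lists of column indices (attribute sets C and D).
    Returns |POS_C(D)| / |U|: the fraction of objects whose C-equivalence
    class maps to exactly one combination of values on D.
    """
    if not rows:
        raise ValueError("rows must be non-empty")

    decisions = defaultdict(set)  # C-class -> distinct D-value combinations
    sizes = defaultdict(int)      # C-class -> number of objects in the class
    for row in rows:
        key = tuple(row[a] for a in cond)
        decisions[key].add(tuple(row[a] for a in dec))
        sizes[key] += 1

    # A C-class lies in the positive region when it is consistent on D.
    positive = sum(sizes[k] for k, d in decisions.items() if len(d) == 1)
    return positive / len(rows)


# Hypothetical toy data: three discrete attributes per object.
rows = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 1),
    (0, 1, 0),
]

gamma_two = dependency_degree(rows, [0, 1], [2])  # attributes 0 and 1 -> 2
gamma_one = dependency_degree(rows, [0], [2])     # attribute 0 alone -> 2
```

On this toy data, attributes 0 and 1 together determine attribute 2 for three of the five objects, while attribute 0 alone does so for only two. A value of 1 would mean full dependency and 0 would mean none.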

Similar Items