Self-Expressive Kernel Subspace Clustering Algorithm for Categorical Data with Embedded Feature Selection

Bibliographic Details
Main Authors: Hui Chen, Kunpeng Xu, Lifei Chen, Qingshan Jiang
Format: Article
Language: English
Published: MDPI AG 2021-07-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/9/14/1680
Description
Summary: Kernel clustering of categorical data is a useful tool for processing separable datasets and has been employed in many disciplines. Despite recent efforts, kernel clustering remains a significant challenge because existing methods assume that features are independent and equally weighted. In this study, we propose a self-expressive kernel subspace clustering algorithm for categorical data (SKSCC) that uses a self-expressive kernel density estimation (SKDE) scheme together with a new feature-weighted non-linear similarity measure. For the SKSCC algorithm, we propose an effective non-linear optimization method to solve the clustering algorithm’s objective function, which not only considers the relationships between attributes in a non-linear space but also assigns a weight to each attribute to measure its degree of correlation. A series of experiments on widely used synthetic and real-world datasets demonstrated that the proposed algorithm is more effective and efficient than other state-of-the-art methods at exploring non-linear relationships among attributes.
ISSN: 2227-7390