Summary: | Graph clustering is one of the most popular clustering techniques and has attracted considerable attention in machine learning and data mining. In general, graph clustering partitions data points into categories according to their pairwise similarities, so clustering performance is largely determined by the quality of the similarity graph. The similarity graph is usually constructed from the distances between data points. However, outliers may corrupt the data structure. To handle outliers, we propose a capped l<sub>1</sub>-norm sparse representation method (CSR). The contribution of this paper is threefold: (1) a similarity graph with a clear cluster structure is learned by employing sparse representation with proper constraints; (2) the capped l<sub>1</sub>-norm loss is used to remove outliers, which ensures the graph quality; and (3) an iterative algorithm is developed to optimize the proposed non-convex problem. Extensive experiments on real-world datasets show the superiority of the proposed method over state-of-the-art methods and demonstrate its robustness to outliers.
|
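The summary does not state the loss explicitly, so the following is only a minimal sketch of how a capped loss can limit the influence of outliers. It assumes the commonly used form min(||r_i||_2, ε) applied to each point's reconstruction residual r_i; the function name `capped_l1_loss`, the threshold `epsilon`, and the toy residuals are hypothetical and are not taken from the paper.

```python
import numpy as np


def capped_l1_loss(residuals, epsilon):
    """Sum of min(||r_i||_2, epsilon) over the rows r_i of `residuals`.

    Assumed form of a capped loss: each point contributes at most
    `epsilon`, so a single outlier cannot dominate the objective.
    """
    norms = np.linalg.norm(residuals, axis=1)
    return float(np.minimum(norms, epsilon).sum())


# Toy data (illustrative only): three well-fit points and one outlier.
residuals = np.array([
    [0.1, 0.0],
    [0.0, 0.2],
    [0.3, 0.4],
    [30.0, 40.0],  # outlier with residual norm 50
])

# Uncapped sum of residual norms, dominated by the outlier.
plain = float(np.linalg.norm(residuals, axis=1).sum())

# Capped sum: the outlier's contribution is clipped to epsilon.
capped = capped_l1_loss(residuals, epsilon=1.0)
```

In this toy case the outlier accounts for almost all of the uncapped loss, while under the cap it contributes no more than `epsilon`. Because the minimum makes the objective non-convex, it is typically handled by an iterative scheme, which is in line with the iterative algorithm the summary mentions.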