An Optimal and Stable Algorithm for Clustering Numerical Data

In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. In random seeding, two main issues arise: the clustering results may be less than optimal and different clustering results may be obtained for every run. In real-world applications,...

Bibliographic Details
Main Authors:	Ali Seman, Azizian Mohd Sapawi
Format:	Article
Language:	English
Published:	MDPI AG 2021-06-01
Series:	Algorithms
Subjects:	numerical clustering; categorical clustering; cluster analysis; partitional clustering algorithm; fuzzy clustering
Online Access:	https://www.mdpi.com/1999-4893/14/7/197

id	doaj-3bbc638567824795b6d3ace9057b9f1b
record_format	Article
spelling	doaj-3bbc638567824795b6d3ace9057b9f1b | 2021-07-23T13:26:48Z | eng | MDPI AG | Algorithms | 1999-4893 | 2021-06-01 | 14 | 197 | 197 | 10.3390/a14070197 | An Optimal and Stable Algorithm for Clustering Numerical Data | Ali Seman (0) | Azizian Mohd Sapawi (1) | Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA (UiTM), Shah Alam 40450, Malaysia | Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA (UiTM), Shah Alam 40450, Malaysia | https://www.mdpi.com/1999-4893/14/7/197 | numerical clustering; categorical clustering; cluster analysis; partitional clustering algorithm; fuzzy clustering
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Ali Seman; Azizian Mohd Sapawi
author_facet	Ali Seman; Azizian Mohd Sapawi
author_sort	Ali Seman
title	An Optimal and Stable Algorithm for Clustering Numerical Data
publisher	MDPI AG
series	Algorithms
issn	1999-4893
publishDate	2021-06-01
description	In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. Random seeding raises two main issues: the clustering results may be less than optimal, and different results may be obtained on every run. In real-world applications, optimal and stable clustering is highly desirable. This report introduces a new clustering algorithm, the zero k-approximate modal haplotype (Zk-AMH) algorithm, which uses a simple and novel seeding mechanism known as zero-point multidimensional spaces. The Zk-AMH algorithm provides cluster optimality and stability, thereby resolving both issues. Notably, its mean scores were identical to its maximum and minimum scores over 100 runs, giving a standard deviation of zero and demonstrating its stability. Additionally, when the Zk-AMH algorithm was applied to eight datasets, it achieved the highest mean scores for four datasets, produced an approximately equal score for one dataset, and yielded marginally lower scores for the other three datasets. With its optimality and stability, the Zk-AMH algorithm could be a suitable alternative for developing future clustering tools.
topic	numerical clustering; categorical clustering; cluster analysis; partitional clustering algorithm; fuzzy clustering
url	https://www.mdpi.com/1999-4893/14/7/197

Similar Items