An Optimal and Stable Algorithm for Clustering Numerical Data
In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. With random seeding, two main issues arise: the clustering results may be less than optimal, and a different clustering result may be obtained on every run. In real-world applications, optimal and stable clustering is highly desirable. This report introduces a new clustering algorithm, the zero k-approximate modal haplotype (Zk-AMH) algorithm, which uses a simple and novel seeding mechanism known as zero-point multidimensional spaces. Zk-AMH provides cluster optimality and stability, thereby resolving the aforementioned issues. Notably, across 100 runs the Zk-AMH algorithm yielded mean, maximum, and minimum scores that were identical, with zero standard deviation, demonstrating its stability. Additionally, when applied to eight datasets, it achieved the highest mean scores on four datasets, an approximately equal score on one dataset, and marginally lower scores on the other three. With its optimality and stability, the Zk-AMH algorithm could be a suitable alternative for developing future clustering tools.

Main Authors: | Ali Seman, Azizian Mohd Sapawi |
---|---|
Affiliation: | Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA (UiTM), Shah Alam 40450, Malaysia |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-06-01 |
Series: | Algorithms |
ISSN: | 1999-4893 |
DOI: | 10.3390/a14070197 |
Subjects: | numerical clustering; categorical clustering; cluster analysis; partitional clustering algorithm; fuzzy clustering |
Online Access: | https://www.mdpi.com/1999-4893/14/7/197 |
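The abstract contrasts random seeding with the deterministic zero-point seeding used by Zk-AMH but gives no pseudocode, so the following is only a minimal sketch of the instability it describes: a plain NumPy implementation of Lloyd's k-means whose result depends on the randomly chosen initial centers. The function name `lloyd_kmeans`, the synthetic three-blob data, and the SSE reporting are illustrative assumptions, not material from the paper.

```python
import numpy as np

def lloyd_kmeans(X, k, seeds, max_iter=100):
    """Plain Lloyd's k-means started from explicitly supplied initial centers."""
    centers = seeds.copy()
    for _ in range(max_iter):
        # Assign each object to its nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its members; keep the old
        # center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    sse = ((X - centers[labels]) ** 2).sum()  # within-cluster sum of squares
    return labels, sse

rng = np.random.default_rng()
# Synthetic numerical data: three well-separated Gaussian blobs.
X = np.vstack([rng.normal(loc, 0.5, size=(50, 2)) for loc in ((0, 0), (4, 0), (0, 4))])

# Random seeding: each run picks different initial centers, so the final SSE
# (and the partition itself) can differ from run to run -- the instability
# the abstract attributes to conventional k-means seeding.
for run in range(3):
    seeds = X[rng.choice(len(X), size=3, replace=False)]
    _, sse = lloyd_kmeans(X, 3, seeds)
    print(f"random seeding, run {run}: SSE = {sse:.2f}")

# A deterministic seed (the paper's "zero-point multidimensional spaces" idea)
# would make every run reproduce the same result; the actual Zk-AMH seeding
# and update rules are given in the paper itself, not reproduced here.
```

Running the loop a few times typically prints slightly different SSE values, which is the run-to-run variability that a deterministic seeding scheme is meant to eliminate.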