A quantitative discriminant method of elbow point for the optimal number of clusters in clustering algorithm
Abstract Clustering, a traditional machine learning method, plays a significant role in data analysis. Most clustering algorithms depend on a predetermined exact number of clusters, whereas, in practice, clusters are usually unpredictable. Although the Elbow method is one of the most commonly used m...
Main Authors: | Congming Shi, Bingtao Wei, Shoulin Wei, Wen Wang, Hai Liu, Jialei Liu |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2021-02-01 |
Series: | EURASIP Journal on Wireless Communications and Networking |
Subjects: | Machine learning; Clustering; Elbow method; Silhouette coefficient; Cosine law |
Online Access: | https://doi.org/10.1186/s13638-021-01910-w |
id |
doaj-1186c0ec45514b1e9812c8bb15b720a6 |
---|---|
record_format |
Article |
spelling |
doaj-1186c0ec45514b1e9812c8bb15b720a6; 2021-02-21T12:28:56Z; eng; SpringerOpen; EURASIP Journal on Wireless Communications and Networking; 1687-1499; 2021-02-01; 20211116; 10.1186/s13638-021-01910-w; A quantitative discriminant method of elbow point for the optimal number of clusters in clustering algorithm; Congming Shi (0), Bingtao Wei (1), Shoulin Wei (2), Wen Wang (3), Hai Liu (4), Jialei Liu (5); (0) School of Software Engineering, Anyang Normal University; (1) Faculty of Information Engineering and Automation, Kunming University of Science and Technology; (2) Faculty of Information Engineering and Automation, Kunming University of Science and Technology; (3) Faculty of Information Engineering and Automation, Kunming University of Science and Technology; (4) School of Software Engineering, Anyang Normal University; (5) School of Software Engineering, Anyang Normal University; Abstract Clustering, a traditional machine learning method, plays a significant role in data analysis. Most clustering algorithms depend on a predetermined exact number of clusters, whereas, in practice, clusters are usually unpredictable. Although the Elbow method is one of the most commonly used methods to discriminate the optimal cluster number, the discriminant of the number of clusters depends on the manual identification of the elbow points on the visualization curve. Thus, experienced analysts cannot clearly identify the elbow point from the plotted curve when the plotted curve is fairly smooth. To solve this problem, a new elbow point discriminant method is proposed to yield a statistical metric that estimates an optimal cluster number when clustering on a dataset. First, the average degree of distortion obtained by the Elbow method is normalized to the range of 0 to 10. Second, the normalized results are used to calculate the cosine of intersection angles between elbow points. Third, this calculated cosine of intersection angles and the arccosine theorem are used to compute the intersection angles between elbow points. Finally, the index of the above-computed minimal intersection angles between elbow points is used as the estimated potential optimal cluster number. The experimental results based on simulated datasets and a well-known public dataset (Iris Dataset) demonstrated that the estimated optimal cluster number obtained by our newly proposed method is better than the widely used Silhouette method.; https://doi.org/10.1186/s13638-021-01910-w; Machine learning; Clustering; Elbow method; Silhouette coefficient; Cosine law |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Congming Shi, Bingtao Wei, Shoulin Wei, Wen Wang, Hai Liu, Jialei Liu |
author_sort |
Congming Shi |
title |
A quantitative discriminant method of elbow point for the optimal number of clusters in clustering algorithm |
publisher |
SpringerOpen |
series |
EURASIP Journal on Wireless Communications and Networking |
issn |
1687-1499 |
publishDate |
2021-02-01 |
description |
Abstract: Clustering, a traditional machine learning method, plays a significant role in data analysis. Most clustering algorithms depend on a predetermined exact number of clusters, whereas, in practice, the number of clusters is usually unpredictable. Although the Elbow method is one of the most commonly used methods to determine the optimal cluster number, it depends on manual identification of the elbow point on the visualization curve. Thus, even experienced analysts cannot clearly identify the elbow point when the plotted curve is fairly smooth. To solve this problem, a new elbow point discriminant method is proposed to yield a statistical metric that estimates an optimal cluster number when clustering on a dataset. First, the average degree of distortion obtained by the Elbow method is normalized to the range of 0 to 10. Second, the normalized results are used to calculate the cosine of the intersection angles between elbow points. Third, the calculated cosines and the arccosine are used to compute the intersection angles between elbow points. Finally, the index of the minimal computed intersection angle is used as the estimated potential optimal cluster number. The experimental results based on simulated datasets and a well-known public dataset (Iris Dataset) demonstrated that the optimal cluster number estimated by the newly proposed method is better than that obtained by the widely used Silhouette method. |
topic |
Machine learning; Clustering; Elbow method; Silhouette coefficient; Cosine law |
url |
https://doi.org/10.1186/s13638-021-01910-w |
_version_ |
1724257997139476480 |
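
The four steps summarized in the abstract (normalize the distortion curve to 0 to 10, compute cosines of the angles at candidate elbow points, convert them to angles with the arccosine, and pick the index of the smallest angle) can be sketched roughly as follows. This is an illustrative reading of the abstract only, not the authors' implementation: the function names `normalize_0_10`, `elbow_angles`, and `estimate_k` are hypothetical, and the choice to measure the angle at each point between its two neighbouring points via the law of cosines is an assumption based on the "Cosine law" keyword.

```python
import math


def normalize_0_10(values):
    """Min-max scale a list of distortion values to the range [0, 10]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("distortion curve is flat; no elbow to find")
    return [10.0 * (v - lo) / (hi - lo) for v in values]


def elbow_angles(distortions, k_start=1):
    """Return {k: angle in radians} for every interior point of the curve.

    Assumption: the angle at point k is the one formed with its left and
    right neighbours, computed with the law of cosines.
    """
    if len(distortions) < 3:
        raise ValueError("need at least three cluster counts")
    norm = normalize_0_10(distortions)
    points = [(k_start + i, y) for i, y in enumerate(norm)]
    angles = {}
    for i in range(1, len(points) - 1):
        prev_pt, pt, next_pt = points[i - 1], points[i], points[i + 1]
        a = math.dist(pt, prev_pt)       # side from the point to its left neighbour
        b = math.dist(pt, next_pt)       # side from the point to its right neighbour
        c = math.dist(prev_pt, next_pt)  # side opposite the angle at the point
        cos_angle = (a * a + b * b - c * c) / (2 * a * b)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
        angles[pt[0]] = math.acos(cos_angle)
    return angles


def estimate_k(distortions, k_start=1):
    """Pick the cluster count whose angle on the normalized curve is smallest."""
    angles = elbow_angles(distortions, k_start)
    return min(angles, key=angles.get)
```

For example, with made-up distortion values for k = 1 to 6 such as `[100, 40, 10, 8, 7, 6.5]`, calling `estimate_k` selects the point where the curve bends most sharply. Note that only interior points receive an angle in this sketch, so the first and last cluster counts can never be selected.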