Evaluation of Clustering Validity

Clustering is a mostly unsupervised procedure and the majority of the clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some sort of evaluation as regards its val...

Bibliographic Details
Main Authors: Rudhwan Sideek, Ghaydaa Al-Talib
Format: Article
Language: Arabic
Published: Mosul University 2008-12-01
Series: Al-Rafidain Journal of Computer Sciences and Mathematics
Subjects:
sd
Online Access: https://csmj.mosuljournals.com/article_163987_91789d23691cfa7cb618ac4707e353ce.pdf
id doaj-3781140d1e534856b5a3b8703036774e
record_format Article
spelling doaj-3781140d1e534856b5a3b8703036774e
2020-11-25T04:07:16Z
ara
Mosul University
Al-Rafidain Journal of Computer Sciences and Mathematics
1815-4816
2311-7990
2008-12-01
5
2
79
97
10.33899/csmj.2008.163987
163987
Evaluation of Clustering Validity
Rudhwan Sideek 0
Ghaydaa Al-Talib 1
Technical College, Technical Education Authority / Mosul
College of Computer Sciences and Mathematics, University of Mosul, Mosul, Iraq
Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some evaluation of its validity. In this paper, we present a clustering validity procedure that evaluates the results of clustering algorithms on data sets. We define two validity indexes, S_Dbw and SD, based on well-defined clustering criteria, which enable the selection of the optimal input parameter values for a clustering algorithm, that is, the values that result in the best partitioning of a data set. We evaluate the reliability of our indexes experimentally, using the K_Means clustering algorithm on real data sets. Our approach performs favorably in finding the correct number of clusters fitting a data set.
https://csmj.mosuljournals.com/article_163987_91789d23691cfa7cb618ac4707e353ce.pdf
data mining
k_means
s_dbw
sd
collection DOAJ
language Arabic
format Article
sources DOAJ
author Rudhwan Sideek
Ghaydaa Al-Talib
title Evaluation of Clustering Validity
publisher Mosul University
series Al-Rafidain Journal of Computer Sciences and Mathematics
issn 1815-4816
2311-7990
publishDate 2008-12-01
description Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some evaluation of its validity. In this paper, we present a clustering validity procedure that evaluates the results of clustering algorithms on data sets. We define two validity indexes, S_Dbw and SD, based on well-defined clustering criteria, which enable the selection of the optimal input parameter values for a clustering algorithm, that is, the values that result in the best partitioning of a data set. We evaluate the reliability of our indexes experimentally, using the K_Means clustering algorithm on real data sets. Our approach performs favorably in finding the correct number of clusters fitting a data set.
topic data mining
k_means
s_dbw
sd
url https://csmj.mosuljournals.com/article_163987_91789d23691cfa7cb618ac4707e353ce.pdf