A multicriteria decision making approach for estimating the number of clusters in a data set.

Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes a multiple criteria decision making (MCDM)-based app...

Bibliographic Details
Main Authors: Yi Peng, Yong Zhang, Gang Kou, Yong Shi
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3411440?pdf=render
id doaj-9dd2bbeb86724576a66aa6a2a0eb1890
record_format Article
volume 7
issue 7
article_number e41713
doi 10.1371/journal.pone.0041713
collection DOAJ
language English
format Article
sources DOAJ
author Yi Peng
Yong Zhang
Gang Kou
Yong Shi
title A multicriteria decision making approach for estimating the number of clusters in a data set.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2012-01-01
description Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes a multiple criteria decision making (MCDM)-based approach to estimate the number of clusters for a given data set. In this approach, MCDM methods consider different numbers of clusters as alternatives and the outputs of any clustering algorithm on validity measures as criteria. The proposed method is examined by an experimental study using three MCDM methods, the well-known clustering algorithm--k-means, ten relative measures, and fifteen public-domain UCI machine learning data sets. The results show that MCDM methods work fairly well in estimating the number of clusters in the data and outperform the ten relative measures considered in the study.
url http://europepmc.org/articles/PMC3411440?pdf=render