On Clustering Histograms with k-Means by Using Mixed α-Divergences
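Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.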
Main Authors: | Frank Nielsen, Richard Nock, Shun-ichi Amari |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2014-06-01 |
Series: | Entropy |
Subjects: | bag-of-X; α-divergence; Jeffreys divergence; centroid; k-means clustering; k-means seeding |
Online Access: | http://www.mdpi.com/1099-4300/16/6/3273 |
id |
doaj-67bab809f23a4f98ae31c50a1cd6b3e6 |
---|---|
record_format |
Article |
spelling |
doaj-67bab809f23a4f98ae31c50a1cd6b3e6; 2020-11-25T00:03:34Z; eng; MDPI AG; Entropy; 1099-4300; 2014-06-01; 16; 6; 3273; 3301; 10.3390/e16063273; e16063273; On Clustering Histograms with k-Means by Using Mixed α-Divergences; Frank Nielsen (0); Richard Nock (1); Shun-ichi Amari (2); (0) Sony Computer Science Laboratories, Inc, Tokyo 141-0022, Japan; (1) NICTA and The Australian National University, Locked Bag 9013, Alexandria NSW 1435, Australia; (2) RIKEN Brain Science Institute, 2-1 Hirosawa Wako City, Saitama 351-0198, Japan; Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.; http://www.mdpi.com/1099-4300/16/6/3273; bag-of-X; α-divergence; Jeffreys divergence; centroid; k-means clustering; k-means seeding |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Frank Nielsen Richard Nock Shun-ichi Amari |
spellingShingle |
Frank Nielsen Richard Nock Shun-ichi Amari On Clustering Histograms with k-Means by Using Mixed α-Divergences Entropy bag-of-X α-divergence Jeffreys divergence centroid k-means clustering k-means seeding |
author_facet |
Frank Nielsen Richard Nock Shun-ichi Amari |
author_sort |
Frank Nielsen |
title |
On Clustering Histograms with k-Means by Using Mixed α-Divergences |
title_short |
On Clustering Histograms with k-Means by Using Mixed α-Divergences |
title_full |
On Clustering Histograms with k-Means by Using Mixed α-Divergences |
title_fullStr |
On Clustering Histograms with k-Means by Using Mixed α-Divergences |
title_full_unstemmed |
On Clustering Histograms with k-Means by Using Mixed α-Divergences |
title_sort |
on clustering histograms with k-means by using mixed α-divergences |
publisher |
MDPI AG |
series |
Entropy |
issn |
1099-4300 |
publishDate |
2014-06-01 |
description |
Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences. |
topic |
bag-of-X α-divergence Jeffreys divergence centroid k-means clustering k-means seeding |
url |
http://www.mdpi.com/1099-4300/16/6/3273 |
work_keys_str_mv |
AT franknielsen onclusteringhistogramswithkmeansbyusingmixedadivergences AT richardnock onclusteringhistogramswithkmeansbyusingmixedadivergences AT shunichiamari onclusteringhistogramswithkmeansbyusingmixedadivergences |
_version_ |
1725433199744516096 |