A new AntTree-based algorithm for clustering short-text corpora
Short-text clustering is an important research area, given the current tendency for people to use "small-language" in settings such as blogs, text messaging and others. Recent work has proposed new bio-inspired clustering algorithms to deal with this diffic...
Main Authors: | Marcelo Luis Errecalde, Diego Alejandro Ingaramo, Paolo Rosso |
---|---|
Format: | Article |
Language: | English |
Published: | Postgraduate Office, School of Computer Science, Universidad Nacional de La Plata, 2010-04-01 |
Series: | Journal of Computer Science and Technology |
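Subjects: | internal validity measures; anttree; short-text clustering; bio-inspired algorithms; silhouette coefficient |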
Online Access: | https://journal.info.unlp.edu.ar/JCST/article/view/708 |
id | doaj-ebbb03d42c06452597644bf3f6ba7323 |
---|---|
record_format | Article |
spelling | doaj-ebbb03d42c06452597644bf3f6ba7323; 2021-05-05T13:55:14Z; eng; Postgraduate Office, School of Computer Science, Universidad Nacional de La Plata; Journal of Computer Science and Technology; 1666-6046; 1666-6038; 2010-04-01; 100117402; A new AntTree-based algorithm for clustering short-text corpora; Marcelo Luis Errecalde [0]; Diego Alejandro Ingaramo [1]; Paolo Rosso [2]; [0] Development and Research Laboratory in Computacional Intelligence (LIDIC), Universidad Nacional de San Luis, San Luis, Argentina; [1] Development and Research Laboratory in Computacional Intelligence (LIDIC), Universidad Nacional de San Luis, San Luis, Argentina; [2] Natural Language Engineering Lab., ELiRF, Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain; Short-text clustering is an important research area, given the current tendency for people to use "small-language" in settings such as blogs, text messaging and others. Recent work has proposed new bio-inspired clustering algorithms to deal with this difficult problem, and has also presented novel uses of Internal Clustering Validity Measures. In this work, a new AntTree-based approach is proposed for this task. It integrates information on the Silhouette Coefficient and the concept of attraction of a cluster in different stages of the clustering process. The proposal achieves results comparable to the best reported in this area, shows an interesting stability in the quality of the results, and offers some interesting capabilities as a general improvement method for arbitrary clustering approaches.; https://journal.info.unlp.edu.ar/JCST/article/view/708; internal validity measures; anttree; short-text clustering; bio-inspired algorithms; silhouette coefficient |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Marcelo Luis Errecalde; Diego Alejandro Ingaramo; Paolo Rosso |
title | A new AntTree-based algorithm for clustering short-text corpora |
publisher | Postgraduate Office, School of Computer Science, Universidad Nacional de La Plata |
series | Journal of Computer Science and Technology |
issn | 1666-6046; 1666-6038 |
publishDate | 2010-04-01 |
description | Short-text clustering is an important research area, given the current tendency for people to use "small-language" in settings such as blogs, text messaging and others. Recent work has proposed new bio-inspired clustering algorithms to deal with this difficult problem, and has also presented novel uses of Internal Clustering Validity Measures. In this work, a new AntTree-based approach is proposed for this task. It integrates information on the Silhouette Coefficient and the concept of attraction of a cluster in different stages of the clustering process. The proposal achieves results comparable to the best reported in this area, shows an interesting stability in the quality of the results, and offers some interesting capabilities as a general improvement method for arbitrary clustering approaches. |
topic | internal validity measures; anttree; short-text clustering; bio-inspired algorithms; silhouette coefficient |
url | https://journal.info.unlp.edu.ar/JCST/article/view/708 |
_version_ | 1721460549997297664 |
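
The description in this record refers to the Silhouette Coefficient without defining it. As a reading aid, the minimal Python sketch below computes the standard silhouette coefficient, the mean over all items of s(i) = (b(i) - a(i)) / max(a(i), b(i)), from a precomputed distance matrix. It is not the AntTree-based algorithm of the article, and the record does not say how the article combines this measure with cluster attraction. The function names, the toy data, and the choice to score singleton clusters as 0 are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def silhouette_scores(dist: Sequence[Sequence[float]],
                      labels: Sequence[int]) -> List[float]:
    """Per-item silhouette s(i) = (b - a) / max(a, b) from a distance matrix."""
    # Group item indices by cluster label.
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores: List[float] = []
    for i, lab_i in enumerate(labels):
        own = clusters[lab_i]
        # Mean distance from item i to every other cluster.
        others = [
            sum(dist[i][j] for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != lab_i
        ]
        # Assumption: singleton clusters (and a single-cluster labeling) score 0.
        if len(own) == 1 or not others:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the item's own cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(others)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def silhouette_coefficient(dist: Sequence[Sequence[float]],
                           labels: Sequence[int]) -> float:
    """Mean silhouette over all items; values near 1 indicate compact, separated clusters."""
    scores = silhouette_scores(dist, labels)
    return sum(scores) / len(scores)


# Toy data (hypothetical): four points on a line, two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]

good = silhouette_coefficient(dist, [0, 0, 1, 1])  # nearby points grouped together
bad = silhouette_coefficient(dist, [0, 1, 0, 1])   # nearby points split apart
print(good, bad)
```

On this toy data the grouping that keeps nearby points together scores close to 1, while the grouping that splits them scores below 0, which is the property that makes the measure usable as an internal validity signal.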