Evaluation of Named Entity Recognition Algorithms in Short Texts

Abstract: One of the major consequences of the growth of social networks has been the generation of huge volumes of content. The text generated in social networks constitutes a new type of content that is short, informal, sometimes ungrammatical, and noise-prone. Given the volume of...

Bibliographic Details
Main Authors:	Edgar Casasola Murillo, Raquel Fonseca
Format:	Article
Language:	English
Published:	Centro Latinoamericano de Estudios en Informática 2017-04-01
Series:	CLEI Electronic Journal
Subjects:	Information Retrieval; Entity Recognition; ER; Named Entity Recognition; NER; corpus
Online Access:	http://clei.org/cleiej-beta/index.php/cleiej/article/view/9

id	doaj-b7cee2dc6b5947cf92f4ca1bf3be81bd
record_format	Article
spelling	doaj-b7cee2dc6b5947cf92f4ca1bf3be81bd; 2020-11-25T00:35:18Z; eng; Centro Latinoamericano de Estudios en Informática; CLEI Electronic Journal; 0717-5000; 2017-04-01; 20; 1; 10.19153/cleiej.20.1.4; Evaluation of Named Entity Recognition Algorithms in Short Texts; Edgar Casasola Murillo (Universidad de Costa Rica); Raquel Fonseca (Universidad de Costa Rica)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Edgar Casasola Murillo; Raquel Fonseca
author_sort	Edgar Casasola Murillo
title	Evaluation of Named Entity Recognition Algorithms in Short Texts
publisher	Centro Latinoamericano de Estudios en Informática
series	CLEI Electronic Journal
issn	0717-5000
publishDate	2017-04-01
description	Abstract: One of the major consequences of the growth of social networks has been the generation of huge volumes of content. The text generated in social networks constitutes a new type of content that is short, informal, sometimes ungrammatical, and noise-prone. Given the volume of information produced every day, processing this data manually is impractical, which creates the need to explore and apply automatic processing strategies such as Entity Recognition (ER). It therefore becomes necessary to evaluate the performance of traditional ER algorithms on corpora with those characteristics. This paper presents the results of applying the AlchemyAPI and Dandelion API algorithms to a corpus provided by the SemEval-2015 Aspect Based Sentiment Analysis Conference. The entities recognized by each algorithm were compared against those annotated in the collection in order to calculate precision and recall. Dandelion API obtained better results than AlchemyAPI on the given corpus. Spanish Abstract (translated): One of the main consequences of the current boom in social networks is the generation of large volumes of information. The text generated in these networks is a new genre of text: short, informal, grammatically deficient, and prone to noise. Because of the rate at which information is produced, manual processing is impractical, which creates the need to apply automatic processing strategies such as Entity Recognition (ER). Because of the characteristics of the content, it also becomes necessary to evaluate the performance of traditional algorithms on corpora extracted from these social networks. This work presents the results obtained by applying the AlchemyAPI and Dandelion API algorithms to a corpus provided by the SemEval-2015 Aspect Based Sentiment Analysis Conference. The entities recognized by each algorithm were compared with those annotated in the collection in order to calculate precision and recall. Dandelion API obtained better results than AlchemyAPI on the given corpus.
topic	Information Retrieval; Entity Recognition; ER; Named Entity Recognition; NER; corpus
url	http://clei.org/cleiej-beta/index.php/cleiej/article/view/9
work_keys_str_mv	AT edgarcasasolamurillo evaluationofnamedentityrecognitionalgorithmsinshorttexts AT raquelfonseca evaluationofnamedentityrecognitionalgorithmsinshorttexts
_version_	1725309170847055872
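
Note on the evaluation described in the abstract: recognized entities are compared against the annotated ones to obtain precision and recall. The record does not say how matches were decided, so the following is only a minimal sketch that assumes exact matching of entity strings after trimming and lowercasing. The function name and the sample entities are hypothetical and are not taken from the article or from either API.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of entity strings.

    Assumption: an entity counts as correct only if its normalized
    string (trimmed, lowercased) also appears in the gold annotations.
    """
    predicted_set = {entity.strip().lower() for entity in predicted}
    gold_set = {entity.strip().lower() for entity in gold}

    # Entities that the tool returned and the annotators also marked.
    true_positives = len(predicted_set & gold_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical data for one short review text.
gold = ["pizza", "service", "wine list", "staff"]
predicted = ["Pizza", "wine list", "New York"]

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

Here two of the three returned entities are annotated (precision 2/3) and two of the four annotated entities are found (recall 2/4). A looser criterion, such as partial span overlap, would give different figures.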

Similar Items