Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study
Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, were analyzed and addressed in previous works. However, the issue of class imbalance w...
Main Author: | Lango Mateusz |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo, 2019-06-01 |
Series: | Foundations of Computing and Decision Sciences |
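Subjects: | sentiment analysis; imbalanced data; multi-class learning; data difficulty factors; text classification |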
Online Access: | https://doi.org/10.2478/fcds-2019-0009 |
id |
doaj-c5b83e5d9b8e4769a224da2430cc300b |
---|---|
record_format |
Article |
spelling |
doaj-c5b83e5d9b8e4769a224da2430cc300b; 2021-09-05T21:00:54Z; eng; Sciendo; Foundations of Computing and Decision Sciences; ISSN 2300-3405; 2019-06-01; vol. 44, no. 2, pp. 151-178; 10.2478/fcds-2019-0009; fcds-2019-0009; Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study; Lango Mateusz (Institute of Computing Sciences, Poznan University of Technology, Poznań, Poland); Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, were analyzed and addressed in previous works. However, the issue of class imbalance, which often compromises the prediction capabilities of learning algorithms, has scarcely been studied. In this work, we aim to bridge the gap between imbalanced learning and sentiment analysis. An experimental study including twelve imbalanced learning preprocessing methods, four feature representations, and a dozen datasets is carried out to analyze the usefulness of imbalanced learning methods for sentiment classification. Moreover, the data difficulty factors commonly studied in imbalanced learning are investigated on sentiment corpora to evaluate the impact of class imbalance.; https://doi.org/10.2478/fcds-2019-0009; sentiment analysis; imbalanced data; multi-class learning; data difficulty factors; text classification |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Lango Mateusz |
spellingShingle |
Lango Mateusz; Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study; Foundations of Computing and Decision Sciences; sentiment analysis; imbalanced data; multi-class learning; data difficulty factors; text classification |
title |
Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study |
title_sort |
tackling the problem of class imbalance in multi-class sentiment classification: an experimental study |
publisher |
Sciendo |
series |
Foundations of Computing and Decision Sciences |
issn |
2300-3405 |
publishDate |
2019-06-01 |
description |
Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, were analyzed and addressed in previous works. However, the issue of class imbalance, which often compromises the prediction capabilities of learning algorithms, has scarcely been studied. In this work, we aim to bridge the gap between imbalanced learning and sentiment analysis. An experimental study including twelve imbalanced learning preprocessing methods, four feature representations, and a dozen datasets is carried out to analyze the usefulness of imbalanced learning methods for sentiment classification. Moreover, the data difficulty factors commonly studied in imbalanced learning are investigated on sentiment corpora to evaluate the impact of class imbalance. |
topic |
sentiment analysis; imbalanced data; multi-class learning; data difficulty factors; text classification |
url |
https://doi.org/10.2478/fcds-2019-0009 |
work_keys_str_mv |
AT langomateusz tacklingtheproblemofclassimbalanceinmulticlasssentimentclassificationanexperimentalstudy |
_version_ |
1717782093186990080 |