Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study

Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, have been analyzed and addressed in previous work. However, the issue of class imbalance w...

Bibliographic Details
Main Author: Lango Mateusz
Format: Article
Language: English
Published: Sciendo 2019-06-01
Series: Foundations of Computing and Decision Sciences
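Subjects: sentiment analysis; imbalanced data; multi-class learning; data difficulty factors; text classification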
Online Access: https://doi.org/10.2478/fcds-2019-0009
id doaj-c5b83e5d9b8e4769a224da2430cc300b
record_format Article
spelling doaj-c5b83e5d9b8e4769a224da2430cc300b | 2021-09-05T21:00:54Z | eng | Sciendo | Foundations of Computing and Decision Sciences | 2300-3405 | 2019-06-01 | vol. 44, no. 2, pp. 151-178 | 10.2478/fcds-2019-0009 | fcds-2019-0009 | Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study | Lango Mateusz | Institute of Computing Sciences, Poznan University of Technology, Poznań, Poland | https://doi.org/10.2478/fcds-2019-0009 | sentiment analysis; imbalanced data; multi-class learning; data difficulty factors; text classification
collection DOAJ
language English
format Article
sources DOAJ
author Lango Mateusz
title Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study
publisher Sciendo
series Foundations of Computing and Decision Sciences
issn 2300-3405
publishDate 2019-06-01
description Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, have been analyzed and addressed in previous work. However, the issue of class imbalance, which often compromises the predictive capabilities of learning algorithms, has scarcely been studied. In this work, we aim to bridge the gap between imbalanced learning and sentiment analysis. An experimental study covering twelve imbalanced learning preprocessing methods, four feature representations, and a dozen datasets is carried out to analyze the usefulness of imbalanced learning methods for sentiment classification. Moreover, data difficulty factors, commonly studied in imbalanced learning, are investigated on sentiment corpora to evaluate the impact of class imbalance.
topic sentiment analysis
imbalanced data
multi-class learning
data difficulty factors
text classification
url https://doi.org/10.2478/fcds-2019-0009