Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique

Clustering-based sentiment analysis opens new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than other traditional machine learning techniq...

Bibliographic Details
Main Authors: Saba Akmal, Hafiz Muhammad Shahzad Asif
Format: Article
Language: English
Published: Mehran University of Engineering and Technology 2021-07-01
Series: Mehran University Research Journal of Engineering and Technology
Online Access: https://publications.muet.edu.pk/index.php/muetrj/article/view/2186
id doaj-0eb7dff3db2e41589d550427df081bf0
record_format Article
spelling doaj-0eb7dff3db2e41589d550427df081bf0 2021-07-10T18:10:53Z eng
Mehran University of Engineering and Technology
Mehran University Research Journal of Engineering and Technology
ISSN 0254-7821, 2413-7219
2021-07-01, vol. 40, no. 3, pp. 630-644
DOI 10.22581/muet1982.2103.16 (article 2186)
Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique
Saba Akmal, Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
Hafiz Muhammad Shahzad Asif, Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
collection DOAJ
language English
format Article
sources DOAJ
author Saba Akmal
Hafiz Muhammad Shahzad Asif
title Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique
publisher Mehran University of Engineering and Technology
series Mehran University Research Journal of Engineering and Technology
issn 0254-7821
2413-7219
publishDate 2021-07-01
description Clustering-based sentiment analysis opens new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than other traditional machine learning techniques. Combining dimensionality reduction techniques with clustering algorithms strongly influences the computational cost and improves the performance of sentiment analysis. In this research, we applied Principal Component Analysis (PCA) to reduce the size of the feature set. The reduced feature set improves the results of binary K-means clustering for sentiment analysis. In our experiments, we demonstrate that the clustering system with a reduced feature set provides high-quality sentiment analysis. However, K-means clustering has its own limitations, such as hard assignment and instability of results. To overcome these limitations of the traditional K-means algorithm, we applied a soft clustering approach (the Expectation-Maximization algorithm), which stabilizes clustering accuracy. This approach allows soft assignment of documents to clusters. Consequently, our experimental accuracy is 95% with a standard deviation of 0.1%, which is sufficient to apply the clustering technique in real-world applications.
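The contrast the abstract draws between hard (K-means-style) and soft (Expectation-Maximization) assignment can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each document has already been reduced to a single score (standing in for a projection onto one principal component), the data are synthetic, and all names (`em_two_clusters`, `scores`, `resp`) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_clusters(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (means, resp), where resp[i][k] is the soft membership
    of point i in cluster k.
    """
    means = [min(xs), max(xs)]  # simple deterministic initialisation
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: each point gets a probability for each cluster.
        resp = []
        for x in xs:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                 for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the soft memberships.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # keep the variance strictly positive
            weights[k] = nk / len(xs)
    return means, resp


# Synthetic stand-in for reduced document features (assumption: 1-D scores,
# "negative" documents centred at -2 and "positive" documents at +2).
rng = random.Random(0)
scores = ([rng.gauss(-2.0, 1.0) for _ in range(100)]
          + [rng.gauss(2.0, 1.0) for _ in range(100)])

means, resp = em_two_clusters(scores)

# A hard assignment keeps only the most likely cluster for each document;
# the soft assignment in resp retains the uncertainty near the boundary.
hard_labels = [0 if r[0] >= r[1] else 1 for r in resp]
```

Documents whose score lies near the boundary between the two clusters receive memberships close to 0.5 in `resp`, whereas `hard_labels` forces them into a single cluster.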
url https://publications.muet.edu.pk/index.php/muetrj/article/view/2186