Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique
Clustering-based sentiment analysis offers new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than other traditional machine learning techniq...
Main Authors: | Saba Akmal, Hafiz Muhammad Shahzad Asif |
---|---|
Format: | Article |
Language: | English |
Published: | Mehran University of Engineering and Technology, 2021-07-01 |
Series: | Mehran University Research Journal of Engineering and Technology |
Online Access: | https://publications.muet.edu.pk/index.php/muetrj/article/view/2186 |
id |
doaj-0eb7dff3db2e41589d550427df081bf0 |
---|---|
record_format |
Article |
spelling |
doaj-0eb7dff3db2e41589d550427df081bf0; 2021-07-10T18:10:53Z; eng; Mehran University of Engineering and Technology; Mehran University Research Journal of Engineering and Technology; 0254-7821; 2413-7219; 2021-07-01; 40; 3; 630; 644; 10.22581/muet1982.2103.16; 2186; Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique; Saba Akmal (0); Hafiz Muhammad Shahzad Asif (1); Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan.; Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan.; Clustering-based sentiment analysis offers new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than other traditional machine learning techniques. Combining dimensionality reduction techniques with clustering algorithms strongly influences the computational cost and improves the performance of sentiment analysis. In this research, we applied Principal Component Analysis to reduce the size of the feature set. This reduced feature set improves the binary K-means clustering results of sentiment analysis. In our experiments, we demonstrate that the clustering system with a reduced feature set provides high-quality sentiment analysis. However, K-means clustering has its own limitations, such as hard assignment and instability of results. To overcome these limitations of the traditional K-means algorithm, we applied a soft clustering approach (the Expectation-Maximization algorithm), which stabilizes clustering accuracy. This approach allows soft assignment of documents to clusters. Consequently, our experimental accuracy is 95% with a standard deviation of 0.1%, which is sufficient to apply the clustering technique in real-world applications.; https://publications.muet.edu.pk/index.php/muetrj/article/view/2186 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Saba Akmal Hafiz Muhammad Shahzad Asif |
spellingShingle |
Saba Akmal Hafiz Muhammad Shahzad Asif Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique Mehran University Research Journal of Engineering and Technology |
author_facet |
Saba Akmal Hafiz Muhammad Shahzad Asif |
author_sort |
Saba Akmal |
title |
Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique |
title_short |
Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique |
title_full |
Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique |
title_fullStr |
Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique |
title_full_unstemmed |
Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique |
title_sort |
sentiment analysis based on soft clustering through dimensionality reduction technique |
publisher |
Mehran University of Engineering and Technology |
series |
Mehran University Research Journal of Engineering and Technology |
issn |
0254-7821 2413-7219 |
publishDate |
2021-07-01 |
description |
Clustering-based sentiment analysis offers new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than other traditional machine learning techniques. Combining dimensionality reduction techniques with clustering algorithms strongly influences the computational cost and improves the performance of sentiment analysis. In this research, we applied Principal Component Analysis to reduce the size of the feature set. This reduced feature set improves the binary K-means clustering results of sentiment analysis. In our experiments, we demonstrate that the clustering system with a reduced feature set provides high-quality sentiment analysis. However, K-means clustering has its own limitations, such as hard assignment and instability of results. To overcome these limitations of the traditional K-means algorithm, we applied a soft clustering approach (the Expectation-Maximization algorithm), which stabilizes clustering accuracy. This approach allows soft assignment of documents to clusters. Consequently, our experimental accuracy is 95% with a standard deviation of 0.1%, which is sufficient to apply the clustering technique in real-world applications. |
url |
https://publications.muet.edu.pk/index.php/muetrj/article/view/2186 |
work_keys_str_mv |
AT sabaakmal sentimentanalysisbasedonsoftclusteringthroughdimensionalityreductiontechnique AT hafizmuhammadshahzadasif sentimentanalysisbasedonsoftclusteringthroughdimensionalityreductiontechnique |
_version_ |
1721309783172055040 |
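
The pipeline outlined in the description field (dimensionality reduction followed by soft clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term-count vectors in `docs`, the function names, the choice to keep a single principal component, and the initialisation of the mixture are all assumptions made for illustration, and only the Python standard library is used. The projection is computed by power iteration, and the soft assignment comes from Expectation-Maximization on a two-component one-dimensional Gaussian mixture.

```python
import math

def first_principal_component(rows, iters=200):
    """Project each row onto the top principal component (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d); acceptable for a toy feature set.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Non-uniform start vector so it is unlikely to be orthogonal to the answer.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]

def gaussian_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_clusters(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture; return P(cluster 1 | x)."""
    mu = [min(xs), max(xs)]   # assumed deterministic initialisation
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: soft responsibility of cluster 1 for every point.
        resp = []
        for x in xs:
            p = [weight[k] * gaussian_pdf(x, mu[k], var[k]) for k in (0, 1)]
            resp.append(p[1] / (p[0] + p[1]))
        # M-step: re-estimate mean, variance and weight of each cluster.
        for k in (0, 1):
            r = [q if k == 1 else 1.0 - q for q in resp]
            nk = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            spread = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the density finite
            weight[k] = nk / len(xs)
    return resp

# Hypothetical term-count vectors: the first four documents lean on
# features 0 and 1, the last four on features 2 and 3.
docs = [
    [3, 2, 0, 0], [2, 3, 0, 1], [3, 3, 1, 0], [2, 2, 0, 0],
    [0, 0, 3, 2], [1, 0, 2, 3], [0, 1, 3, 3], [0, 0, 2, 2],
]

scores = first_principal_component(docs)
soft = em_two_clusters(scores)
for doc, p in zip(docs, soft):
    print(doc, round(p, 3))
```

Each printed value is a membership probability rather than a hard label, which is the distinction the description draws between K-means and the soft clustering approach. Which cluster ends up as "cluster 1" depends on the sign of the principal component, so the probabilities should be read as group membership, not as a positive or negative polarity.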