Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique

Bibliographic Details
Main Authors: Saba Akmal, Hafiz Muhammad Shahzad Asif
Format: Article
Language: English
Published: Mehran University of Engineering and Technology 2021-07-01
Series: Mehran University Research Journal of Engineering and Technology
Online Access: https://publications.muet.edu.pk/index.php/muetrj/article/view/2186
Description
Summary: Clustering-based sentiment analysis opens new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than traditional machine learning techniques. Combining dimensionality reduction techniques with clustering algorithms strongly influences the computational cost and improves the performance of sentiment analysis. In this research, we applied Principal Component Analysis (PCA) to reduce the size of the feature set. The reduced feature set improves the results of binary K-means clustering for sentiment analysis. In our experiments, we demonstrate that the clustering system with a reduced feature set provides high-quality sentiment analysis. However, K-means clustering has its own limitations, such as hard assignment and instability of results. To overcome these limitations of the traditional K-means algorithm, we applied a soft clustering approach (the Expectation-Maximization algorithm), which stabilizes clustering accuracy. This approach allows a soft assignment of documents to clusters. Consequently, our experimental accuracy is 95% with a standard deviation of 0.1%, which is sufficient to apply the clustering technique in real-world applications.
ISSN: 0254-7821, 2413-7219
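
The contrast the summary draws between hard assignment (K-means) and soft assignment (Expectation-Maximization) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each document has already been reduced to a single numeric score (a hypothetical stand-in for a PCA-reduced feature), fits a two-component one-dimensional Gaussian mixture with EM, and uses made-up scores. The names gaussian_pdf, em_two_clusters, and scores are illustrative only.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_clusters(xs, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns the component means and, for every point, its soft
    membership (responsibility) in each of the two clusters.
    """
    # Assumed initialisation: start the two means at the extremes of the data.
    means = [min(xs), max(xs)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: soft assignment, i.e. the posterior probability of each cluster.
        resp = []
        for x in xs:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
            weights[k] = nk / len(xs)
    return means, resp


# Hypothetical 1-D document scores: negative-leaning, positive-leaning, one near zero.
scores = [-2.1, -1.9, -2.4, -1.7, -2.0, 1.8, 2.2, 2.0, 1.6, 2.3, 0.1]
means, resp = em_two_clusters(scores)

for x, r in zip(scores, resp):
    hard = 0 if r[0] >= r[1] else 1  # what a hard assignment would keep
    print(f"score={x:+.1f}  P(cluster 0)={r[0]:.3f}  P(cluster 1)={r[1]:.3f}  hard label={hard}")
```

A hard assignment keeps only the final label for each document, whereas the soft assignment retains the membership probabilities. In a fuller pipeline the scores would come from a dimensionality reduction step rather than being typed in by hand.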