Unsupervised and Supervised Methods to Estimate Temporal-Aware Contradictions in Online Course Reviews

The analysis of user-generated content on the Internet has become increasingly popular for a wide variety of applications. One particular type of content is represented by the user reviews for programs, multimedia, products, and so on. Investigating the opinion contained by reviews may help in follo...

Full description

Bibliographic Details
Main Authors: Badache, I. (Author), Chifu, A.-G (Author), Fournier, S. (Author)
Format: Article
Language:English
Published: MDPI 2022
Subjects:
Online Access:View Fulltext in Publisher
Description
Summary:The analysis of user-generated content on the Internet has become increasingly popular for a wide variety of applications. One particular type of content is represented by the user reviews for programs, multimedia, products, and so on. Investigating the opinion contained by reviews may help in following the evolution of the reviewed items and thus in improving their quality. Detecting contradictory opinions in reviews is crucial when evaluating the quality of the respective resource. This article aims to estimate the contradiction intensity (strength) in the context of online courses (MOOC). This estimation was based on review ratings and on sentiment polarity in the comments, with respect to specific aspects, such as “lecturer”, “presentation”, etc. Between course sessions, users stop reviewing, and also, the course contents may evolve. Thus, the reviews are time dependent, and this is why they should be considered grouped by the course sessions. Having this in mind, the contribution of this paper is threefold: (a) defining the notion of subjective contradiction around specific aspects and then estimating its intensity based on sentiment polarity, review ratings, and temporality; (b) developing a dataset to evaluate the contradiction intensity measure, which was annotated based on a user study; (c) comparing our unsupervised method with supervised methods with automatic feature selection, over the dataset. The dataset collected from coursera.org is in English. It includes 2244 courses and 73,873 user-generated reviews of those courses.The results proved that the standard deviation of the ratings, the standard deviation of the polarities, and the number of reviews are suitable features for predicting the contradiction intensity classes. Among the supervised methods, the J48 decision trees algorithm yielded the best performance, compared to the naive Bayes model and the SVM model. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
ISBN:22277390 (ISSN)
DOI:10.3390/math10050809