Probabilistic Approaches to Consumer-generated Review Recommendation

Bibliographic Details
Main Author: Zhang, Richong
Language: en
Published: 2011
Subjects:
Online Access: http://hdl.handle.net/10393/19935
Description
Summary: Consumer-generated reviews play an important role in online purchase decisions for many consumers. However, the quality and helpfulness of online reviews vary significantly. In addition, the helpfulness of individual consumer-generated reviews is not apparent to consumers unless they carefully analyze an overwhelming amount of available content. It is therefore vital to develop predictive models that can evaluate online product reviews efficiently and display the most useful reviews to consumers, assisting them in making purchase decisions. This thesis examines the problem of building computational models that predict whether a consumer-generated review is helpful, based on consumers' online votes on other reviews (where a consumer's vote on a review is either HELPFUL or UNHELPFUL), with the aim of suggesting the most suitable products and vendors to consumers.
In particular, we propose in this thesis three different helpfulness prediction approaches for consumer-generated reviews. Our entropy-based approach is relatively simple and suits applications that require a simple recommendation engine and have fully-voted reviews. However, our entropy-based approach, like the existing approaches, lacks a general framework and is limited to fully-voted reviews. We therefore present a probabilistic helpfulness prediction framework to overcome these limitations. To demonstrate the versatility and flexibility of this framework, we propose an EM-based model and a logistic regression-based model. We show that the EM-based model can use reviews voted on by a very small number of voters as its training set, and that the logistic regression-based model is suitable for real-time helpfulness prediction of consumer-generated reviews. To the best of our knowledge, this is the first framework for modeling review helpfulness and measuring the goodness of models.
Although this thesis primarily considers the problem of review helpfulness prediction, the presented probabilistic methodologies are, in general, applicable to developing recommender systems that make recommendations based on other forms of user-generated content.