To Cluster, or not to Cluster: The Impact of Clustering on the Performance of Aspect-based Collaborative Filtering

Collaborative filtering (CF) is one of the most widely utilised approaches in recommendation techniques. It suggests items to users based on the ratings of other users who share their preferences. Thus, one of the aims of CF is to find reliable neighbours. Typically, CF produces a sparse user-item r...


Bibliographic Details
Main Authors: AL-Ghuribi, S.M (Author), Mohammed, M.A (Author), Murshed, B.A.H (Author), Noah, S.A.M (Author), Qasem, S.N (Author)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2023
Subjects:
Online Access: View Fulltext in Publisher
View in Scopus
LEADER 03632nam a2200445Ia 4500
001 10.1109-ACCESS.2023.3270260
008 230529s2023 CNT 000 0 und d
020 |a 21693536 (ISSN) 
245 1 0 |a To Cluster, or not to Cluster: The Impact of Clustering on the Performance of Aspect-based Collaborative Filtering 
260 0 |b Institute of Electrical and Electronics Engineers Inc.  |c 2023 
300 |a 1 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1109/ACCESS.2023.3270260 
856 |z View in Scopus  |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85159686462&doi=10.1109%2fACCESS.2023.3270260&partnerID=40&md5=5cc0fb96e7c44c9964a5df37531895be 
520 3 |a Collaborative filtering (CF) is one of the most widely used approaches to recommendation. It suggests items to users based on the ratings of other users who share their preferences, so one of the aims of CF is to find reliable neighbours. Typically, CF works with a sparse user-item rating matrix, and relying only on the ratings to identify precise neighbours results in poor performance. User reviews can be essential in overcoming this problem because of the diverse elements available in reviews. The most popular element is aspects, which can provide a fine-grained analysis of users’ behaviours and thus improve personalised recommendations. However, increasing the number of aspects also increases sparsity and may therefore degrade recommendation performance. Clustering the aspects may lessen this sparsity, but it is still unclear how much this affects the performance of CF systems. This study proposes a CF approach based on aspect clustering that addresses this issue in terms of rating prediction. The approach aims to reduce sparseness in the multi-criteria rating matrix by grouping aspects into clusters based on their semantic similarity, which makes discovering the neighbourhood set less expensive and less memory-intensive. The approach extracts aspects and represents them using Google’s pre-trained Word2vec model. The aspects are then organised into clusters using the K-means clustering algorithm. Multi-dimensional Euclidean distance is used as the similarity measure for finding appropriate neighbours, and ratings of unseen items are then predicted using the kNN algorithm. The study also identifies the number of aspects that significantly impacts CF performance. Experiments are carried out on a real large-scale dataset, the Amazon movie dataset. The proposed approach is evaluated by comparing its CF performance with three different baseline approaches. Results show that the proposed approach improves CF performance compared to the other approaches in terms of three predictive accuracy metrics. 
650 0 4 |a Aspect 
650 0 4 |a aspects 
650 0 4 |a Clustering algorithms 
650 0 4 |a Clusterings 
650 0 4 |a collaborative filtering 
650 0 4 |a Collaborative filtering 
650 0 4 |a Computer science 
650 0 4 |a Costs 
650 0 4 |a Euclidean distance 
650 0 4 |a Filtering performance 
650 0 4 |a K-means clustering 
650 0 4 |a K-means++ clustering 
650 0 4 |a matrix 
650 0 4 |a Motion pictures 
650 0 4 |a Performance 
650 0 4 |a Prediction algorithms 
650 0 4 |a Semantics 
650 0 4 |a user reviews 
650 0 4 |a User reviews 
650 0 4 |a Word2vec 
700 1 0 |a AL-Ghuribi, S.M.  |e author 
700 1 0 |a Mohammed, M.A.  |e author 
700 1 0 |a Murshed, B.A.H.  |e author 
700 1 0 |a Noah, S.A.M.  |e author 
700 1 0 |a Qasem, S.N.  |e author 
773 |t IEEE Access
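
The pipeline summarised in the abstract (embed aspects, cluster them with K-means, compare users by Euclidean distance over cluster-level ratings, predict with kNN) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Everything concrete here is an assumption made for illustration: the toy 2-D vectors stand in for Word2vec embeddings, the K-means initialisation simply takes the first k points, missing clusters default to 0.0, neighbours are weighted by 1 / (1 + distance), and all names and data are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means. Initial centroids are the first k points,
    a simplification chosen only to keep the sketch deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_ratings(aspect_ratings, cluster_of, k):
    """Average one review's aspect ratings within each aspect cluster.
    Clusters with no rated aspect get 0.0 (an assumption of this sketch)."""
    sums, counts = [0.0] * k, [0] * k
    for aspect, rating in aspect_ratings.items():
        c = cluster_of[aspect]
        sums[c] += rating
        counts[c] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]

def user_distance(u, v, profiles):
    """Euclidean distance over the cluster ratings of items both users reviewed."""
    shared = sorted(set(profiles[u]) & set(profiles[v]))
    if not shared:
        return math.inf
    a = [x for item in shared for x in profiles[u][item]]
    b = [x for item in shared for x in profiles[v][item]]
    return euclidean(a, b)

def predict(user, item, profiles, overall, n_neighbours=2):
    """kNN prediction: weighted mean of the nearest neighbours' overall
    ratings, with weights 1 / (1 + distance) (an assumed weighting)."""
    candidates = [(user_distance(user, v, profiles), v)
                  for v in profiles if v != user and item in overall[v]]
    nearest = sorted(candidates)[:n_neighbours]
    weights = [1.0 / (1.0 + d) for d, _ in nearest]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * overall[v][item] for w, (_, v) in zip(weights, nearest)) / total

# Hypothetical toy data: 2-D vectors in place of Word2vec embeddings.
aspect_vectors = {"acting": (0.9, 0.1), "plot": (0.1, 0.9),
                  "cast": (0.8, 0.2), "story": (0.2, 0.8)}
K = 2
labels = kmeans(list(aspect_vectors.values()), K)
cluster_of = dict(zip(aspect_vectors, labels))

# Per-review aspect ratings and overall ratings (made up for illustration).
reviews = {
    "ann": {"m1": {"acting": 5, "cast": 4, "plot": 2, "story": 2}},
    "bob": {"m1": {"acting": 5, "cast": 5, "plot": 2, "story": 1}},
    "cat": {"m1": {"acting": 1, "cast": 2, "plot": 5, "story": 5}},
}
overall = {"ann": {"m1": 4}, "bob": {"m1": 4, "m2": 5}, "cat": {"m1": 3, "m2": 1}}

# Collapse four aspect columns into K cluster columns per review.
profiles = {u: {item: cluster_ratings(r, cluster_of, K) for item, r in items.items()}
            for u, items in reviews.items()}

prediction = predict("ann", "m2", profiles, overall)
```

In this toy setting the four aspects collapse into two cluster columns, which is the sparsity reduction the abstract describes. A fuller version would presumably load real embeddings and use a library K-means, possibly with K-means++ seeding as the subject terms suggest.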