A Rating-based Similarity Measure for Recommendation Systems

Bibliographic Details
Main Authors: Pin-Yen Liao, 廖品姸
Other Authors: Li-Hua Li
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/75329902323058503300
Description
Summary: Master's === Chaoyang University of Technology === Master's Program, Department of Information Management === 98 === Recommendation systems (RS) are commonly used to cope with information overload, to make recommendations, and to make predictions, especially in the current Internet environment. According to previous studies, the most commonly applied method in RS is the similarity measure, which is typically used to find similar users, or neighbors. One common approach is to compute similarity from user ratings, also called explicit ratings. The rating difference or distance between the active user and a similar user is then used for prediction, so the quality of the similarity measure directly affects the result of the RS. Similarity measures such as Pearson Correlation (PCC), Cosine Similarity (COS), Constrained Pearson’s Correlation (CPC), Adjusted Cosine (ACOS), PIP, and Euclidean Distance (ED) are widely used for finding similar users or similar items. These measures usually rely on a user-item matrix whose explicit ratings are used for calculation. Recommendations are usually based on information from the similar user (item); hence the similarity measure used to find the similar user (item) is critical for RS. However, the traditional similarity measures have several problems when applied to RS. (1) They do not take the positive rating and the co-rating count into consideration. (2) When the rating value is equal to the average rating value, PCC and COS encounter a division-by-zero problem, which can cause system failure. (3) The scalability problem is usually not discussed. To handle these problems, this research proposes a Rating-Based Similarity Measure (RBSM). The method transforms the explicit ratings into binary values and considers both the positive rating and the co-rating count when finding similar users.
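Problem (2) can be reproduced with a small sketch. The code below is an illustration, not the thesis's implementation: the function name `pearson`, the sample users, and the choice to average over co-rated items only are assumptions made for the example. It shows that when every rating of one user equals that user's average, the PCC denominator is zero, so a naive implementation would raise a division error.

```python
import math

def pearson(u, v):
    """Pearson correlation over the items co-rated by u and v.

    u and v are dicts mapping item -> rating (hypothetical data layout).
    Returns None when the correlation is undefined (no co-rated items or
    a zero denominator) instead of dividing by zero.
    """
    common = sorted(set(u) & set(v))
    if not common:
        return None
    # Assumption: means are taken over co-rated items only.
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((v[i] - mean_v) ** 2 for i in common))
    den = den_u * den_v
    if den == 0:
        # Every rating equals the mean for at least one user.
        return None
    return num / den

# Hypothetical users: alice rates everything 3, so her deviations are all zero.
alice = {"a": 3, "b": 3, "c": 3}
bob = {"a": 1, "b": 4, "c": 5}
carol = {"a": 2, "b": 4, "c": 5}

print(pearson(alice, bob))   # undefined: zero denominator
print(pearson(bob, carol))   # well defined, close to 1
```

Returning `None` is only one way to guard the computation; a system could equally skip such user pairs or fall back to another measure.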
A simple similarity computation is proposed to find the neighborhood. To find similar neighbors efficiently, users are divided into three groups (high, medium, and low) based on their co-rating amount. The experiments show that the method achieves better recall, F1, and MAE values than the traditional methods. The results also show that the proposed method can handle the scalability problem.
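The abstract does not give the RBSM formula, so the following is only a minimal sketch of the general idea it describes: binarize explicit ratings, count positive co-ratings, and bucket neighbors by co-rating count. Everything specific here is an assumption for illustration, including the positive-rating threshold of 3, the similarity definition (share of co-rated items that both users rate positively), the group cut points, and all function names.

```python
def binarize(ratings, threshold=3):
    """Map explicit ratings to 1 (positive) or 0 (not positive).

    The threshold is a hypothetical choice, not taken from the thesis.
    """
    return {item: 1 if r >= threshold else 0 for item, r in ratings.items()}

def binary_similarity(u, v, threshold=3):
    """Return (similarity, co_rating_count) for two users.

    Illustrative formula: the fraction of co-rated items that both users
    rate positively. The denominator is the co-rating count, so the only
    zero-denominator case is "no co-rated items", handled explicitly.
    """
    bu, bv = binarize(u, threshold), binarize(v, threshold)
    common = set(bu) & set(bv)
    if not common:
        return 0.0, 0
    both_positive = sum(1 for i in common if bu[i] == 1 and bv[i] == 1)
    return both_positive / len(common), len(common)

def co_rating_group(count, low_max=2, medium_max=5):
    """Assign a neighbor to a low/medium/high group by co-rating count.

    The cut points low_max and medium_max are assumed values.
    """
    if count <= low_max:
        return "low"
    if count <= medium_max:
        return "medium"
    return "high"

# Hypothetical ratings on a 1-5 scale.
active = {"a": 5, "b": 4, "c": 1, "d": 3}
other = {"a": 4, "b": 2, "c": 1, "d": 5, "e": 5}

sim, count = binary_similarity(active, other)
print(sim, count, co_rating_group(count))
```

With these sample users, four items are co-rated and two of them are rated positively by both, so the pair falls in the "medium" group under the assumed cut points.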