A Hybrid Prediction System for the Seller and the Product Category of Women’s Apparel at Taobao


Bibliographic Details
Main Authors: Chengy-yi Chien, 簡政誼
Other Authors: 何靖遠
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/09295845367594272959
Description
Summary: Master’s thesis === National Central University === Department of Information Management === 102 === In recent years, more and more sellers have expanded their businesses through e-commerce auction platforms. As e-commerce keeps growing, doing business on the Internet becomes more competitive. If a customer’s purchase behavior (what to buy and from whom) can be predicted, the seller can retain its customers and increase its revenue more cost-effectively. In the literature we surveyed, classification models such as logistic regression (LR) were rarely used to predict a product category from which a consumer has not yet purchased. A recommendation system can find the products preferred by similar customers by combining collaborative filtering with sequential pattern mining (SPM), but it suffers from data sparsity. We propose a hybrid prediction system based on RFM (recency, frequency, monetary) that combines the LR model for predicting the product category with the purchase patterns of most customers mined by SPM, in order to estimate the probability of purchasing from a particular seller and a particular product category. We target the most popular product category, women’s apparel, on “Taobao”, the largest cross-strait auction platform, and collected the trading records between Jan. 1, 2013 and April 1, 2013 using web mining technology. First, we identify the parameters used in RFM-SPM, and then determine the most appropriate weight used in the Hybrid system. We then use the precision, recall, and F1 measures to compare the three prediction systems: RFM-LR, RFM-SPM, and the Hybrid. The results show that, among the three prediction systems, the Hybrid achieves the highest performance on all three measures in predicting the seller (0.75) and the seller×product category (0.6), while RFM-SPM achieves the lowest. In predicting the purchase behavior of customer clusters, the Hybrid again shows the best performance in terms of the F1 measure, which is 0.75~0.82 for the low F/high M cluster and 0.9 for the low F/low M cluster.
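
The summary does not state how the Hybrid combines the two models, nor how its weight is applied. As a minimal sketch only, the Python below assumes a simple convex combination of an LR probability and an SPM-derived score, and computes the standard set-based precision, recall, and F1 measures. The function names, the weight semantics, and all sample values are hypothetical illustrations, not the thesis implementation or its data.

```python
from typing import Set, Tuple


def hybrid_score(p_lr: float, p_spm: float, weight: float) -> float:
    """Blend two scores with a weight in [0, 1].

    ASSUMPTION: a convex combination; the thesis summary only says that
    a weight is determined for the Hybrid system, not how it is used.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * p_lr + (1.0 - weight) * p_spm


def precision_recall_f1(predicted: Set[str], actual: Set[str]) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 over sets of predicted and actual items."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    denominator = precision + recall
    f1 = 2.0 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical seller identifiers, for illustration only.
    predicted_sellers = {"seller_a", "seller_b", "seller_c"}
    actual_sellers = {"seller_b", "seller_c", "seller_d", "seller_e"}
    print(hybrid_score(p_lr=0.8, p_spm=0.4, weight=0.5))
    print(precision_recall_f1(predicted_sellers, actual_sellers))
```

Under this assumed form, a weight of 1 reduces the blend to the LR score alone and a weight of 0 to the SPM score alone, so a weight could be chosen by scanning values in between and keeping the one with the best F1 on held-out records.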