Summary: | Most existing bike usage prediction studies build models that fit continuous rather than categorical bike usage data, which may lead to over-fitting and thereby limit the models' ability to capture more general trends in bike usage. This study explores a multi-categorical probabilistic approach to shared-bike demand prediction. To overcome the weakness of describing bike usage conditions with single-point measurements, we construct three alternative indicators that capture the range, local variation, and trend of bike usage over a short time period. Suitable indicator variables are selected based on the results of Principal Component Analysis (PCA). A Gaussian Mixture Model (GMM) is adopted to cluster homogeneous bike usage states. A Markov chain model is then developed on the identified states to forecast categorical changes in bike usage. Finally, to examine the effectiveness of the proposed approach, the persistence model is employed as a benchmark, and two measures, Percent Correct (PC) and the Heidke Skill Score (HSS), are introduced to quantify prediction performance on categorical data. The results show that the proposed approach offers high accuracy, skill, reliability, and discrimination at suitable prediction intervals.
|
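The evaluation step described in the summary can be illustrated with a minimal sketch. It assumes the bike usage states are already encoded as integers 0..K-1 (for example, as cluster labels), estimates a first-order Markov transition matrix from a state sequence, and computes PC and HSS using their standard multi-category definitions. The function names are hypothetical, and the study's exact formulation may differ.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate first-order Markov transition probabilities from a state sequence."""
    counts = np.zeros((n_states, n_states))
    for current, following in zip(states[:-1], states[1:]):
        counts[current, following] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums == 0, 1, row_sums)
    # Assumption: states never observed as a starting point get a uniform row.
    return np.where(row_sums > 0, counts / safe_sums, 1.0 / n_states)

def percent_correct(observed, predicted):
    """Fraction of forecasts whose category matches the observation."""
    return float(np.mean(np.asarray(observed) == np.asarray(predicted)))

def heidke_skill_score(observed, predicted, n_states):
    """Multi-category HSS: accuracy relative to what chance agreement would give."""
    table = np.zeros((n_states, n_states))
    for obs, pred in zip(observed, predicted):
        table[pred, obs] += 1
    n = table.sum()
    pc = np.trace(table) / n
    # Expected chance agreement from the marginal totals of the contingency table.
    expected = np.sum(table.sum(axis=0) * table.sum(axis=1)) / n**2
    # Note: undefined when expected == 1 (a single category in both series).
    return float((pc - expected) / (1.0 - expected))

# Illustrative state sequence (made up for this sketch, not data from the study).
states = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]
P = transition_matrix(states, n_states=3)

observed = states[1:]
markov_forecast = [int(np.argmax(P[s])) for s in states[:-1]]  # most likely next state
persistence_forecast = states[:-1]                             # next state = current state

for name, forecast in [("Markov", markov_forecast), ("persistence", persistence_forecast)]:
    print(name, percent_correct(observed, forecast), heidke_skill_score(observed, forecast, 3))
```

Here the persistence forecast simply repeats the current state, which is the benchmark role it plays in the summary. An HSS of 0 indicates no improvement over chance agreement, and 1 indicates a perfect categorical forecast.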