Summary: | In this paper, we propose a data-driven model for predicting travel speeds on urban roads from the GPS trajectories of vehicles. Although this task is strategically important in many traffic monitoring systems, it has not yet been solved well because of two challenges. The first is data sparsity, i.e., many road segments are not traversed by any GPS-equipped vehicle in some time slots, so an effective model must predict travel speed from incomplete observations. The second is that traffic conditions influence the travel speed on a road, but their pattern is hard to capture because it fluctuates irregularly. To address these problems, we propose a model based on probabilistic principal component analysis (PPCA) to predict urban road travel speed, which can handle data sparsity. In addition, to improve the prediction performance of this probabilistic generative model, we incorporate traffic conditions by partitioning roads into clusters with a spectral clustering method. Performing prediction on each cluster separately reduces the variability of traffic conditions within a cluster and makes the computation amenable to parallelization. We evaluate the proposed method in a case study on the citywide road network of Shanghai, using GPS trajectories generated by over 13,000 taxis over a period of one month. Empirical results demonstrate that the model outperforms competing methods in terms of both effectiveness and robustness.
|