Summary: | This article introduces a novel machine learning approach for predicting bus arrival times at the stations along a given itinerary, based on the so-called Traffic Density Matrix (TDM). The TDM provides a localized representation of the traffic information in an urban area that can be specified by the user. We show that such data is necessary to meet both short-term and long-term prediction objectives, and demonstrate that a global prediction approach is not a feasible solution. Several prediction approaches are then proposed and experimentally evaluated on various simulation scenarios. They include traditional machine learning techniques, such as linear regression and support vector machines (SVM), as well as advanced, highly non-linear neural network-based approaches. In this context, several network architectures are considered and evaluated, including fully connected neural networks (FNN), convolutional neural networks (CNN), recurrent neural networks (RNN) and long short-term memory (LSTM) networks. The experimental evaluation is carried out under two types of scenarios, corresponding to long-term and short-term prediction. For this purpose, two data models are constructed, the Operator Data Model (ODM) and the Client Data Model (CDM), dedicated to long-term and short-term prediction, respectively. The experimental results show that increasing the degree of non-linearity of the predictors is highly beneficial to prediction accuracy. They also show that significant improvements can be achieved over state-of-the-art techniques. For long-term prediction, the FNN method performs best, with an increase in accuracy of more than 66% over the baseline ordinary least squares (OLS) technique. For short-term prediction, the FNN method is also the best performer, with a gain in accuracy of more than 15% with respect to OLS.
|