Summary: | Master's === 中國文化大學 (Chinese Culture University) === Department of Information Management === 106 === Global air pollution is getting worse. Studies have shown that PM2.5 can significantly affect human health because of its tiny particle size, so predicting PM2.5 concentrations is an important issue in controlling and reducing air pollution. Data mining techniques have been widely used for predictive analytics in various applications. In this study, we propose an integrated model that predicts PM2.5 concentrations based on time series analysis and several classification models.
We used 2015 data from the Banqiao monitoring station in Taiwan as the basis for building the prediction models. First, we used stepwise regression to identify the major factors that influenced PM2.5 concentrations. Those factors were then used to build prediction models based on three classification methods: linear, neural network, and support vector machine. Finally, the predictions from the time series model and the classification models were integrated using a linear weighting method.
Experimental results showed that the time series prediction model was better suited to predicting hourly PM2.5 concentrations, whereas the classification models were better suited to predicting daily PM2.5 concentrations. The proposed integrated linear weighting model performed best when the linear and support vector machine methods were used with a weight of 0.7.
|
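The linear weighting step described in the summary can be illustrated with a minimal sketch. The summary does not state which component receives the weight or how the weight was selected, so this sketch assumes the weight applies to the time series forecast, the remainder goes to the classification-model forecast, and candidate weights are compared by RMSE on a grid. The function names `weighted_blend`, `rmse`, and `best_weight` are hypothetical and do not come from the thesis.

```python
def weighted_blend(ts_pred, clf_pred, weight):
    """Blend two forecasts as weight * ts_pred + (1 - weight) * clf_pred.

    Assumption: `weight` is applied to the time series forecast; the source
    does not say which component the reported weight belongs to.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(ts_pred) != len(clf_pred):
        raise ValueError("forecasts must have the same length")
    return [weight * t + (1.0 - weight) * c for t, c in zip(ts_pred, clf_pred)]


def rmse(pred, actual):
    """Root mean squared error between predicted and observed values."""
    squared = [(p - a) ** 2 for p, a in zip(pred, actual)]
    return (sum(squared) / len(squared)) ** 0.5


def best_weight(ts_pred, clf_pred, actual, step=0.1):
    """Grid-search the blend weight with the lowest RMSE.

    Assumption: RMSE and a 0.1 grid are illustrative choices, not
    necessarily the selection criterion used in the thesis.
    """
    n_steps = int(round(1.0 / step))
    candidates = [round(i * step, 10) for i in range(n_steps + 1)]
    return min(
        candidates,
        key=lambda w: rmse(weighted_blend(ts_pred, clf_pred, w), actual),
    )
```

Calling `best_weight` on held-out forecasts from the two models and the observed concentrations returns the grid weight with the lowest error under these assumptions; `weighted_blend` then produces the integrated forecast for that weight.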