The study of the influence of predicting models and sampling techniques to ozone prediction

Master's === Hungkuang University === Graduate Institute of Environmental Engineering === 99 === In Taiwan, the main species causing severe air pollution problems are O3 and PM10. In this study, the daily maximum hourly ozone concentrations at seven air quality stations were chosen as the predicted pollutant. Three predictive models including the multiple linear...


Bibliographic Details
Main Authors: Wen-Cheng Chen, 陳文程
Other Authors: Hsin-Chung Lu
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/40524057138106961314
id ndltd-TW-099HKU05515019
record_format oai_dc
spelling ndltd-TW-099HKU05515019 2015-10-13T20:09:18Z http://ndltd.ncl.edu.tw/handle/40524057138106961314 The study of the influence of predicting models and sampling techniques to ozone prediction 不同模式及取樣技巧對臭氧濃度預測之影響研究 Wen-Cheng Chen 陳文程 Master's Hungkuang University Graduate Institute of Environmental Engineering 99 Hsin-Chung Lu 盧信忠 thesis 255 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Hungkuang University === Graduate Institute of Environmental Engineering === 99 === In Taiwan, the main species causing severe air pollution problems are O3 and PM10. In this study, the daily maximum hourly ozone concentrations at seven air quality stations were chosen as the predicted pollutant. Three predictive models, namely multiple linear regression (MLR), the generalized additive model (GAM) and the multilayer perceptron (MLP) network, and two data input conditions (case I and case II) were used to predict the ozone concentrations. In addition, simulating with a large input data set is often time- and labor-consuming. Therefore, two data selection methods, random sampling (RS) and the Kennard-Stone (K-S) algorithm, were used to select small subsets of samples (n=100, 200, 500, 1000 and 1500) in order to evaluate the predictive ability of a small number of selected samples. The results showed that, when the whole data set was used as input, GAM was the best model at the Hsinchu and Ainang stations and MLP was the best at the other four stations. The MLR model was nearly always the worst at the seven stations. In addition, the R² of the seven stations ranged from 0.29 to 0.62 for case I (only meteorological conditions considered), and improved to 0.56~0.73 when the O3 concentration at 10:00 was added to the meteorological variables (case II). The largest improvement occurred at Taidong station, where R² rose from 0.39 for case I to 0.73 for case II. Regarding the influence of sampling technique and input data size on the simulated results, at Cuting station the best result was found with n=1500 and the RS technique for both case I and case II. At Hsinchu station, the best result was achieved with n=200 and the K-S algorithm for case I, and with n=1500 and the K-S algorithm for case II. At Chungming station, the best results were obtained with n=100 and the K-S algorithm for both cases. At Ainang station, the best results occurred with RS (n=1000) for case I and the K-S algorithm (n=200) for case II. At Changjing station, the best results were found with the K-S algorithm for both cases, with sample sizes of 1500 and 1000, respectively. At Yilan station, the best results were obtained with the K-S algorithm (n=500) for case I and the RS technique (n=1500) for case II. Finally, at Taidong station the best results were achieved with the RS technique for both cases, with sample sizes of 1000 and 1500, respectively. Except for Taidong (case I and case II), Hsinchu (case II) and Yilan (case II), these best results for the different sample sizes and techniques are all better than the results obtained using the whole data set as input. Therefore, besides reducing time and labor, using a small sampled data set as input can sometimes achieve better prediction accuracy when the sample size and technique are appropriately selected.
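The abstract names the two subset-selection methods but does not spell them out. The following is a minimal sketch of how random sampling and the standard Kennard-Stone rule are commonly implemented, not the thesis's own code. The function names `kennard_stone` and `random_sample`, the use of Euclidean distance on the raw input variables, and the fixed random seed are all assumptions made for illustration.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return row indices of X picked by the Kennard-Stone rule (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of rows")
    # Full pairwise Euclidean distance matrix; O(n^2) memory, acceptable for a sketch.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Step 1: start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    # min_dist[k] is the distance from sample k to its nearest selected sample.
    min_dist = np.minimum(dist[i], dist[j])
    # Step 2: repeatedly add the sample farthest from everything selected so far.
    while len(selected) < n_select:
        min_dist[selected] = -1.0  # exclude samples that are already selected
        k = int(np.argmax(min_dist))
        selected.append(k)
        min_dist = np.minimum(min_dist, dist[k])
    return np.array(selected)


def random_sample(n_rows, n_select, seed=0):
    """Return n_select distinct row indices drawn uniformly at random."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_rows, size=n_select, replace=False)
```

Either function returns indices that could be used to slice the input matrix and target vector before fitting MLR, GAM or MLP. Random sampling preserves the data's empirical distribution on average, whereas Kennard-Stone spreads the selected points evenly over the input space, which may explain why the better technique differs from station to station.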
author2 Hsin-Chung Lu
author_facet Hsin-Chung Lu
Wen-Cheng Chen
陳文程
author Wen-Cheng Chen
陳文程
title The study of the influence of predicting models and sampling techniques to ozone prediction
url http://ndltd.ncl.edu.tw/handle/40524057138106961314