Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case
Master's === Chinese Culture University === Department of Information Management === 101 === Professional baseball relies heavily on data collection and analysis because every game generates a large volume of data. Data mining applies computational analysis techniques to extract important findings from large amounts of data....
Main Authors: | Fong, Ruei-Shiang, 馮瑞祥 |
---|---|
Other Authors: | Hwang, Chein-Shung |
Format: | Others |
Language: | zh-TW |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/13879224135755534063 |
id |
ndltd-TW-100PCCU1396051 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-100PCCU13960512015-10-13T22:18:44Z http://ndltd.ncl.edu.tw/handle/13879224135755534063 Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case 運用資料探勘技術於職棒比賽勝負預測之研究-以美國職棒大聯盟為例 Fong, Ruei-Shiang 馮瑞祥 碩士 中國文化大學 資訊管理學系 101 Professional baseball games emphasize data collection and analysis because each game provides plenty of data that needs to be analyzed. Data mining methods involve computer analysis techniques with which a crucial outcome can be found from a huge amount of data. The data mining techniques thus can be used to efficiently analyze the data of professional baseball and also avoid the mistakes often caused by manual analysis. This study aims to predict the outcome and scores of professional baseball games in MLB. The data of the study are all the regular season games from 2000 to 2012 of thirty teams in MLB. The variables are the average statistics of both the fielders’ and the pitchers’ performances in the last ten games. First, we used the Pearson product-moment correlation coefficient to delete the unrelated variables and variables of multicollinearity and to select the suitable variables. Then we applied the Back Propagation Network (BPN) of the artificial neural network to build a model for the selected variables. The first 100 games served as the training set of the model while the later 62 games as the validation set. After obtaining the predicted scores of each game, we compared them to the real outcome of the games and the money line. After using the output model to predict the scores of the host and the guest, we further compared them with the real outcome, run line, and money line of sports gambling. The experimental results have proven that the model of this study provided better prediction accuracy. Follow-up researchers may consider using different variables for the model to improve the accuracy of the predictions. Hwang, Chein-Shung 黃謙順 2013 學位論文 ; thesis 79 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Chinese Culture University === Department of Information Management === 101 === Professional baseball relies heavily on data collection and analysis because every game generates a large volume of data. Data mining applies computational analysis techniques to extract important findings from large amounts of data. These techniques can therefore analyze professional baseball data efficiently and avoid the mistakes that manual analysis often introduces. This study aims to predict the outcomes and scores of Major League Baseball (MLB) games.
The data comprise all regular-season games played by the thirty MLB teams from 2000 to 2012. The variables are the average statistics of the fielders' and the pitchers' performances over the last ten games. First, we used the Pearson product-moment correlation coefficient to remove unrelated variables and variables exhibiting multicollinearity, and to select suitable variables. We then applied a Back Propagation Network (BPN), a type of artificial neural network, to build a model from the selected variables. The first 100 games served as the training set and the remaining 62 games as the validation set. After obtaining the predicted scores for each game, we compared them with the actual outcomes of the games and the money line.
After using the resulting model to predict the scores of the home and visiting teams, we further compared the predictions with the actual outcomes and with the run line and money line used in sports betting. The experimental results indicate that the model provided better prediction accuracy. Future researchers may consider using different variables to improve the model's prediction accuracy.
|
author2 |
Hwang, Chein-Shung |
author_facet |
Hwang, Chein-Shung Fong, Ruei-Shiang 馮瑞祥 |
author |
Fong, Ruei-Shiang 馮瑞祥 |
spellingShingle |
Fong, Ruei-Shiang 馮瑞祥 Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case |
author_sort |
Fong, Ruei-Shiang |
title |
Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case |
title_sort |
studies on predicting the outcome of professional baseball games with data mining techniques: mlb as a case |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/13879224135755534063 |
work_keys_str_mv |
AT fongrueishiang studiesonpredictingtheoutcomeofprofessionalbaseballgameswithdataminingtechniquesmlbasacase AT féngruìxiáng studiesonpredictingtheoutcomeofprofessionalbaseballgameswithdataminingtechniquesmlbasacase AT fongrueishiang yùnyòngzīliàotànkānjìshùyúzhíbàngbǐsàishèngfùyùcèzhīyánjiūyǐměiguózhíbàngdàliánméngwèilì AT féngruìxiáng yùnyòngzīliàotànkānjìshùyúzhíbàngbǐsàishèngfùyùcèzhīyánjiūyǐměiguózhíbàngdàliánméngwèilì |
_version_ |
1718074692575690752 |
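
The workflow summarized in the description field (ten-game averages, Pearson-based variable screening, a back-propagation network, and a 100/62 game split) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. All function names, the correlation thresholds (0.1 and 0.9), the hidden-layer size, the learning rate, and the epoch count are assumptions chosen for illustration, and the toy numbers are made up. The sketch also assumes that "the last ten games" means the ten games before the one being predicted, and it predicts a single score rather than both teams' scores.

```python
import math
import random


def rolling_mean_prior(values, window=10):
    """Average of the `window` games before each game (None until enough history)."""
    return [
        sum(values[i - window:i]) / window if i >= window else None
        for i in range(len(values))
    ]


def pearson(x, y):
    """Pearson product-moment correlation; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def select_features(features, target, min_target_r=0.1, max_pair_r=0.9):
    """Keep features related to the target; skip ones collinear with a kept one."""
    ranked = sorted(features, key=lambda k: -abs(pearson(features[k], target)))
    kept = []
    for name in ranked:
        if abs(pearson(features[name], target)) < min_target_r:
            continue  # unrelated to the target (assumed threshold)
        if any(abs(pearson(features[name], features[k])) > max_pair_r for k in kept):
            continue  # multicollinear with an already-kept feature
        kept.append(name)
    return kept


def _forward(W1, W2, x):
    """Sigmoid hidden layer, linear output; the last weight in each row is a bias."""
    h = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + row[-1])))
         for row in W1]
    return h, sum(w * v for w, v in zip(W2, h)) + W2[-1]


def train_bpn(X, y, n_hidden=4, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer network with per-sample backpropagation."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, target in zip(X, y):
            h, out = _forward(W1, W2, x)
            err = out - target  # derivative of 0.5 * err**2 w.r.t. the output
            # Hidden-layer deltas use the output weights before they are updated.
            delta_h = [err * W2[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden):
                W2[j] -= lr * err * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h[j] * x[i]
                W1[j][-1] -= lr * delta_h[j]
            W2[-1] -= lr * err
    return W1, W2


def predict(model, x):
    """Predicted score for one feature vector."""
    return _forward(model[0], model[1], x)[1]


def mse(model, X, y):
    """Mean squared error of the model on a data set."""
    return sum((predict(model, x) - t) ** 2 for x, t in zip(X, y)) / len(y)


# Toy usage with made-up numbers (not data from the thesis).
runs = [3, 5, 2, 4, 6, 1, 7, 3, 4, 5, 9, 0]
print(rolling_mean_prior(runs)[10])  # 4.0, the mean of the first ten games
games = list(range(162))
train, valid = games[:100], games[100:]  # first 100 train, remaining 62 validate
```

Splitting chronologically rather than at random keeps later games out of the training set, which matches the abstract's description of the first 100 games as training data and the remaining 62 as validation data.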