Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case

Master's === Chinese Culture University === Department of Information Management === 101 === Professional baseball places heavy emphasis on data collection and analysis because every game generates a large amount of data. Data mining uses computational analysis to extract key findings from large data sets...


Bibliographic Details
Main Authors: Fong, Ruei-Shiang, 馮瑞祥
Other Authors: Hwang, Chein-Shung
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/13879224135755534063
id ndltd-TW-100PCCU1396051
record_format oai_dc
spelling ndltd-TW-100PCCU1396051 2015-10-13T22:18:44Z http://ndltd.ncl.edu.tw/handle/13879224135755534063 Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case 運用資料探勘技術於職棒比賽勝負預測之研究-以美國職棒大聯盟為例 Fong, Ruei-Shiang 馮瑞祥 Master's Chinese Culture University Department of Information Management 101 Hwang, Chein-Shung 黃謙順 2013 thesis 79 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Chinese Culture University === Department of Information Management === 101 === Professional baseball places heavy emphasis on data collection and analysis because every game generates a large amount of data. Data mining uses computational analysis to extract key findings from large data sets, so it can analyze professional baseball data efficiently and avoid the mistakes that manual analysis often introduces. This study aims to predict the outcomes and scores of Major League Baseball (MLB) games. The data cover all regular-season games played by the thirty MLB teams from 2000 to 2012. The variables are the average statistics of both the fielders' and the pitchers' performances over the last ten games. First, we used the Pearson product-moment correlation coefficient to remove irrelevant and multicollinear variables and to select suitable ones. We then applied a Back Propagation Network (BPN), a type of artificial neural network, to build a model from the selected variables. The first 100 games served as the training set and the remaining 62 games as the validation set. After using the resulting model to predict the home and visiting teams' scores for each game, we compared the predictions with the actual outcomes and with the run line and money line used in sports betting. The experimental results indicate that the model in this study provided better prediction accuracy. Follow-up research may consider using different variables to improve prediction accuracy.
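The preprocessing steps named in the abstract (averaging statistics over the last ten games, Pearson-based variable screening, and a 100/62 chronological split) can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation: the function names, the feature names, and the thresholds `min_target_r` and `max_pair_r` are assumptions, since the record does not state which cut-offs or variables the author used. The BPN model itself is not shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / sqrt(vx * vy)


def rolling_mean(values, window=10):
    """Mean of the previous `window` games for each game.

    Entries without enough history are None, so a game's own
    statistics never leak into its predictors.
    """
    out = []
    for i in range(len(values)):
        if i < window:
            out.append(None)
        else:
            out.append(sum(values[i - window:i]) / window)
    return out


def select_features(features, target, min_target_r=0.3, max_pair_r=0.9):
    """Greedy Pearson screening (thresholds are hypothetical).

    Drops variables weakly correlated with the target, then drops any
    variable that is strongly correlated with one already kept, which
    is one simple way to limit multicollinearity.
    """
    strength = {k: abs(pearson(v, target)) for k, v in features.items()}
    kept = []
    for name in sorted(features, key=strength.get, reverse=True):
        if strength[name] < min_target_r:
            continue
        if any(abs(pearson(features[name], features[k])) > max_pair_r
               for k in kept):
            continue
        kept.append(name)
    return kept


def chronological_split(rows, n_train=100):
    """First `n_train` games for training, the rest for validation."""
    return rows[:n_train], rows[n_train:]
```

With 162 games in order, `chronological_split` returns 100 training rows and 62 validation rows, matching the split described in the abstract. Splitting by game order rather than at random keeps later games out of the training set.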
author2 Hwang, Chein-Shung
author_facet Hwang, Chein-Shung
Fong, Ruei-Shiang
馮瑞祥
author Fong, Ruei-Shiang
馮瑞祥
spellingShingle Fong, Ruei-Shiang
馮瑞祥
Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case
author_sort Fong, Ruei-Shiang
title Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case
title_sort studies on predicting the outcome of professional baseball games with data mining techniques: mlb as a case
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/13879224135755534063