Summary: | This dissertation examines the factors most important to success in the National Football League (NFL). For this paper, success is defined as winning individual games in the short term and making the playoffs over the course of a season in the long term. Data was collected for 750 regular season games over five NFL seasons and used to build models that identify the factors most significant to winning at both the short-term and long-term levels.
A point spread model was developed using ordinary least squares regression, with stepwise selection to reduce the number of variables included. Logistic regression models were also created to estimate the probability that a team will win an individual game and the probability that a team will make the playoffs at the end of the season. Discriminant analysis was performed to compare the significant variables in our models and determine which had the largest influence. We considered the relationship between offense and defense in the NFL to determine whether one area had a significant advantage over the other. We also fit a proportional odds model to the data set to distinguish blowout games from those that are close at the end.
The overwhelming presence of turnover margin, passing efficiency, first down margin, and sack yardage in all of our models is clear evidence that a handful of statistics can explain success in the NFL. Using the statistics from the games themselves, we were able to correctly identify the winner around 88% of the time. Finally, we used simulations and historical team performances to forecast future game outcomes; our models classified the actual winner with a 71% accuracy rate.
Analytics are slowly gaining momentum in football, and the advantages are clear. Quantifying success in the NFL can benefit both individual teams and the league as a whole by helping them present the best possible product to their audiences.
|