Calzada: DeepBall: Modeling expectation and uncertainty in baseball with recurrent neural networks

From SABR member Daniel Calzada at Retrosheet.org on November 16, 2018:

Making reliable preseason batter projections for baseball players is an issue of utmost importance to both teams and fans who seek to infer a player’s underlying talent or predict future performance. However, this has proven to be a difficult task due to the high-variance nature of baseball and the lack of abundant, clean data. For this reason, current leading models rely mostly upon expert knowledge. We propose DeepBall, which combines a recurrent neural network with novel regularization and ensemble aggregation. We compare this to Marcel, the industry-standard open-source baseline, and other traditional machine learning techniques, and DeepBall outperforms all. DeepBall is also easily extended to predict multiple years in the future. In addition to predicting expected performances, we apply standard machine learning techniques to extend DeepBall to model uncertainty in these predictions by estimating the maximum-likelihood distribution over potential outcomes for each player. Due to the positive results, we believe that in the future, DeepBall can be beneficial to both teams and fans in modeling expectation and uncertainty. Finally, we discuss potential extensions to the model and directions of future research.

Read the full article here: https://www.retrosheet.org/Research/Calzada/CALZADA-THESIS-2018.pdf



Originally published: November 19, 2018. Last Updated: November 19, 2018.