Meyer: Predicting game results using run differentials

From Daniel Meyer at Beyond the Box Score on January 7, 2015:

It is never a good idea to bet on baseball, but I would like to propose a rule of thumb for predicting the outcome of a game. I aim to find a simple formula to calculate a win probability given the runs scored and runs allowed values for each team. The motivation for this came when I built my simulation to examine how long a season must be for the best team to finish with the best record. My simulation did not include a mechanism for teams to play against each other because I didn’t have a good way to have two teams interact given their respective runs scored and runs allowed talent levels. With this formula I hope to provide a means of running a quick and dirty calculation to arrive at a win probability to be used in simple simulations or for rough back-of-the-envelope calculations.
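
This excerpt does not state Meyer's formula, so the following is not his method. As a point of comparison only, here is a minimal sketch of one familiar way to turn runs scored and runs allowed into a head-to-head win probability: the Pythagorean expectation with an exponent of 2, combined with the log5 matchup formula. The function names and the choice of exponent are assumptions made for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate a team's win fraction from its run totals.

    Assumption: exponent 2 is the classic Pythagorean value; the
    source article may use something different.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def log5(p_a, p_b):
    """Probability that team A beats team B, given each team's win fraction."""
    a_wins = p_a * (1 - p_b)
    b_wins = p_b * (1 - p_a)
    return a_wins / (a_wins + b_wins)


def matchup_win_probability(rs_a, ra_a, rs_b, ra_b, exponent=2.0):
    """Hypothetical helper: chain the two estimates for a single game."""
    p_a = pythagorean_win_pct(rs_a, ra_a, exponent)
    p_b = pythagorean_win_pct(rs_b, ra_b, exponent)
    return log5(p_a, p_b)
```

Two teams with identical run totals come out at an even 0.5, and swapping the two teams gives the complementary probability, which is the minimum one would expect from any such rule of thumb.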


For each season from 2004 to 2013 I calculated each team’s runs scored and runs allowed in the first half of the season to use as predictors for the outcomes of games in the second half of the season. Choosing to use the first half of the season to predict the second half is a tradeoff I had to make. On the one hand, full-year runs scored and runs allowed are going to be more stable (I presume, though this could be a future study), but using run values from a portion of games to predict the outcome of those same games gives rise to bigger problems.
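
The split described above can be sketched roughly as follows. The data layout (a chronologically ordered list of `(home, away, home_runs, away_runs)` tuples) and the function name are hypothetical, and the excerpt does not say where the halfway point is drawn, so cutting the league-wide game list at its midpoint is an assumption.

```python
from collections import defaultdict


def first_half_run_totals(games):
    """Split one season's games into halves and total first-half runs.

    games: chronologically ordered list of
           (home, away, home_runs, away_runs) tuples (assumed layout).
    Returns (totals, second_half), where totals maps each team to its
    first-half runs scored ("rs") and runs allowed ("ra").
    """
    midpoint = len(games) // 2  # assumption: split at the league-wide midpoint
    first_half, second_half = games[:midpoint], games[midpoint:]

    totals = defaultdict(lambda: {"rs": 0, "ra": 0})
    for home, away, home_runs, away_runs in first_half:
        totals[home]["rs"] += home_runs
        totals[home]["ra"] += away_runs
        totals[away]["rs"] += away_runs
        totals[away]["ra"] += home_runs

    # The second-half games are held out, so the predictors never
    # include runs from the games being predicted.
    return dict(totals), second_half
```

Keeping the second-half games separate from the totals is the point of the design: it avoids the problem of predicting games with run values drawn from those same games.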

Read the full article here:

Originally published: January 8, 2015. Last Updated: January 8, 2015.