From SABR member Tom Ruane at Retrosheet.org on May 15, 2019:
Earlier this year, sports-reference.com published a blog post by Alex Bonilla about the biggest comeback wins in baseball history, primarily to introduce baseball-reference’s new page that tracks these kinds of things. They used win expectancy and a ton of play-by-play data to populate their list.
For those not familiar with win expectancy, the method calculates the likelihood of a team winning from nearly every game situation. By game situation, I mean it factors in where we are in the game (the top or bottom of each inning), the current situation on the field (all twenty-four possible combinations of runners on base and outs), and how far the team is ahead or behind. It looks like they collapse some of the data (the difference in the current score is capped at eleven runs), and (I’m assuming) all extra innings are treated the same.
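To make the idea of a "game situation" concrete, here is a minimal Python sketch of how such a state could be represented as a lookup key. It is an illustration only, not the code behind either site: the names GameState, normalize, and BASE_OUT_STATES are hypothetical, and treating every extra inning as inning 10 follows the assumption stated above rather than any documented rule.

```python
from dataclasses import dataclass
from itertools import product

MAX_RUN_DIFF = 11        # score difference is capped at eleven runs
REGULATION_INNINGS = 9   # assumption: all extra innings share one bucket


@dataclass(frozen=True)
class GameState:
    inning: int      # 1..10, where 10 stands for any extra inning
    bottom: bool     # False = top of the inning, True = bottom
    outs: int        # 0, 1, or 2
    bases: tuple     # (first, second, third) occupied flags
    run_diff: int    # batting team's runs minus fielding team's runs


def normalize(inning, bottom, outs, bases, run_diff):
    """Collapse a raw situation into the capped form used as a table key."""
    inning = min(inning, REGULATION_INNINGS + 1)
    run_diff = max(-MAX_RUN_DIFF, min(MAX_RUN_DIFF, run_diff))
    return GameState(inning, bottom, outs, tuple(bases), run_diff)


# 3 out counts x 8 base configurations = the 24 base-out combinations.
BASE_OUT_STATES = [
    (outs, bases)
    for outs in range(3)
    for bases in product((False, True), repeat=3)
]
```

Under this sketch, a 14th-inning situation with the batting team down fifteen runs lands on the same key as a 10th-inning situation with the team down eleven, which is the kind of collapsing described above.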
As the post points out, the need for play-by-play data limits the scope of the article to all the games since 1974 and most of those before that back to 1925. This got me wondering if I could take the basic idea (looking for the most unlikely wins based upon win expectancy) but only consider the odds of winning at the start of each half-inning. Such an approach would allow us to calculate a similar list, but complete back to 1901.
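The half-inning approach needs only line scores (runs per inning), not play-by-play. As a rough sketch of the idea, and not the method actually used in the article, one could tally how often the home team went on to win from each (inning, half, lead) situation. The function names, the input format (two lists of runs by inning per game), the eleven-run cap, and the single extra-inning bucket are all assumptions made for illustration.

```python
from collections import defaultdict


def clamp(value, cap):
    """Limit a score difference to the range [-cap, cap]."""
    return max(-cap, min(cap, value))


def half_inning_states(visitor, home, cap=11, max_inning=10):
    """Yield (inning, is_bottom, home_lead) at the start of each half-inning.

    visitor and home are lists of runs scored per inning; home may be one
    entry shorter when the home team did not bat in the final inning.
    """
    v_total = h_total = 0
    for i, v_runs in enumerate(visitor):
        inning = min(i + 1, max_inning)
        yield (inning, False, clamp(h_total - v_total, cap))
        v_total += v_runs
        if i < len(home):
            yield (inning, True, clamp(h_total - v_total, cap))
            h_total += home[i]


def build_table(games):
    """Map each half-inning state to the observed home-team win rate."""
    counts = defaultdict(lambda: [0, 0])  # state -> [home wins, games seen]
    for visitor, home in games:
        if sum(visitor) == sum(home):
            continue  # skip tie games in this sketch
        home_won = sum(home) > sum(visitor)
        for state in half_inning_states(visitor, home):
            counts[state][0] += int(home_won)
            counts[state][1] += 1
    return {state: wins / seen for state, (wins, seen) in counts.items()}
```

With a table like this, the most unlikely wins would be the games whose eventual winner had the lowest win rate at some half-inning start (using one minus the home rate when the visitors won).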
Read the full article here: https://www.retrosheet.org/Research/RuaneT/retro_fun5.htm#A190513
Originally published: May 16, 2019. Last Updated: May 16, 2019.