Mains: The basics of historical baseball data analysis

From SABR member Rob Mains at Baseball Prospectus on December 24, 2019:

We’re going to be away for the holidays for a few days here at Baseball Prospectus. Maybe you will be too. And maybe you’ll find yourself with nothing to do. Maybe you’ll want to fill your time with figuring out the top ten home run hitters for the Braves franchise. Boy, did you ever come to the right place.

In the first article in this series, I presented the basics of league-wide analysis over the broad arc of baseball history. (tl;dr: You have to adjust for differences in the number of teams and the number of games teams play.) To conclude, let’s look at doing long-term analysis of specific teams or players.
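As a rough illustration of that adjustment, here is a minimal Python sketch that turns a league-wide total into a per-team-game rate. The function name `per_team_game` and the inputs in the usage lines are assumptions made for this example, not figures or methods taken from the article.

```python
def per_team_game(league_total: float, teams: int, games_per_team: int) -> float:
    """Return a league-wide total expressed per team per game.

    Dividing by teams * games_per_team puts seasons with different
    league sizes and schedule lengths on a common scale.
    """
    if teams <= 0 or games_per_team <= 0:
        raise ValueError("teams and games_per_team must be positive")
    return league_total / (teams * games_per_team)


# Placeholder totals (not real data): the same raw count of 1,000 events
# in a 16-team, 154-game season and in a 30-team, 162-game season.
small_league_rate = per_team_game(1000, teams=16, games_per_team=154)
large_league_rate = per_team_game(1000, teams=30, games_per_team=162)
```

The same raw total works out to a much higher per-team-game rate in the smaller league, which is why raw totals from different eras can't be compared directly.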

League-wide figures are the same regardless of where you get the data. Once you get to the team and player level, though, the three main sources of historical data—Baseball Prospectus, Baseball-Reference, and FanGraphs—have their quirks.

First, they all identify teams differently. At Baseball Prospectus, we use three-letter Retrosheet abbreviations. Baseball-Reference uses its own three-letter abbreviations. FanGraphs uses team nicknames. For example, Baseball Prospectus calls the American League team in New York NYA. Baseball-Reference says it’s NYY. To FanGraphs, they’re the Yankees.
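If you're combining data from more than one of these sources, a small lookup table can reconcile the identifiers. The sketch below is hypothetical: the `TEAM_IDS` table and `translate_team` helper are names invented for illustration, and the only entry filled in is the New York example above. You would need to extend it yourself from each site's own team listings.

```python
# Hypothetical lookup keyed by the Baseball Prospectus (Retrosheet) code.
# Only the New York AL entry comes from the article; add others as needed.
TEAM_IDS = {
    "NYA": {"baseball_reference": "NYY", "fangraphs": "Yankees"},
}


def translate_team(bp_code: str, target: str) -> str:
    """Translate a Baseball Prospectus team code into another source's identifier."""
    try:
        return TEAM_IDS[bp_code][target]
    except KeyError as exc:
        # Raised when either the team code or the target source is missing.
        raise KeyError(f"no {target!r} mapping recorded for {bp_code!r}") from exc


bref_id = translate_team("NYA", "baseball_reference")
fangraphs_id = translate_team("NYA", "fangraphs")
```

Raising on an unknown code, rather than passing it through unchanged, makes a missing mapping show up immediately instead of silently dropping a team from a join.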

Read the full article here: https://www.baseballprospectus.com/news/article/56087/the-basics-of-historical-baseball-data-analysis-part-2/

Originally published: December 30, 2019. Last Updated: December 30, 2019.