Arthur: If we can land a man on the moon, surely I can get good baseball data

From SABR member Robert Arthur at Baseball Prospectus on March 25, 2015:

Going back to Henry Chadwick’s invention of the box score in the 1850s, statistical summaries have been integral to telling stories about baseball. In the latter half of the 19th century, box scores were a way to explain the narrative of a game to an eager public without TV or photographs, in a time when the only access to sports was at the stadium. Now we are bombarded with a multitude of avenues through which to enjoy baseball, but the role of data is fundamentally the same.

To illustrate my point, consider the following scenario: Let’s say you come to me and ask how the game went yesterday because you didn’t get the chance to see it. I could say something like, “The Royals beat the A’s, 9-8.” It might be a factual statement, but it wouldn’t be an especially interesting one. A better way to tell the story would be to explain who played, who scored the runs, and when: the fundamental components of a box score. A still better summary of the game might highlight some of the unexpected happenings, like the way that the Royals exposed Jon Lester’s inability to stop the running game to the tune of seven steals. A still richer description might wrap the occurrences of the game up into historical narratives and longer-term trends, noting for example that despite nearly matching the single-season record for innings caught (and presumably suffering under the burden of tremendous fatigue), Sal Perez was able to knock in the walkoff single in the bottom of the 12th inning. All of these details come from data, and help to transform the rote happenings of sport into a story worth listening to.

In the present day, we are on the verge of a data deluge. With nearly every at-bat-level event going back decades recorded and preserved, the modern baseball fan is treated to a cornucopia of additional statistics at a still finer level of analysis: each individual pitch. The output of PITCHf/x has proven invaluable to writing about and (for me) enjoying baseball, not to mention my own research. Soon, we will track not only the path of the ball, but also its speed off the bat, and perhaps the motions of every player on the field (thanks to Statcast).

Read the full article here:

Originally published: March 25, 2015. Last Updated: March 25, 2015.