2023 SABR Analytics: Watch highlights from Projection Systems Panel
At the SABR Analytics Conference on Friday, March 10, 2023, in Phoenix, Arizona, a panel discussion was held on Baseball Projection Systems.
Panelists included Nicholas Kapur of Zelus Analytics; Jake Schuster of Gemini Sports Analytics; Meg Rowley of FanGraphs; and moderator Ari Kaplan, MLB front office executive.
Here are some highlights from the panel:
On the decisions made in building a projection system
- Kapur: “One of the biggest things to note about projection systems in general is to think about things along the descriptive-predictive scale. There are lots of metrics that are descriptive, including WAR (Wins Above Replacement), that are publicly cited, that are used a lot, right? And what a projection system is doing … they’re trying to strip out noise in such a way to make the metric more predictive. And so that’s the general framework. If there are a couple of tenets to ‘what is a projection system,’ attempting to be as predictive as possible is probably the core tenet.”
- Rowley: “I think that you also have to consider things not only related to the player’s performance, but the player as a physical athlete. You’re trying to think about things like injury, you want to think about how a player is going to age over time, you’re using huge data sets of other comparable players to try and get an understanding of that. It’s going to be imperfect. I think about the biomechanical data that teams have access to and how useful that would be on the public side. Obviously [that’s] not something that we’re gonna be able to incorporate into public projections, but we are trying to do our version of that to understand what is the difference between a guy who’s 21 versus 31 versus 41? We’ve had a lot of baseball players by now, so we can have some kind of census done on that stuff.”
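Kapur's point about stripping out noise to make a metric more predictive can be illustrated with regression toward the mean, one common technique for this. The sketch below is not any panelist's actual system: the function name, the `stabilization_size` parameter, and all of the numbers are assumptions chosen for illustration.

```python
def regress_to_mean(observed_rate, sample_size, league_rate, stabilization_size):
    """Shrink an observed rate toward the league rate.

    Equivalent to blending the player's record with `stabilization_size`
    trials of league-average performance (a hypothetical tuning value).
    """
    weight = sample_size / (sample_size + stabilization_size)
    return weight * observed_rate + (1 - weight) * league_rate


# Made-up example: the same .400 observed rate over 50 and over 600 trials,
# with an assumed league rate of .320 and 300 stabilizing trials.
small_sample = regress_to_mean(0.400, 50, 0.320, 300)
large_sample = regress_to_mean(0.400, 600, 0.320, 300)

print(round(small_sample, 3), round(large_sample, 3))
```

The smaller sample is pulled much closer to the league rate, which is the sense in which a descriptive number becomes a more cautious, predictive one.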
On modern data stacking
- Schuster: “There’s a lot to do there. … I think recently, there’s been so many advances, and bigger industries are taking advantage a little bit faster. For sports team owners, hiring a lot of people that know this domain has been enough for them, and now they realize that they have to modernize their systems and processes, as well as the technology. So having a cloud and having the right kind of data engineering process and having the right kind of model architecture is going to be very important to get answers fast enough and have people understand it.”
On the challenges of being a public-facing publication with a projection system
- Rowley: “The first assumption that readers tend to make — and sometimes players, for that matter — is that we hate your team. And we don’t! The big challenge that public-facing publications that have a projection system have is the degree of statistical literacy in your readership is going to be really variable. I think that there is a default assumption when a team dramatically outperforms, say, its playoff odds from earlier in the season that the model got something wrong, we made a mistake. And that’s possible. … But often, what will happen when a team goes from having pretty low playoff odds to being in the postseason is that something remarkable happened, or something bad happened to another team. Think about last year’s Guardians. The Guardians’ social media account really enjoyed saying how far they had come from a playoff odds perspective. Players and teams, if they want to make bulletin board material out of public-facing projections, that’s fine. … Maybe something’s wrong with the system, but maybe you just watched something really cool happen. And maybe that’s the way we can think about those teams that go on incredible runs. … I think that our challenge is helping people to understand that, while still having humility around the fact that we should still evaluate our systems, we should make sure that we’re able to articulate what those decisions are so that people can understand them. There’s a lot of value in that, and [there] will be a lot of value in people being able to think more probabilistically not just in baseball, but society-wide. So I think that’s part of our project.”
On the different types of metrics used to evaluate a projection system
- Kapur: “I think that it’s hard to go with any individual metric for evaluating models, and to say that this is a perfect summary of how to evaluate whether my projection system is good or not. The way that I tend to think about it and want to think about it is: What is useful for a client to be able to take and make moves off of? To be able to make transactions off of? For me, thinking about model evaluation from a perspective of ‘what is actually tangibly useful for a decision maker’ is something that lets me think about things within the evaluation sphere differently. For example, you might not want to measure your average error on all players … but you might want to focus a lot of time on players projected in the slightly-above-average to very-above-average (range), and to suss out how to rank those players and what the spread is between them. … In baseball, one of the beautiful things is that we have the ability to think about these problems with real tangible examples that we can recognize right away and identify when things look wrong or look correct. Taking advantage of that from a statistical lens is also super interesting to me.”
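One way to read Kapur's example is to score a projection system only on the band of players a decision maker cares about, looking at both the size of the errors and how well the players are ranked. The sketch below is a minimal illustration under that assumption. The helper names, the WAR-like values, and the 2.0 to 5.0 band are all hypothetical, and the rank correlation assumes no tied values.

```python
from statistics import mean


def mean_abs_error(projected, actual):
    """Average absolute gap between projected and actual values."""
    return mean(abs(p - a) for p, a in zip(projected, actual))


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def evaluate_band(projected, actual, low, high):
    """Score only players whose projection falls within [low, high]."""
    pairs = [(p, a) for p, a in zip(projected, actual) if low <= p <= high]
    if len(pairs) < 2:
        raise ValueError("need at least two players in the band")
    proj, act = zip(*pairs)
    return mean_abs_error(proj, act), spearman(proj, act)


# Made-up projected and actual values on a WAR-like scale.
projected = [0.5, 1.0, 2.2, 2.8, 3.5, 4.1, 5.0, 0.2]
actual = [0.3, 1.4, 3.0, 2.1, 3.9, 3.6, 5.5, -0.1]

overall_mae = mean_abs_error(projected, actual)
band_mae, band_rho = evaluate_band(projected, actual, low=2.0, high=5.0)
print(overall_mae, band_mae, band_rho)
```

In this toy data the overall error looks smaller than the error inside the band, which is the kind of gap a single all-player average would hide.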
On the importance of writing skill in self-branding and job searching
- Schuster: “(Writing) is so important. … I think we haven’t been taught enough about writing as a generation in school. I get about five LinkedIn messages or emails a day asking for a job, and if I can’t figure out in the first three seconds who you are, how you add value, and what you’re asking me for, I won’t read it.”
On the imminent data drift with new rule changes
- Kaplan: “The big changes in rules this year kind of tie into another projection system challenge which is called ‘data drift’ in the industry. The data is changing over time, like maybe the sticky stuff, the ball might be changing, or steroids. Or pitchers, due to the biomechanics revolution, are throwing faster with more spin or more command. Whatever it is, the data or the people or the way the game speed is played, it’s different. … There are so many rules that are changing all at once, and that’s one of the challenges with projections. … When you change four, five things in a row, [the effects are] pretty hard to calculate.”
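The "data drift" Kaplan describes can be monitored in many ways. A very simple check, sketched below, compares a recent window of a league-wide statistic against a reference window. The function names, the threshold of 2.0, and the numbers (imagined per-game rates before and after a rule change) are assumptions for illustration, not anything described by the panel.

```python
from statistics import mean, stdev


def standardized_shift(reference, recent):
    """Difference in means, in units of the reference standard deviation."""
    return (mean(recent) - mean(reference)) / stdev(reference)


def has_drifted(reference, recent, threshold=2.0):
    """Flag drift when the recent mean moves more than `threshold`
    reference standard deviations (threshold is a hypothetical choice)."""
    return abs(standardized_shift(reference, recent)) > threshold


# Made-up per-game rates for some league-wide statistic.
reference = [0.50, 0.52, 0.48, 0.51, 0.49]   # seasons before a rule change
recent_stable = [0.50, 0.51, 0.49]           # no real change
recent_after = [0.70, 0.72, 0.68]            # after a rule change

print(has_drifted(reference, recent_stable))
print(has_drifted(reference, recent_after))
```

A check like this only says that one input has moved. As the quote notes, when several rules change at once, separating their individual effects is a much harder problem.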
Transcription assistance from Grace Del Pizzo.
For more coverage of the 2023 SABR Analytics Conference, visit SABR.org/analytics.
Originally published: May 1, 2023. Last Updated: March 27, 2023.