2015 SABR Analytics Conference: Research presentations
Here are the research presentation abstracts and presenter bios for the fourth annual SABR Analytics Conference, which was held March 12-14, 2015, at the Hyatt Regency Phoenix in Phoenix, Arizona.
By popular demand, we had more research presentations (13) on the schedule than ever before in 2015. The full conference schedule is available at SABR.org/analytics/schedule.
4:00-5:00 p.m., Thursday, March 12
RP1 and RP2 took place back-to-back in a single session.
RP1—Vince Gennaro, “What’s Different About Postseason Baseball?”
- Audio: Click here to listen to Vince’s presentation (MP3; 27:29)
Postseason baseball has the characteristics and traits of a “different league” when compared to the game played in the 162-game regular season schedule. Gennaro will highlight the key differences between the two, including the lower run-scoring environment and its implications, as well as the team’s culture and in-game strategies. But the focus of the analysis will be on the impact of a higher quality of pitching and what it means for hitters. How much better is the pitching? What types of hitters fare the best against top-quality pitching, and are they more likely to perform better in the playoffs? Most hitters’ performance declines in the postseason. Which hitters are more likely to take their regular season game into the playoffs? Have we created a new definition of “clutch”?
Vince Gennaro is the President of SABR, the director of Columbia University’s sports management graduate program, a consultant to MLB teams, and the host of Behind the Numbers: Baseball SABR Style on SiriusXM on Sunday nights. He is also the author of Diamond Dollars: The Economics of Winning in Baseball and a regular contributor to MLB Network. He is the architect of the Diamond Dollars Case Competition series, which brings together students and MLB team and league executives and serves as a unique learning experience, as well as a networking opportunity for aspiring sports executives. This follows a successful business career, which includes diverse roles: CEO of an early stage public company, president of a billion-dollar division of PepsiCo, and ownership of a women’s pro basketball franchise. He is on the Advisory Board of The Perfect Game Foundation, which is dedicated to helping young people build a career in sports.
RP2—Jason Wilson and Jarvis Greiner, “Pitch Quantification: Using QOP (Quality of Pitch) and GI (Greiner Index) with PITCHf/x data in a 2014 MLB Case Study”
- Audio: Click here to listen to Jason and Jarvis’s presentation (MP3; 33:47)
The standard measurement to determine the quality of a pitch in baseball has always been Miles Per Hour (MPH). However, this is only one of the many variables that determine the effectiveness of a pitch. Horizontal and vertical break, location, trajectory, and rise all contribute to the pitch’s overall value. The ‘Greiner Index’ (GI) is a score derived from these variables that can be used to eliminate subjectivity in comparing pitches beyond their MPH. The GI can then be combined with MPH to determine a ‘Quality of Pitch’ (QOP) rating. Both the GI and the QOP can be used for player evaluation, pitching development, medical monitoring, and fan enjoyment.
Quantitative pitch mapping and statistical linear regression were used to develop a model for calculating the GI on a scale of -10 to +10 for most pitches. The regression model was first developed exclusively for curveballs using NAIA pitching data (A Curveball Index: Quantification of Breaking Balls for Pitchers; CHANCE; 27:3; pp. 34-40). The current model has now been extended to include all pitches. This updated model has been implemented to compile a set of case studies regarding the quality of pitching performances for one MLB team during the 2014 season. The model quality has been validated and is ready for MLB implementation (p-value < 2.2 × 10^-16; Adj-Rsq = 0.60; maximum VIF = 7.1).
This case study will demonstrate how the Greiner Index and Quality of Pitch rating can answer the following questions:
1. Objective comparison between two MLB no-hitters thrown during the 2014 season. Which game featured the better pitching performance?
2. Analyzing one MLB pitcher’s performance during a 2014 regular season game. Were there any indications of a decline in pitch quality that could have determined alternate pitch selection or influence managerial decisions?
3. Identifying pitching patterns that may have led to an injury during a 2014 MLB game. Could the index have revealed any indication that an injury was imminent, and could it have been prevented?
4. Evaluation of one MLB pitcher’s performance comparing the regular season to the postseason. Does the index confirm a change in pitch quality between a successful regular season and a disappointing postseason?
5. Analysis of one MLB pitcher’s career consistency vs. various MLB hitters. Does the index reveal why one hitter succeeds and another fails against the same pitcher?
6. Quantifying the performances of past MLB pitchers to objectively determine their ranking. Can the index identify the best MLB pitcher of all time?
Jason Wilson, Ph.D, is an Associate Professor of Mathematics at Biola University where he enjoys teaching and statistical research. He co-authored A Curveball Index: Quantification of Breaking Balls for Pitchers (CHANCE 27:3, 2014, p. 34-40) with Jarvis Greiner. Jarvis Greiner, from Edmonton, Alberta, earned his BA at Biola University and pitched for the varsity baseball team as well as Team Alberta at the 2009 Canada Summer Games and the 2007 Canada Cup.
11:45 a.m.-12:45 p.m., Friday, March 13
RP3 and RP4 took place back-to-back in a single session.
RP3—Graham Goldbeck, “Making a Pitch for Better Command”
- Audio: Click here to listen to Graham’s presentation (MP3; 26:21)
PITCHf/x analysis has led to tremendous baseball insights over the years, but one of the least explored areas of study is pitcher command: too many assumptions surround where the pitcher is trying to throw the ball, with little supporting objective data. However, with the ability to accurately track the catcher’s glove, pitcher command and its effects on the game can be analyzed. COMMANDf/x provides this data, and will be the focus of this presentation. Explored topics will include the relevance of command relative to other pitching skills, an examination of command as pitchers age, and whether good command can predict future success.
Graham Goldbeck is the Manager of Data Analytics and Operations at Sportvision, the company behind PITCHf/x, HITf/x, COMMANDf/x, and FIELDf/x. In the past, Graham was a writer for the website Beyond the Box Score and also worked as a baseball operations intern for the Oakland Athletics and Tampa Bay Rays.
RP4—Martin Rioux, “Using PITCHf/x or Statcast Data to Benchmark Hitter Performance Through Data Envelopment Analysis”
- Audio: Click here to listen to Martin’s presentation (MP3; 30:15)
Data envelopment analysis (DEA) is a linear-programming-based technique for measuring the relative performance of organizational units where the presence of multiple inputs and outputs makes comparisons difficult. In this particular analysis, organizational units are HITTERS at bat, inputs are 3 PITCHING metrics and outputs are 2 HITTING metrics that originate from PITCHf/x (although the MLBAM Statcast system should provide more powerful metrics in the near future). Combined with statistical analysis, DEA results provide an objective way to evaluate a hitter’s relative efficiency at regularly hitting the ball “well” given the selection and the location of the pitches he has swung at. Conversely, when a hitter shows some statistically significant “inefficiency”, it is conceivable that the opposing team will try to exploit it in its in-game pitching strategy against him. DEA allows for performance benchmarking between several hitters of interest, or for performance benchmarking of an individual hitter over time (which might help to pinpoint aging, hidden injuries, or bad swing habits). A similar DEA approach could evaluate pitchers’ relative efficiency at getting hitters to hit the ball “poorly” against them, given the selection and the location of the pitches they have thrown. Following the release of MLBAM Statcast, MLB decision makers may look to extract new and useful information from that enormous database in a timely manner in order to gain a competitive advantage. In that regard, using DEA will facilitate the generation of relevant information about hitters’ (or pitchers’) performance that will help them make objective decisions about their player personnel or in-game strategy.
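The abstract does not say which DEA model or orientation was used. As a sketch only, here is the classic input-oriented, constant-returns (CCR) envelopment form, written for the 3 inputs and 2 outputs described above. The symbols are assumptions introduced for illustration: x_{ij} is input i for hitter j, y_{rj} is output r for hitter j, n is the number of hitters compared, and o indexes the hitter being evaluated.

```latex
\begin{aligned}
\theta_o^{*} \;=\; \min_{\theta,\,\lambda}\quad & \theta \\
\text{subject to}\quad
  & \sum_{j=1}^{n} \lambda_j \, x_{ij} \;\le\; \theta \, x_{io}, && i = 1, 2, 3 \\
  & \sum_{j=1}^{n} \lambda_j \, y_{rj} \;\ge\; y_{ro},          && r = 1, 2 \\
  & \lambda_j \;\ge\; 0,                                        && j = 1, \dots, n
\end{aligned}
```

Under this formulation a score of \theta_o^{*} = 1 places hitter o on the efficient frontier, while a score below 1 means some weighted combination of other hitters produced at least the same outputs from proportionally smaller inputs. One such linear program is solved per hitter.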
Martin Rioux is the founder of IDEALIS, a Canadian-based company that provides objective and impartial performance benchmarking services that appropriately integrate the set of business variables of its customers. This follows an extended business career mainly within Lean Six Sigma consulting firms, Kraft Foods, and Mondelēz International, which includes roles as statistician, Lean Six Sigma Master Black Belt, and Continuous Improvement Associate Director. He holds both an MBA specializing in Operations and Decision Aid Systems and a Bachelor’s degree in Statistics.
4:45-6:15 p.m., Friday, March 13
RP5, RP6, and RP7 took place back-to-back-to-back in a single session.
RP5—Rodney J. Paul, Jeremy Losak, and Justin Mattingly, “The Impact of Length of Game on Major League Baseball Attendance Demand”
- Audio: Click here to listen to Rodney, Jeremy, and Justin’s presentation (MP3; 25:56)
Major League Baseball has made shortening the length of games a priority heading into the 2015 season. Various rule changes, implemented for the Arizona Fall League in 2014 and being considered for Major League Baseball in 2015, are aimed at quickening the pace of the game, which should lead to shorter games overall. These rule changes consist of hitters keeping one foot in the batter’s box, pitchers throwing a pitch within twenty seconds, maximum breaks between innings of two minutes and five seconds, two-minute, thirty-second limits on pitching changes, a maximum of three mound visits per game, and automatic intentional walks.
Our study attempts to understand the rationale behind these rule changes in terms of the link between game duration and fan demand to attend games. Using three seasons of detailed game information (7,290 observations) from www.retrosheet.org (containing information on duration, attendance, etc.) and daily weather conditions gathered from www.weatherunderground.com, we test if duration has any impact on attendance and explore the implications of these results.
We test for the impact of length of game (in minutes) on attendance using linear regression (Ordinary Least Squares) in a pooled framework, correcting for heteroskedasticity and autocorrelation issues as they arise. The dependent variable is individual game attendance for each game in the Major League Baseball season. Our key variable of interest, game duration in minutes, is one of our independent variables. Based upon past research models of baseball attendance (e.g., Baade and Tiehen, 1990; Kahane and Shmanske, 1997; Butler, 2002; Schmidt and Berri, 2002; Paul, Weinbach, and Melvin, 2004; Fort and Lee, 2006), we control for factors such as team success, opponent, day of the week, month of the season, weather, etc. by including them as additional independent variables in the regression model to allow for proper specification. We also test alternative model specifications for game duration, such as runs scored per minute, home runs per minute, and individual umpire effects to ascertain if certain game attributes make game duration more or less desirable to fans.
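As a rough illustration of the core regression step, the sketch below fits ordinary least squares with a single regressor, game duration, and nothing else. It is a deliberate simplification: the study's model pools three seasons, adds many control variables, and corrects for heteroskedasticity and autocorrelation, none of which is attempted here. The function name `ols_simple` and all numbers are made up for illustration. The numbers are chosen to lie exactly on a line so the fit is easy to check, and they say nothing about the study's findings.

```python
def ols_simple(x, y):
    """Closed-form OLS for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: game duration in minutes and paid attendance.
durations = [150, 165, 180, 195, 210]
attendance = [31000, 30500, 30000, 29500, 29000]

intercept, slope = ols_simple(durations, attendance)
print(f"attendance = {intercept:.0f} + ({slope:.2f}) * minutes")
```

In a fuller specification the slope on duration would be read holding the other controls fixed, which a one-regressor fit cannot do.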
With this information, we will be able to explain the impact of duration on attendance, forecast the potential improvements in attendance if rule changes alter the duration of games, and ascertain if the current group of fans that attend games are the target of these duration-altering rule changes. If duration does not have an impact on attendance, it can be inferred that the rule changes are aimed at attracting a new audience to the game of baseball and/or making the game more favorable for television. If duration does impact attendance, however, the rule changes are an attempt to get fans who currently attend some games to turn out in greater numbers for future contests. This research will ultimately shed light on these issues and help plan future potential baseball policy changes in these areas.
Rodney J. Paul has a Ph.D in Applied Economics from Clemson University and is a Professor of Sport Management in the David B. Falk College of Sport and Human Dynamics at Syracuse University. He has published extensively in academic journals on the economics and finance of sport. His work in baseball includes studies of attendance at the major and minor league level, research on uncertainty of outcome and competitive balance in MLB, behavioral biases and market efficiency in baseball wagering markets, and the role of atmospheric conditions on pitcher performance. Jeremy Losak is from the Bronx, New York City, and is currently studying at Syracuse University as a Sport Management Major and Economics Minor. He is the Managing Editor and Lead MLB Reporter at The Sports Quotient, as well as a business data analysis intern with the Auburn Doubledays. Justin Mattingly is a sophomore newspaper and online journalism and political science major at Syracuse University. He is currently employed by the Perfect Game Collegiate Baseball League and has a strong interest in ballpark effects, as he has attended a game in every Major League stadium since 1998. He is the treasurer for the Syracuse University Baseball Statistics and Sabermetrics Club and an Assistant News Editor for The Daily Orange.
RP6—Greg Ackerman, “Why Does a Team Outperform its Run Differential?”
- Audio: Click here to listen to Greg’s presentation (MP3; 10:55)
In today’s analytical age, front offices are looking for the most efficient ways to win games. Given the significance of a win on revenue, teams need to be as informed as possible to maximize their chances to achieve success. One informative metric of interest to teams is Pythagorean wins. Pythagorean wins measures a team’s expected win percentage based on their run differential, illustrating that a team’s record is not necessarily indicative of their true ability to generate and prevent runs. At season’s end, there are often meaningful differences between actual and expected wins. In most cases, it appears that the baseball community believes these differences are a product of luck.
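For readers unfamiliar with the metric, here is a minimal sketch of the calculation. It assumes the classic Bill James form with an exponent of 2, since the abstract does not say which exponent or variant the author used. The function names and the sample team are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected win percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def wins_above_pythagorean(wins, games, runs_scored, runs_allowed):
    """Actual wins minus Pythagorean expected wins (positive = outperformed)."""
    expected_wins = games * pythagorean_win_pct(runs_scored, runs_allowed)
    return wins - expected_wins


# Hypothetical team: 90 wins in 162 games, 700 runs scored, 650 allowed.
diff = wins_above_pythagorean(90, 162, 700, 650)
print(f"{diff:+.1f} wins versus Pythagorean expectation")
```

The quantity returned by `wins_above_pythagorean` is the kind of actual-versus-expected gap that the rest of this abstract sets out to explain.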
Some writers have mentioned possible reasons for the difference between actual and Pythagorean win percentage. Jared Diamond cites bullpen performance as a reason the 2014 Mets underachieved their Pythagorean win percentage. Brad Vietrogoski introduced the possible impact of managers on win percentage. However, both struggle to quantify the impact of these variables. We believe a multivariate linear regression model may help to explain the differences between these two variables. If successful, managers and front office personnel may be able to utilize this information to achieve greater success for their organization.
There are subtle factors that win or lose games, and by measuring certain parts of a team, we can determine how teams outperform their Pythagorean wins. We expect performances of the bullpen, bench, and managers to be three crucial factors in explaining close wins and losses. These parts of a team are often relied upon in tight games, which tend to skew win percentage because even though they count as wins, they add little to a team’s run differential. In a multivariate regression model, using the difference between actual and Pythagorean wins as the dependent variable, we jointly test a series of independent variables to explain why teams over- or under-perform their expected win total.
Late-inning close games often come down to bullpen performance. The independent variables of interest in our model for the bullpen are ERA, FIP, and K%-BB%, three vital factors to the outcome of most games. A breadth of quality options off the bench increases chances for team success. Skills valuable in late-game situations can be better addressed with a solid bench. We include home runs, stolen bases, and OPS+ of bench players as independent variables in our model. Managerial decisions are often magnified in close games as their choices often result in team victory or defeat. By testing managerial tendencies through pinch hitters and runners used, defensive substitutions, and other measures, we believe that we can quantify how a manager adds to his team’s overall performance and help explain differences between actual and expected wins.
Through our research, we hope to illustrate what factors significantly impact over- or under-performing Pythagorean wins, illustrate the magnitude of each variable, and offer guidance on how these factors may help in roster construction and use.
Greg Ackerman is a senior at Syracuse University, majoring in Sport Management with an Area of Specialization in Sports Analytics. He is the Vice President of the Sabermetrics Club at SU, and his previous research was featured at the 2014 MIT Sloan Sports Analytics Conference.
RP7—Dan Meyer, “Geographic Bias and the Amateur Draft”
- Audio: Click here to listen to Dan’s presentation (MP3; 19:55)
In an age of unprecedented parity in Major League Baseball, any edge a team can get over its competitors is magnified. Past research has shown that the single most important thing a team can do to have sustained success is to draft effectively, but are teams doing this?
A preliminary study conducted by Alex Smith and me presented evidence of persistent geographic biases in the draft. Specifically, we found that players from known baseball powerhouse states continue to perform better than their counterparts from elsewhere in the country. Our initial method looked at the share of players drafted from each state and compared it to the share of players in the Majors from each state. This simple method was a good starting point, but failed to explicitly control for other factors such as draft position.
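The share comparison described above can be sketched in a few lines. This is an illustrative reading of that initial method, not the authors' code: the function names are hypothetical and the state lists are invented toy data, not draft records.

```python
from collections import Counter


def state_share(states):
    """Fraction of players coming from each state."""
    counts = Counter(states)
    total = sum(counts.values())
    return {state: count / total for state, count in counts.items()}


def representation_ratio(drafted_states, mlb_states):
    """Share of major leaguers divided by share of draftees, per state."""
    drafted = state_share(drafted_states)
    majors = state_share(mlb_states)
    return {s: majors.get(s, 0.0) / share for s, share in drafted.items()}


# Toy data: home state of each drafted player and of each major leaguer.
drafted = ["CA"] * 4 + ["TX"] * 3 + ["ME"] * 3
majors = ["CA"] * 3 + ["TX"] * 1 + ["ME"] * 1

ratios = representation_ratio(drafted, majors)
for state, ratio in sorted(ratios.items()):
    print(state, round(ratio, 2))
```

A ratio above 1 means a state supplies a larger share of major leaguers than of draftees. As the paragraph above notes, a raw ratio like this does not control for draft position.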
In this presentation, I intend to address these concerns with new analysis. I have created a new model based on Sky Andrecheck’s now-famous expected WAR model. The new model includes various indicators for different states to examine this potential inefficiency in a new way. Initial runs through a nonlinear least squares prediction model have shown that this bias still exists when controlling for these factors.
In addition to using a sounder research method, I intend to delve deeper into possible explanations and causes for this bias. In our initial research, we ran visual checks to examine whether or not a lack of scouts in the region was the driving force for the bias, but that did not seem to be the case. Possible reasons that we have considered are that year-round playing and consistent playing time against higher quality competition are undervalued. In this presentation, I intend to further explore these issues by incorporating spatial analysis and talking directly with area scouts.
The potential contribution to this field is limitless. Applications of some other current research can at best earn a team a couple of extra wins a season. I believe that if a team executed an improved draft strategy in accordance with this research it could potentially be franchise altering. This research calls for a shift in strategy to take place that will remove some of the luck from the process that is the lifeblood of every major league franchise, the Rule IV Draft.
Dan Meyer is a junior economics and mathematical sciences major at Colby College. He contributes to Beyond the Box Score and The Hardball Times. He is the founder of Colby Baseball Analytics and is a member of the college’s varsity crew team.
9:45-10:45 a.m., Saturday, March 14
RP8 and RP9 took place back-to-back in a single session.
RP8—Ben Jedlovec, “Trajectory-Based Hitting and Pitching Statistics”
- Audio: Click here to listen to Ben’s presentation (MP3; 34:22)
Utilizing batted ball timer and location data similarly to Baseball Info Solutions’ defensive metrics, the company has created models which open the door to true defense-independent hitting and pitching statistics. By removing the defense, ballpark, and other “luck” components, we have created more accurate hitting and pitching evaluations that are much less prone to the small sample variance of results-based hitting and pitching statistics. As a result, this approach re-opens the door to quantifying more subtle changes in performance, including hot and cold streaks, which results-based statistics are not sensitive enough to detect. The results have shown that trajectory-based analysis of batted balls makes significant improvements on results-based pre-season projections, which can carry over to in-season and even daily projections.
Ben Jedlovec is the President of Baseball Info Solutions, the leading baseball data and analytics company. With BIS President & Owner John Dewan, he co-authored The Fielding Bible—Volume III in Spring 2012 and Volume IV in March 2015.
RP9—Allison R. Levin, “Sabermetrics in Practice: Examining Fan Voting for MLB All-Stars over Three Eras”
- Audio: Click here to listen to Allison’s presentation (MP3; 31:27)
John S. Bowman and Joel Zoss observed in “The Pictorial History of Baseball” that baseball is part of the fabric of American culture, representing a world of possibilities and chance. Over the long history of baseball every franchise has had its superstar players, but there is only one game where a fan can see all the superstars at once and, since 1970, select the starting lineup: the All-Star Game. As American culture has changed in the last three decades, one wonders whether fan voting for All-Star players has changed accordingly.
This paper examines voting during: (1) the early 1990s when the majority of fans only had access to a few games per week on television; (2) the early 2000s when most fans had access to at least one game per day and the Web was available to acquire additional information; and (3) the early 2010s when fans have unlimited access to baseball through television, the Web, and social media.
This paper seeks to understand what criteria the fans valued most when selecting All-Stars and how it has changed over time. The author collected partial-year statistics for the All-Star year as well as full year statistics for the previous two years encompassing the top three vote getters at each position for the 1994, 2004, and 2014 games. Three hypotheses are posed: (1) when information about players was limited fans tended to vote on the visibility and popularity of the players; (2) when fans had access to multiple games and nearly unlimited information about players on the Web, fans tended to vote by comparing players; and (3) when Twitter is included fans still lean towards comparing players, but are influenced by player and team tweets.
To test these hypotheses three types of statistics are examined: (1) traditional statistics (i.e., R, H, RBI, and BA), (2) sabermetric statistics (i.e., Rfield, Rpos, RAA, and WAR), and (3) visibility and popularity markers (i.e., team market share, playoff appearances and major awards). For 2014 the number of tweets sent by a player and his team are also collected.
Multiple regression analysis is used to explain the percentage difference in the votes cast for the top two vote getters and the second and third place vote getters. The explanatory variables are similarly transformed. With only twenty-four observations per year, stepwise regression is used to limit the number of explanatory variables. Models that pool the observations for all three time periods, including time-period categorical variables, are estimated to increase the statistical strength of the analysis.
The results will show what category of statistics (traditional, sabermetric, or visibility) influenced the fan vote during each era. It is anticipated that the results will demonstrate that over the last two decades fans inherently use sabermetrics when evaluating All-Stars. Although the average fan may not realize it, sabermetric statistics best represent how fans evaluate baseball players. Finally, the analyses will shed light on how social media such as Twitter may impact baseball and baseball analytics in the future.
Allison R. Levin (MA, JD) is an independent scholar and president of Social Network Advisors for Professional Sports. She examines the intersection of economics, technology and sport. In particular, her research focuses on the various ways the increasing use of web technology impacts professional sports.
12:00-1:00 p.m., Saturday, March 14
RP10 and RP11 took place back-to-back in a single session.
RP10—Stephen Loftus, “BXwOBA: A Bayesian Approach to Expected wOBA”
- Audio: Click here to listen to Stephen’s presentation (MP3; 19:37)
In sabermetrics, it’s quite common to try to compute a player’s expected performance in a statistic based on their peripherals. The first examples that come to mind are xFIP (Studeman 2005), which adjusts Tom Tango’s Fielding Independent Pitching by normalizing home run rate, and xBABIP (Dutton 2008, among others). In 2014, Sam Young began doing research on xwOBA, or expected weighted on-base average, adjusting a highly useful statistic (wOBA, or weighted on-base average) essential to calculating Wins Above Replacement (WAR). In his The Hardball Times Annual 2015 piece, Chris St. John calculated expected run values for hits by binning observations based on hit location and hit velocity.
With these prior examples in mind I propose a combination Bayesian and data mining alternative to calculating an expected wOBA. Initially, suitable hit location data is determined by using a bivariate extension of the Kolmogorov-Smirnov criterion. This subsampling of the data is justifiable because players can be viewed as independent and hit locations are conditionally independent given player. The effect of choosing this data will be integrated out in the Markov Chain Monte Carlo sampler. Then, a fully Bayesian hierarchical model will be defined for an ordinal multinomial model using a multivariate adaptive regression spline basis. This model then puts out probabilities for the results of a ball in play—out, single, double, etc.—which can be used to find an expected wOBA for a hitter.
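The last step, turning outcome probabilities into an expected wOBA, amounts to a probability-weighted sum. The sketch below shows only that step and none of the Bayesian modeling. The function name, the outcome probabilities, and the weights are all hypothetical placeholders, and the weights are not any season's official wOBA values.

```python
def expected_woba(outcome_probs, weights):
    """Probability-weighted average of per-outcome run weights."""
    total = sum(outcome_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * weights[outcome] for outcome, p in outcome_probs.items())


# Placeholder linear weights per outcome (illustrative only).
WEIGHTS = {"out": 0.0, "single": 0.9, "double": 1.25,
           "triple": 1.6, "home_run": 2.0}

# Hypothetical model output for one batted ball.
probs = {"out": 0.70, "single": 0.20, "double": 0.06,
         "triple": 0.01, "home_run": 0.03}

x = expected_woba(probs, WEIGHTS)
print(round(x, 3))
```

Averaging this value over all of a hitter's batted balls gives an expected wOBA on contact. The schedule-biased and schedule-neutral versions described next would differ only in which probabilities are fed in.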
The results of this model will include two BXwOBA statistics. The first will be a schedule-biased BXwOBA, which will be adjusted for batter opponent defense, but not account for the league-wide baseline. The second BXwOBA will be schedule-neutral, as it adjusts for the league probabilities as a whole. Ideally, this second statistic should be more indicative of true player talent level, and also should be more stable from year to year as it takes opposing defense and schedule changes out of the equation.
Beyond the introduction of the BXwOBA statistic, this presentation should be able to introduce to the sabermetric community more fully the versatility and usefulness of several standard Bayesian techniques. The ordinal multinomial model has been used on very few occasions, and the binomial case of this model is highly useful for dichotomous outcomes, which abound in baseball. The use of spline bases is equally useful, especially in clearly non-linear data, but is also sparingly used. Finally, the Kolmogorov-Smirnov criterion, influenced by my work on Pitcher Similarity Scores, has uses as a screening mechanism or identifier of similarity, and is currently being used as such in biological applications.
Stephen Loftus is a Ph.D student in Statistics at Virginia Tech working with Dr. Leanna House on data mining and Bayesian hierarchical modeling for systems biology and ecological data. His dissertation, On the Use of Grouped Covariate Regression in Oversaturated Models, is expected to be completed later in 2015. In his spare time, he is a writer and editor at Beyond the Box Score, with work focusing mostly on opponent-adjusted statistics.
RP11—Anne C. Marx Scheuerell, Brad Smith, and David B. Marx, “Evaluating Offense Productivity in College Baseball”
- Audio: Click here to listen to Anne, Brad, and David’s presentation (MP3; 25:34)
Baseball coaches, players, and fans study statistics in search of ways to better understand the game. One area that has been examined is offensive productivity. “Runs Created” and “Win Shares” are two of the most widely used metrics to evaluate offensive productivity. These metrics are easy to calculate, yet fail to capture information about specific plays. The purpose of this paper is to introduce a new method, the Smith Scoring Index (SSI), which provides the ability to quantify each play as well as evaluate the offensive productivity of players. Comparisons of prevalent metrics were included and alternative uses for this method discussed. Results of the analyses indicated a high correlation between the SSI and the dominant metrics; however, differences existed due to the quantification of each play. The SSI has been used successfully at the college level in identifying good offensive baseball players, determining the order in a lineup of players, and determining when to steal a base.
Anne C. Marx Scheuerell is an Assistant Professor of Sport Management at Loras College in Dubuque, Iowa. She earned her Master’s Degree from Arizona State University and her doctorate from the University of Arkansas. David B. Marx is a Professor of Statistics at the University of Nebraska in Lincoln, Nebraska. He is a member of the American Statistical Association’s Section on Statistics in Sports. Brad Smith (co-author who is unable to attend) is a Ph.D candidate in the department of Educational Psychology at the University of Nebraska specializing in the Quantitative, Qualitative and Psychometric Methods Program. He holds a Bachelor’s and Master’s Degree in Mathematics from Pittsburg State University, and a Master’s Degree in Statistics from the University of Nebraska. He has been a graduate assistant coach for the Nebraska baseball team for the past four years.
3:30-4:30 p.m., Saturday, March 14
RP12 and RP13 took place back-to-back in a single session.
RP12—Scott A. Van Lenten and Frank J. Infurna, “Performance Enhancing Dads? Paternity and Bereavement Leave in Major League Baseball”
- Audio: Click here to listen to Scott and Frank’s presentation (MP3; 24:14)
In 2011, Major League Baseball (MLB) began formally recognizing paternity leave as an excusable, non-injury-related absence from the field. This was in addition to bereavement, which had already been recognized in 2003 as a valid, non-injury-related reason for missing playing time. Although these leaves are formally observed by MLB, commentators and former players have questioned players’ decisions to miss games, citing a potential lack of dedication to winning and their team. During the 2014 season, paternity leave was in the spotlight again when Daniel Murphy received a tremendous amount of negative attention for missing playing time after the birth of his child. Given the recent scrutiny, we sought to examine whether taking time off for the birth of a child and bereavement influences performance for position players, quantified as the likelihood of hitting a home run and multiple hit games, and win probability added (WPA) during the course of the season. These questions were based on previous psychological literature demonstrating that parenthood results in increases in life satisfaction and bereavement results in declines in life satisfaction (Dyrdal & Lucas, 2012; Lucas, Clark, Georgellis, & Diener, 2003). Therefore, we hypothesized that parenthood may also increase athletic performance, whereas bereavement would result in declines in performance.
To examine our research questions, we used data from 74 position players (5,628 games) who officially took paternity or bereavement leave during the 2011-2014 MLB seasons as reported on the MLB.com transaction wire (paternity leave: 48; bereavement leave: 26). Using time-series analysis, we examined changes in the likelihood of hitting a home run, having a multiple hit game, and WPA in the month(s) leading up to, days during, and month(s) following paternity or bereavement leave. We found that individuals who took paternity leave had an increased likelihood of hitting a home run during the 10 days surrounding the event. Players, on average, had a 50% increased likelihood of hitting a home run during the 10 days surrounding their paternity leave (p = .02). We did not find that paternity or bereavement leave was associated with an increased likelihood of having a multiple hit game. For WPA, we found that during the 10 days surrounding paternity leave, individuals showed an increase in their cumulative WPA from all other games, which suggests they were more valuable to their teams’ ability to win during this time. The average WPA during the days surrounding paternity leave was .03 (p = .01), as compared to days outside this range (-.01). Our findings demonstrate that significant life events, specifically the birth of a child, have an influence on performance during the course of the season. This is in contradiction to the commonly held assumption that paternity leave is detrimental to team success. Based on our findings, the public should not denounce players, or athletes in general, for taking time off due to significant life events.
Scott A. Van Lenten is a doctoral candidate in developmental psychology at Arizona State University. Broadly, his research focuses on the mechanisms affecting healthy development during adolescence and early adulthood. Scott is also interested in big data analysis and longitudinal methods for the analysis of change. Scott recently became a first-time father to his son, Foster. This will be his first presentation at a SABR conference. Frank J. Infurna is an Assistant Professor of Psychology at Arizona State University. His research broadly focuses on examining resilience in adulthood and old age, factors that promote healthy aging, and longitudinal research methodology. He played baseball in college, is an avid sports fan, and enjoys applying longitudinal statistical models of change to baseball data by examining the effects of major life stressors (i.e., parenthood and bereavement) on baseball performance.
RP13—Howard M. Wasserman, “An Empirical Analysis of the Infield Fly Rule”
- Audio: Click here to listen to Howard’s presentation (MP3; 27:30)
This paper discusses the results of a five-season (2010-2014) study of the Infield Fly Rule in Major League Baseball, identifying the number and location of every play on which the Rule was invoked. From this, the paper explores three questions: 1) how frequently the Rule is invoked; 2) the likelihood of the feared “unfair” double play if the Rule were repealed; and 3) the effect that repeal might have had on MLB games in this study, considering the number of runs that might be lost and changes in Run Expectancy and Win Expectancy.
Howard M. Wasserman is Professor of Law at Florida International University College of Law. He has published several articles on the infield fly rule, including “The Economics of the Infield Fly Rule,” “Football and the Infield Fly Rule,” and “An Empirical Analysis of the Infield Fly Rule,” which presented the preliminary results of this study. He writes about sports rules at Sports Law Blog and PrawfsBlawg.
For more information on the 2015 SABR Analytics Conference, visit SABR.org/analytics.
Originally published: January 30, 2015. Last Updated: October 11, 2020.