Do Fans Prefer Homegrown Players? An Analysis of MLB Attendance, 1976–2012

April 2, 2014

This article was written by Russell Ormiston

This article was published in Spring 2014 Baseball Research Journal

Since the dawn of free agency, there has been increasing affection paid to players who spend their entire career with the same team. From the ballpark statues of Cal Ripken and Tony Gwynn to the retired numbers of Robin Yount and George Brett, baseball fans in recent years have celebrated star players who rose through the ranks of the home team’s farm system and remained true to that team, city, and fan base over the course of their careers. While these outward expressions of appreciation reflect a narrative suggesting that fans prefer “homegrown” players, the evidence supporting such a hypothesis is strictly anecdotal. After all, can it be said that St. Louis Cardinals fans love Ozzie Smith any less because he spent his first four seasons in San Diego?

The question of whether fans prefer homegrown players can be answered, in part, by examining fluctuations in game attendance in Major League Baseball attributable to the characteristics of the home team’s starting pitcher. If fans prefer homegrown players, this should be reflected in higher attendance in games started by hurlers who are pitching for their original franchise, all else equal. To explore this question, this study examines every game from 1976 to 2012, or the entire post-free agency period in MLB. By building a sample of 81,695 games—one of the largest employed in the literature on baseball attendance—this study has the potential to detect the presence of a statistically significant “homegrown effect” that may be otherwise difficult to identify in smaller samples.

To examine the attendance effects of homegrown pitchers, this study will be divided into four sections. First, this paper will provide a brief review of the relevant literature on the relationship between starting pitcher characteristics and game attendance. Next, this study will describe the data and econometric techniques utilized in the analysis. Following a presentation of the results, this paper will conclude with a discussion regarding the implications of the paper and ideas for future research.

Background

Within the extensive research on Major League Baseball attendance over the past 40 years, within-season analyses have explored a wide array of topics ranging from promotions (McDonald and Rascher, 2000) to interleague play (Butler, 2002).[1] Recognizing the potential impact of each game’s starting pitchers, many of these studies have, at the very least, included variables to control for the two pitchers’ respective performance levels (typically estimated by career and season wins and losses) and race/ethnicity (e.g., Hill, Madura and Zuber, 1982; Bruggink and Eaton, 1996; Rascher, 1999; McDonald and Rascher, 2000; Butler, 2002). While the results generally point to a relationship between starting pitcher characteristics and game attendance, the analyses are mixed in regard to which particular variables matter.

While the studies above were not particularly focused on the association between starting pitchers and game attendance, this relationship represented the fundamental question examined in Ormiston (2012). This paper developed a pair of metrics (described later) to estimate the star power of each team’s starting pitcher. Controlling for each pitcher’s season-weighted wins above replacement (WAR) and a host of other factors, the study demonstrated a positive relationship between the star power of both the home and visiting teams’ starting pitcher and game attendance, an effect that was deemed statistically significant with 99.9 percent confidence. Given that a similar relationship was found between the home team’s starting pitcher’s WAR and attendance, the results of the paper, at minimum, provide nearly unmistakable evidence that pitcher characteristics can significantly influence game attendance.

In regard to a potential relationship between homegrown players and game attendance, the prior literature is conspicuously incomplete. The closest possible study has been that of Yamamura (2011), who analyzed individual game data from the Japanese Central and Pacific Leagues from 2005–7. In order to develop a measure of the star power of each team’s starting pitcher, the study interacted each pitcher’s salary with an indicator variable denoting whether a player was originally from the town in which the game was held. The results demonstrated a positive relationship between attendance and starts of a hometown (but not necessarily homegrown) star pitcher for the home team, but no effect for such a pitcher hurling for the visiting team. While generalizing results from the Japanese Leagues to Major League Baseball is problematic, it does provide some initial evidence suggesting that fans may have a greater attachment to players who they see as “one of their own.”

Data and Model

To examine the relationship between individual game attendance and the homegrown status of each team’s starting pitcher, this study utilizes game log data available at Retrosheet.[2] These game logs provide a substantial amount of information on every Major League Baseball contest since 1876, including the game’s date, location and, starting in 1914, game attendance and the names of both starting pitchers. Given that the expiration of the reserve clause has fundamentally altered player movement in baseball, this study analyzes game logs from 1976–2012 to isolate any homegrown pitcher attendance effect within MLB’s free agency era. Nearly every game from this period is included in the sample, with the only exceptions being those played at a stadium other than a team’s normal park in a given season and games in which attendance is not available.[3] The resulting sample features 81,695 games, one of the largest samples employed in the academic literature on MLB attendance. This is advantageous in that the effect of homegrown players on attendance, if it exists, is hypothesized to be minute and would thus be difficult to detect without an adequately sized sample.

To analyze the effect of homegrown pitchers on game attendance, this paper proposes the following model (Figure 1), with i and t denoting the home team and season, respectively, and g representing the particular game within home team i’s season t.[4]
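Because the variables entering the model are defined over the rest of this section, the specification can be sketched compactly as below. This is a reconstruction from the prose rather than a reproduction of Figure 1: the logged dependent variable is an assumption suggested by the percentage interpretation of the coefficients in the Results section, and the symbols β2, γ, X and ε are labels introduced here purely for illustration.

```latex
\ln(\mathit{ATTENDANCE}_{itg}) = \alpha_{it}
  + \beta_1\,\mathit{HHOMEGROWN}_{itg}
  + \beta_2\,\mathit{VHOMEGROWN}_{itg}
  + \mathbf{X}_{itg}'\boldsymbol{\gamma}
  + \varepsilon_{itg}
```

In this sketch, X collects the remaining controls (experience, rookie status, star power, WAR, standings, opponent, month, and day-by-time indicators), and the team-season intercept absorbs everything that does not vary from game to game within a home team’s season.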

One of the defining characteristics of this model is that it utilizes a team-season fixed effects approach, including an indicator variable (αit) for each home team’s season (e.g., an indicator variable that flags all 81 home games of the 1982 St. Louis Cardinals). These team-season indicator variables are used to capture all game-invariant characteristics of an individual team’s season, including prior years’ success, ticket prices, marketing plans and the home city’s population and economic well-being. As a result, the estimated coefficients on the other variables can be interpreted to represent the attendance fluctuation within a particular team-season attributable to game-variant characteristics.[5]

The critical variables to this paper are HHOMEGROWNitg and VHOMEGROWNitg, which are indicator variables equaling one if the home or visiting team’s starting pitcher, respectively, is a “homegrown” player for that team. While the primary consideration in this paper is to test for the presence of a “homegrown effect” of the home team’s starting pitcher (i.e., β1 not equal to 0), the homegrown status of the visiting team’s starting pitcher is also included. While it is hypothesized that this latter effect will be negligible, it is possible that fans identify a player with a certain team (e.g., Andy Pettitte and the New York Yankees) and, thus, value that connection in making a decision whether to attend a specific game.

To formally define a “homegrown” player, this study compares the player’s current team to the organization with which he made his Major League debut.[6] This approach is favored over the use of a player’s original organization (via draft or amateur signing) given the hypothesis that fans will most likely see a player as “one of their own” once he appears in their team’s big-league uniform. As an example, while John Smoltz was originally drafted by the Detroit Tigers in 1985, he will likely always be remembered as an Atlanta Braves player given that he made his debut with the club in 1988 and spent the first 21 years of his career in Atlanta. The one limitation to this approach, however, is that it ignores trades made just after a player’s debut; for instance, while Jake Westbrook made three appearances for the 2000 New York Yankees, he was dealt 12 days after his debut to the Cleveland Indians and proceeded to spend parts of the next 11 seasons with that organization. Thus, while Indians fans may consider Westbrook as one of their own, he is identified as a homegrown Yankee in the sample.
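The debut-team rule described above amounts to a single comparison. The following is a minimal sketch in Python; the function name and the three-letter franchise codes are hypothetical and chosen only for illustration.

```python
def is_homegrown(current_team: str, debut_team: str) -> bool:
    """Return True when a pitcher starts for the franchise with which he debuted."""
    return current_team == debut_team


# Hypothetical franchise codes, used only for illustration.
print(is_homegrown("ATL", "ATL"))  # John Smoltz starting for Atlanta
print(is_homegrown("CLE", "NYY"))  # Jake Westbrook starting for Cleveland
```

Under this rule Westbrook would be flagged as homegrown only in a start for the Yankees, which is the limitation the paragraph above acknowledges.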

A number of control variables included in the model above are of particular importance in isolating the attendance effect attributable to homegrown pitchers. First, this study includes HEXPERIENCEitg and VEXPERIENCEitg. These variables represent the number of years since a player’s Major League debut for the home and visiting teams’ starting pitchers, respectively. Homegrown players are disproportionately those who have yet to reach free agency, and the inclusion of these variables separates fan preferences for young players from those for homegrown players. Second, this study includes two indicator variables, HROOKIEitg and VROOKIEitg, that equal one if the home or visiting team’s starting pitcher, respectively, is a rookie. Anecdotal evidence suggests a potential attendance premium for outstanding rookie pitchers (e.g., Dwight Gooden, Mark Fidrych); thus, these variables are important to isolate the homegrown effect from any potential rookie effect. For the purposes of this paper, a pitcher is considered a “rookie” if pitching in the year of his debut or the following year. The use of two years allows for a pitcher to receive a spot start or September call-up in one year without removing his rookie designation in the sample the following season (e.g., Fernando Valenzuela in 1980–81).
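The experience and rookie definitions above can be expressed directly in terms of the season and the debut year. This is a small sketch with hypothetical function names; it assumes experience is measured in whole calendar years.

```python
def experience(season: int, debut_year: int) -> int:
    """Years elapsed since the pitcher's Major League debut."""
    return season - debut_year


def is_rookie(season: int, debut_year: int) -> bool:
    """Rookie if pitching in the debut year or the year immediately after."""
    return 0 <= season - debut_year <= 1


# Fernando Valenzuela debuted in 1980, so both 1980 and 1981 count as rookie seasons.
for year in (1980, 1981, 1982):
    print(year, experience(year, 1980), is_rookie(year, 1980))
```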

In order to account for the star power of each team’s starting pitcher, this model includes HSTARitg and VSTARitg, which represent the age-adjusted star power estimates of the home and visiting teams’ starting pitchers, respectively, as defined by Ormiston (2012). This system estimates a player’s relative star power by taking the ratio of a linear sum of a pitcher’s accomplishments (All-Star Game appearances, post-season awards, no-hitters, and other feats) at the time of each start to his “potential experience,” or the difference between the pitcher’s age and 17.[7] With scores ranging from 0 (non-star) to 1.25 (superstar), this approach represents the best available objective system to gauge pitchers’ star power at the time of each start over a long time period, as it allows stardom to follow a parabolic pattern over time and meets a priori expectations about the relative star power of pitchers in the free agency era. As an example, this system rates Dwight Gooden, Fernando Valenzuela and Tom Seaver as having reached the highest peaks of stardom in the free agency era, an outcome that seems reasonable on its face.[8] Thus, while this approach has its weaknesses, the overwhelmingly strong relationship between pitchers’ star power and game attendance found in Ormiston (2012) necessitates the inclusion of such a variable in order to isolate the attendance effect of the homegrown nature of a pitcher from the attendance influence attributable to his relative stardom.[9]
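The ratio behind the age-adjusted measure can be sketched as follows. The point total is taken as an input because the weights attached to individual accomplishments are not given in this section; the function name is hypothetical. The single-point example mirrors the Results section, which equates a 0.10 increase with a 27-year-old throwing a no-hitter or making an All-Star team.

```python
def age_adjusted_star_power(accomplishment_points: float, age: int) -> float:
    """Accomplishment points divided by potential experience (age minus 17)."""
    potential_experience = age - 17
    if potential_experience <= 0:
        raise ValueError("age must exceed 17 for potential experience to be positive")
    return accomplishment_points / potential_experience


# One point of accomplishment for a 27-year-old pitcher (27 - 17 = 10 years).
print(age_adjusted_star_power(1, 27))
```

Because the denominator grows every year, a pitcher who stops accumulating honors sees his score decline, which is how the measure lets stardom rise and fade over a career.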

To measure a pitcher’s performance, the model includes HWARitg and VWARitg, or the wins above replacement for the home and visiting teams’ starting pitcher in the past year, respectively. While WAR data are available only on a season-by-season basis, using a pitcher’s current-season WAR would introduce considerable endogeneity. As a result, season-weighted WAR, using the prior season and current season, is utilized.[10] Ormiston (2012) demonstrated that the home team’s starting pitcher’s WAR had a positive and statistically significant relationship with attendance, suggesting that fans may be reacting to a perceived increase in the probability of a home team’s victory. Conversely, Ormiston (2012) found no relationship between the WAR of the visiting team’s starting pitcher and gameday attendance.[11]

Beyond the characteristics of each game’s starting pitchers, a number of variables are included to control for other factors that might affect game attendance. The variables SERIES_HSTARitg and SERIES_VSTARitg represent the average age-weighted star power of the other starting pitchers for the home and visiting team, respectively, in a given series; since teams typically play three (or more) games in a row against a given opponent, a fan’s decision to attend a particular game may depend on the star power of the other starting pitchers in a series. To measure the competitive position of the home team, HOVER500itg represents the number of games that the team is over (or under) .500. To capture game uncertainty, OVER500DIFitg represents the difference in the games over .500 between the home and visiting teams. To capture the potential playoff implications of a game, HGBDIVitg and VGBDIVitg denote the number of games back of the home and visiting teams in their respective divisions. To further control for the characteristics of the visiting team, the model includes three lags indicating whether the team won the World Series (VCHAMPSitg) or made the playoffs (VPLAYOFFSitg) in the last three seasons. To account for the potential rivalry between teams, INTRADIVitg denotes intradivision games whereas INTERLGitg represents interleague games. Finally, VTEAMitg denotes a series of indicator variables to represent each possible visiting club for that particular game; this is included because some visiting teams (e.g., the New York Yankees, Chicago Cubs) may draw a considerably larger crowd regardless of their on-field success or lack of rivalry with the home team.

Beyond the characteristics of the two teams and the series itself, a number of final controls are added. First, MONTHitg represents six indicator variables denoting the month of the game (combining March with April and September with October). DAYitg × TIMEitg interactions divide games into one of 14 categories based on the day of the week and whether the game is a day game or a night game. To account for special situations, OPENERitg denotes a team’s home opener, DHitg controls for a traditional doubleheader, and NEWSTADitg identifies the two teams in the sample that opened a new stadium in the middle of a season (1989 Toronto Blue Jays and 1999 Seattle Mariners).
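The calendar controls above reduce to two small mappings: a six-way month bucket and a 14-way day-by-time category. The sketch below uses Python’s standard `datetime` module; the function names and category labels are hypothetical, and the day/night flag is assumed to be supplied with each game record.

```python
from datetime import date

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def month_bucket(game_date: date) -> int:
    """Fold March into April and October into September, leaving six buckets."""
    if game_date.month == 3:
        return 4
    if game_date.month == 10:
        return 9
    return game_date.month


def day_time_category(game_date: date, is_night: bool) -> str:
    """Combine day of week (7) with day/night (2) into one of 14 labels."""
    part = "night" if is_night else "day"
    return f"{DAY_NAMES[game_date.weekday()]}-{part}"


print(month_bucket(date(2012, 3, 28)), day_time_category(date(2012, 7, 4), True))
```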

In terms of an estimation procedure, the use of standard regression modeling (i.e., ordinary least squares) would result in biased coefficients given that sellouts produce right-censored attendance data. In other words, while demand for tickets at a particular game (such as a game started by Fernando Valenzuela in 1981) may be sky-high, any estimated impact of a particular variable will be constrained, or censored, by the stadium’s capacity; this results in downward-biased coefficients. As a result, censored-normal fixed effects regression is utilized, with sellouts denoting the censored observations.[12]
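To make the censoring idea concrete, the sketch below writes out a generic right-censored normal log-likelihood. It is an illustration of the technique rather than the estimation routine used in this study: the function names are hypothetical, the fitted values `mu` are assumed to already include any team-season effect, and no optimization step is shown. An uncensored game contributes the usual normal density, while a sold-out game contributes only the probability that latent demand was at least as large as the observed figure.

```python
import math


def normal_logpdf(z: float) -> float:
    """Log density of the standard normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_survival(z: float) -> float:
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def censored_normal_loglik(y, mu, sigma, censored):
    """Log-likelihood for right-censored normal data.

    y: observed outcomes; mu: fitted means; sigma: error standard deviation;
    censored: booleans, True where the observation is a sellout.
    """
    total = 0.0
    for y_i, mu_i, is_censored in zip(y, mu, censored):
        z = (y_i - mu_i) / sigma
        if is_censored:
            total += math.log(normal_survival(z))
        else:
            total += normal_logpdf(z) - math.log(sigma)
    return total
```

Ordinary least squares would treat the capped figure as the true demand, which is the source of the downward bias described above.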

In the absence of a published list of MLB sellouts, however, the identification of sellouts in the data is a difficult task. While one’s first instinct would be to categorize a game as a sellout when attendance meets or exceeds a stadium’s capacity, this rarely occurs. In fact, during the two longest established sellout streaks in MLB over the last 20 years, those of the Cleveland Indians (1995–2001) and the Boston Red Sox (2003–13), the teams combined to meet or exceed stadium capacity fewer than 100 times despite combining for over 1,000 official sellouts. To remedy the absence of sellout data, this study identifies a game as a sellout if attendance reaches at least 90 percent of stadium capacity. This likely leads to erroneously labeling some games as sellouts; however, higher thresholds, such as 95 percent, fail to identify a significant number of known sellouts.[13] Using the 90 percent threshold, the sellout variable identifies 24 team-seasons in which the home team is considered to have sold out every game.[14] Since the censored-normal regression approach considers all observations of these team-seasons as censored data, these team-seasons are excluded from the data, resulting in a revised sample size of 79,751 games.
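The sellout rule and the exclusion of fully censored team-seasons can be sketched as below. The record layout (dictionaries with `team`, `season`, `attendance` and `capacity` keys), the function names and the sample figures are all hypothetical and serve only to illustrate the logic.

```python
def is_sellout(attendance: int, capacity: int, threshold: float = 0.90) -> bool:
    """Flag a game as a sellout when attendance reaches the capacity threshold."""
    return attendance >= threshold * capacity


def drop_fully_censored(games):
    """Remove every team-season in which all home games are flagged as sellouts."""
    by_team_season = {}
    for game in games:
        key = (game["team"], game["season"])
        by_team_season.setdefault(key, []).append(game)

    kept = []
    for group in by_team_season.values():
        if not all(is_sellout(g["attendance"], g["capacity"]) for g in group):
            kept.extend(group)
    return kept


# Made-up records: team AAA sells out both games, team BBB only one.
sample = [
    {"team": "AAA", "season": 2000, "attendance": 39000, "capacity": 40000},
    {"team": "AAA", "season": 2000, "attendance": 38000, "capacity": 40000},
    {"team": "BBB", "season": 2000, "attendance": 39000, "capacity": 40000},
    {"team": "BBB", "season": 2000, "attendance": 20000, "capacity": 40000},
]
print(len(drop_fully_censored(sample)))
```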

Results

Before addressing the regression estimates of the attendance model, Table 1 provides a summary of the data. The results demonstrate that homegrown pitchers account for slightly more than half of the games started (50.7 percent) for the home team. Perhaps more importantly, Table 1 suggests that average attendance at games started by non-homegrown pitchers (27,215) is significantly higher than games started by homegrown hurlers (26,398). However, Table 1 also reflects the systematic difference between such pitchers, as non-homegrown pitchers are typically bigger stars and more experienced. As a result, while the summary in Table 1 casts doubt on a homegrown pitcher attendance premium, a more detailed analysis is needed given systematic differences between pitchers and, likely, between the home teams that employ them.

To examine this outcome more carefully, the regression estimates of the censored-normal fixed effects attendance model are presented in Table 2. The results of Model 1 indicate that the homegrown status of a game’s starting pitchers, all else equal, has no statistically significant effect on game attendance. After controlling for pitchers’ star power, season-weighted wins above replacement, experience and rookie status, the results estimate that a homegrown starting pitcher for the home team is expected to decrease attendance by 0.26 percent; however, such an effect is not statistically significant at any reasonable confidence level. The homegrown status of the visiting team’s starting pitcher also fails to be statistically significant with at least 95 percent confidence, although its estimated effect is likewise negative and larger in magnitude (-0.61 percent).

While the results of Model 1 fail to demonstrate a statistically significant relationship between homegrown pitchers and game attendance, other characteristics of the starting hurlers are estimated to be powerful predictors of game attendance. The star power of both starting pitchers strongly influences game attendance, as the effects are positive and statistically significant with 99.9 percent confidence. Given the construction of the age-adjusted star power measure, the coefficient suggests that an additional 0.10 added to a pitcher’s star power total (the equivalent of a 27-year-old hurling a no-hitter or being named to an All-Star team) is expected to increase attendance by 0.857 percent for the home team’s pitcher and 0.906 percent for the visiting team’s pitcher. Given a crowd of 25,000, these results demonstrate that each additional 0.10 of star power equates to roughly an additional 225 fans, an outcome consistent with the findings of Ormiston (2012).

In addition to the star power of both teams’ starting pitchers, the results of Model 1 demonstrate that the recent performance of the starting pitcher for the home team, but not the visiting team, significantly increases game attendance. The results suggest that, for each additional win above replacement in the past year for the home team’s starting hurler, attendance is expected to increase by 0.47 percent, or an additional 118 fans given a crowd of 25,000. While the effect is statistically significant with 99.9 percent confidence for the home team’s pitcher, the results fail to demonstrate a significant relationship between the visiting team’s starting pitcher’s WAR and game attendance, an outcome possibly due to its deleterious effect on the probability of a home team’s victory. The results of Model 1 also indicate a positive relationship between the experience of both teams’ pitchers and game attendance, but the effect is minute (less than 0.1 percent for each additional year in the majors) and fails to be statistically significant with 95 percent confidence. Finally, the results demonstrate that rookie pitchers inspire a modest increase in game attendance; holding all else equal, it is estimated that a rookie pitcher for the home club will boost attendance by 1.11 percent, an effect that is statistically significant with 95 percent confidence. While the effect of a rookie hurler for the visiting club also demonstrates a positive relationship, its statistical significance falls just outside the boundaries of a 95-percent, two-sided confidence test.
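The fan counts quoted in these results follow from applying each percentage effect to a reference crowd of 25,000. The short sketch below reproduces that arithmetic; the function name is hypothetical, and treating the percentage as a simple proportional change is an approximation that is adequate for effects this small.

```python
def extra_fans(percent_change: float, crowd: int = 25_000) -> float:
    """Convert a percentage change in attendance into a head count."""
    return crowd * percent_change / 100.0


effects = {
    "home star power (+0.10)": 0.857,
    "visiting star power (+0.10)": 0.906,
    "home WAR (+1 win)": 0.47,
    "home rookie starter": 1.11,
}
for label, pct in effects.items():
    print(f"{label}: {extra_fans(pct):.1f} additional fans")
```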

Beyond the characteristics of the starting pitchers, the other control variables in the model are of expected sign and reasonable magnitude, and most are statistically significant, an unsurprising result given the large sample used in this study. The results of Model 1 indicate that attendance increases substantially with the home team’s record (an expected 1.20 percent increase for every game over .500) and with an improved place in the division standings. The coefficient on the difference in games over .500 between the home and visiting team is negative and statistically significant, suggesting that games featuring a relative mismatch will draw fewer fans compared to a closely matched game. The results of Model 1 demonstrate that fans respond favorably to interleague play, intradivision games, home openers, doubleheaders, recent success by the visiting club, and situations where other star pitchers are starting in a series (possibly indicating important games or more star-laden clubs overall). The coefficients on the visiting team indicator variables are suppressed for space reasons, but the results predictably suggest that the New York Yankees, Los Angeles Dodgers, Chicago Cubs, and Boston Red Sox have the largest estimated positive influence on game attendance as visitors. Finally, attendance is estimated to follow expected patterns in regard to month (peaking June–August) and the respective day and time of a game (weekends produce the highest attendance). Overall, the results of Model 1 match a priori expectations about the determinants of game attendance in Major League Baseball, thereby lending credibility to the model used to estimate the attendance effects of the starting pitchers.

Returning to the primary question of this study, the results of Model 1 suggest that homegrown pitchers fail to increase attendance overall. However, there are two concerns with the specification of this model. First, given that homegrown pitchers are disproportionately inexperienced and not established in the majors, the negative coefficients on the homegrown variables may be due, in part, to the effects of multicollinearity; as an example, when the rookie variables are removed from the model, the magnitude of the homegrown coefficients declines precipitously.[15] Second, this overall approach ignores the possibility that fans may respond to the homegrown status of only certain types of pitchers; it would be unreasonable to expect that unremarkable, yet homegrown, starting pitchers would inspire an increase in attendance. If an attendance premium does exist for homegrown players, it is likely only among hurlers who have connected with a home team’s fan base in such a way, either through their stardom or their longevity, that fans identify such players as “one of their own.” These possibilities, however, are not adequately represented in the specification of Model 1, as it is plausible that the negligible attendance effect of thousands of unremarkable homegrown starting pitchers is drowning out the statistical relationship between the homegrown nature of starting pitchers and game attendance.

To address the possibility that fans only respond to the homegrown status of star pitchers, Model 2 builds upon the first model by adding an interaction between the homegrown variable and the age-adjusted star power of the home and visiting teams’ starting pitchers. This approach tests whether fans have a greater attachment to star pitchers who are playing for their original MLB team when compared to star pitchers who have been acquired from outside the organization, all else equal. The results, however, fail to demonstrate a statistically significant attendance premium attributable to homegrown star pitchers. Model 2 suggests that fans respond to the star power of the home team’s starting pitcher (a 0.755 percent increase in attendance for every 0.10 increase in stardom) with an additional attendance premium of 0.205 percent if the star pitcher is homegrown. While this outcome is seemingly suggestive of a positive “homegrown effect” of star pitchers, the coefficient fails to be statistically significant at any reasonable level; when combined with the negative value of the overall homegrown coefficient (β = -0.0039) that also fails to be statistically significant, the results are rather devoid of evidence supporting a homegrown player effect for star players or otherwise. A similar conclusion can be reached when examining the magnitude and lack of statistical significance corresponding to the homegrown variables of the visiting team’s starting pitcher.

To examine the hypothesis that fans develop a stronger attachment to homegrown pitchers who stay with their original MLB team for a prolonged period, Model 3 adds to the previous specification by including an interaction term between the homegrown variable and the number of years since the hurler’s Major League debut. At first glance, the results imply that extended tenure by a homegrown pitcher is predicted to decrease attendance. For non-homegrown starting pitchers, the results suggest that an additional year of experience induces a 0.11 percent increase in game attendance, an effect that is statistically significant with 95 percent confidence. For homegrown starters, an additional year of experience is predicted to decrease attendance by 0.14 percent (0.11 minus 0.25), with both the main and interaction effects on experience being statistically significant with at least 95 percent confidence. This negative relationship between longevity and attendance for homegrown pitchers is partially offset by a positive, albeit not statistically significant, coefficient on the homegrown variable itself (β = 0.0067) and a positive and statistically significant coefficient on the homegrown-star power interaction term (β = 0.0459).
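The experience effects reported for Model 3 combine a main effect with an interaction, which can be written out as a marginal effect. The notation below is introduced here for illustration (the subscripted β labels are not the paper’s own), and the logged dependent variable is an assumption consistent with the percentage interpretation of the coefficients.

```latex
\frac{\partial \ln(\mathit{ATTENDANCE})}{\partial\,\mathit{HEXPERIENCE}}
  = \beta_{\mathit{exp}} + \beta_{\mathit{hg}\times\mathit{exp}}\,\mathit{HHOMEGROWN}
  = \begin{cases}
      0.11\% & \text{if non-homegrown} \\
      0.11\% - 0.25\% = -0.14\% & \text{if homegrown}
    \end{cases}
```

The homegrown case simply adds the interaction coefficient to the main effect, which is where the 0.14 percent decline per additional year comes from.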

While a prima facie interpretation of the results of Model 3 suggests that fans may tire of seeing a homegrown pitcher again and again over the course of his career, a deeper analysis into the data casts doubt on this conclusion. First, the deterioration of the rookie coefficients between Models 2 and 3, for both home and visiting team starting pitchers, implies the presence of multicollinearity when the interaction of homegrown status and pitcher experience is included. This is unsurprising given that almost all rookies, by definition of the variables, are homegrown pitchers with one year or less of experience. As such, it is suspected that part of the positive rookie premium found in the first two models is captured in the negative homegrown-experience coefficient (i.e., higher attendance when experience is low); these suspicions are strengthened by the negative and unexpectedly statistically significant coefficient on the homegrown-experience interaction variable for the visiting team’s starting pitcher. In addition, the retrospective nature of the star power variable (pitchers only accumulate “points” after they win an award, even if the public already views them as a star) may lead to significant underestimation of the star power of pitchers early in their careers. As an example, Mark Fidrych is perceived by this scoring system to have minimal star power (ranging from 0.0 to 0.25) in 1976 despite the fact that during the height of Fidrych’s popularity that season (July 11–September 3), the Detroit Tigers averaged 40,713 fans per home game in the rookie right-hander’s starts and just 18,072 in games started by someone else. If the star power variable is mismeasured, then the positive attendance effects of such young players will be captured in the positive coefficients of the rookie variables and the negative coefficients of the homegrown-experience interactions, especially since most young star pitchers are homegrown.[12]

To alleviate the specification concerns attributable to young star pitchers, Model 4 in Table 3 re-estimates the attendance model but removes all games started by home team starting pitchers with six or fewer years of experience. Limiting the sample to veteran hurlers and presenting only the relevant variables in the table, the results of Model 4 fail to uncover any statistically significant relationship between the home team’s starting pitcher’s homegrown status and game attendance. While the coefficient on the homegrown variable is negative (β = -0.0289), the effect fails to be statistically significant at any reasonable level. Further, the two interactions with homegrown status also fail to be statistically significant at any practical level, suggesting that, among veteran pitchers, there is no evidence that fans prefer homegrown pitchers regardless of their star power or experience. While it is possible that there is a threshold effect undetectable with this abbreviated sample (that fans develop a positive attachment to players in their first few years with the club, with no appreciable difference in attendance in years beyond that), the results nevertheless cast considerable doubt on the viability of the negative homegrown effects found in Model 3 and suggest that the statistically significant interaction terms in the full-sample analysis are due, in part, to underlying specification issues in the model.

Discussion

While baseball fans have celebrated players who have remained true to one team, city and fan base over the course of their careers, the question of whether fans prefer homegrown players has been left to anecdotal evidence. This study attempts to answer this question by examining fluctuations in game-to-game attendance patterns in Major League Baseball from 1976–2012 that were attributable to the homegrown status of each game’s starting pitchers. Using one of the largest samples of games employed in the academic literature, the results of this paper demonstrate that while fans do respond to certain characteristics of the home team’s starting pitcher, there is reason to be skeptical of the hypothesis that fans actually prefer homegrown pitchers, all else equal.

While the results failed to uncover a persistent, statistically significant relationship in the data, it is nevertheless hoped that this study sparks additional research on how the characteristics of a team’s roster can influence game attendance. For example, one of the stronger results in Models 1 and 2 suggests that, all else equal, rookie pitchers for the home team increase attendance by 1.1 to 1.2 percent (or about 288 fans given an average crowd of 25,000). The magnitude and statistical significance of this effect were surprising. While these results could be due to the specification issues in the model described above, it could also be that this effect is driven by substantial increases in attendance in games started by hyped prospects (e.g., Stephen Strasburg) or a select few rookie pitchers (e.g., Fidrych, Valenzuela, Hideo Nomo) and that most rookie pitchers have little effect otherwise. Future research is encouraged to take a closer look at the potential existence, and distribution, of the attendance effects of rookie hurlers.

In future applications of this study, researchers are cautioned against assuming that the results of this paper imply that fans do not prefer homegrown hitters. It is likely that offensive players represent more constant fixtures on a particular team given that they are generally in the lineup for every game and may be more beloved because of this constancy. Further, the longevity of star pitchers on a given team seems to be far shorter than that of star hitters, a relationship issue that may affect fan attachment and, thus, attendance. For example, of the top 100 pitchers in MLB history ranked by career wins above replacement, only two hurlers who have appeared in a game since 1976—Jim Palmer and Mariano Rivera—played exclusively for one organization throughout their careers. In contrast, of the top 100 hitters ranked by career WAR who played since 1976, 16 players remained with their original organization throughout their careers, including some who are arguably the most beloved players in their respective franchises’ histories (e.g., Ripken, Gwynn, Brett).17 Thus, while there is anecdotal evidence supporting fans’ greater appreciation of “loyal” offensive superstars, the lack of similar pitchers over the last 40 years renders comparisons between hitters and pitchers difficult at best. As such, future research is encouraged to examine fan preferences for homegrown hitters—especially those who have achieved a particular level of stardom and organizational tenure—given that this paper may have limited applicability in addressing the attendance effect of homegrown offensive players.

RUSSELL ORMISTON is an assistant professor of economics at Allegheny College in Meadville, Pennsylvania. He studies sports economics, labor economics and human resource management and can be contacted at rormisto@allegheny.edu.

Sources

Maury Brown, “How Sports Attendance Figures Speak Lies,” Forbes, published on May 25, 2011, www.forbes.com/sites/sportsmoney/2011/05/25/how-sports-attendance-figures-speak-lies.

Thomas H. Bruggink and James W. Eaton, “Rebuilding Attendance in Major League Baseball: The Demand for Individual Games,” in Baseball Economics: Current Research, ed. John Fizel, Elizabeth Gustafson, and Lawrence Hadley (Westport, CT: Praeger, 1996), 9–31.

Michael R. Butler, “Interleague Play and Baseball Attendance,” Journal of Sports Economics 3 (2002): 320–34.

James Richard Hill, Jeff Madura, and Richard A. Zuber, “The Short-Run Demand for Major League Baseball,” Atlantic Economic Journal 10 (1982): 31–35.

Robert J. Lemke, Matthew Leonard, and Kelebogile Tlhokwane, “Estimating Attendance at Major League Baseball Games for the 2007 Season,” Journal of Sports Economics 11 (2010): 316–48.

Mark McDonald and Daniel A. Rascher, “Does Bat Day Make Cents? The Effect of Promotions on the Demand for Major League Baseball,” Journal of Sport Management 14 (2000): 8–27.

James W. Meehan, Jr., Randy A. Nelson, and Thomas V. Richardson, “Competitive Balance and Game Attendance in Major League Baseball,” Journal of Sports Economics 8 (2007): 563–80.

Russell Ormiston, “Attendance Effects of Star Pitchers in Major League Baseball,” Journal of Sports Economics via OnlineFirst, October 2012, DOI: 10.1177/1527002512461155.

Daniel A. Rascher, “A Test of the Optimal Positive Production Network Externality in Major League Baseball,” in Sports Economics: Current Research, ed. John L. Fizel, Elizabeth Gustafson, and Lawrence Hadley (Westport, CT: Praeger, 1999), 27–45.

Eiji Yamamura, “Game Information, Local Heroes, and Their Effect on Attendance: The Case of the Japanese Baseball League,” Journal of Sports Economics 12 (2011): 20–35.

Notes

1. There is an important methodological distinction between various types of attendance studies. First, some research papers have explored the determinants of season attendance (i.e., across-season analyses) while others examine the influences of game attendance (i.e., within-season analyses). This distinction is important for a number of reasons. First, analyses of season attendance likely suffer from greater omitted variable bias given year-to-year changes in a city’s population, economy, or other social dynamics affecting the region. Second, some determinants of ticket sales may only be detectable in either an across-season (e.g., ticket prices) or within-season (e.g., a fireworks promotion) approach.

2. The Retrosheet database represents the foundation of Baseball-Reference.com, and the two sites represent the standard bearers for data among baseball researchers. The Retrosheet game-by-game database can be found at www.retrosheet.org/gamelogs/index.html.

3. These special cases involve games moved to neutral sites due to inclement weather, temporary stadium construction, or other reasons (e.g., games played outside the US and Canada). This also excludes the “home games” played by the Montreal Expos in San Juan, Puerto Rico.

4. The logarithm of attendance is utilized given the presence of positive, or right, skewness of game attendance.

5. While it is recognized that individual game promotions (e.g., fireworks, giveaways) and within-season variable pricing schemes may influence game attendance, such information is not available on a game-by-game basis over the duration of the years included in the data.

6. In situations where a pitcher is on a club that has moved cities—such as the Expos moving from Montreal to Washington—that player is no longer considered to be “homegrown” since that initial connection between player and fans will be in the former city.

7. In more detail, the numerator of the star power variable equals the linear sum of the number of times a pitcher has been named to the All-Star Game, the number of Cy Young awards won, the number of Most Valuable Player awards won, the number of no-hitters started, the number of All-Star Game MVP awards, the number of post-season MVP awards, whether the pitcher won the Rookie of the Year and whether the pitcher had won 300 games. The denominator equals a pitcher’s age (as of July 1st of the given year) minus 17.

8. As an example of this system to meet a priori expectations of star power, this method scores the following pitchers as having reached the top 10 highest peaks of star power during the free agency period: Dwight Gooden (1986), Fernando Valenzuela (1982), Tom Seaver (1978), Roger Clemens (2005), Justin Verlander (2012), Randy Johnson (2004), Catfish Hunter (1976), Pedro Martinez (2002), Greg Maddux (1998), and Roy Halladay (2011).

9. One particular weakness of using player awards and accomplishments as a measure of a player’s star power is that they are typically awarded after a player has achieved a particular level of stardom, creating a short-term lag between the public’s likely recognition of a player as a star and when he accrued star “points.” In other words, while Dwight Gooden’s astounding 1984 rookie season likely attracted fans’ attention early in the season, his star score did not register until his appearance in the 1984 All-Star Game.

10. Using only current-season WAR would lead to endogeneity bias, especially in early-season starts; in essence, this would suggest fans would choose to attend a pitcher’s start in April based on his success later in that season. Given the fallacy of that logic and the lack of updated game-to-game WAR values for starting pitchers over the course of an individual season, season-weighted WAR is included. To calculate this, let n represent the share of the current season already played (expressed as a fraction between 0 and 1) and t denote the current season. Then, for any given pitcher, season-weighted WAR = $n \cdot \mathrm{WAR}_{t} + (1-n) \cdot \mathrm{WAR}_{t-1}$. Therefore, in early-season starts (low n), most of a pitcher’s WAR will be reliant on his previous season’s success; in late-season starts (high n), the pitcher’s WAR will become increasingly dependent on current-season success.

11. To compute the star and wins above replacement data, information on award winners and WAR were drawn from Baseball-Reference.com. Data on no-hitters were located on Retrosheet’s Web site. All-Star Game information was drawn from MLB.com.

12. Censored-normal fixed effects regression has been utilized by a number of papers in the attendance literature, including Meehan, et al. (2007), Lemke, et al. (2010), and Ormiston (2012).

13. As an example of the inadequacy of the 95-percent stadium capacity threshold to identify a sellout, consider the Boston Red Sox’s sellout streak from 2003–12. Of Boston’s 729 official sellouts included in the data during that time, 249 of those games featured attendance figures that fell between 90–95 percent of the capacity of Fenway Park. In contrast, not a single game in that sellout streak featured an official number that fell below 90 percent capacity. For more information on why attendance figures less than 100 percent capacity represent sellouts in Major League Baseball, see Brown (2011).

14. These 24 team-seasons include the Boston Red Sox (2004–12), Chicago Cubs (2004–5, 2008), Cleveland Indians (1996, 1998–2000), Colorado Rockies (1996), Minnesota Twins (2010), Philadelphia Phillies (2010–12), and San Francisco Giants (2010–12). Despite the Cleveland Indians’ known sellout streak from 1996–2001, the 1997 season featured one game (August 12, 1997) where published reports of attendance set it at 32,992, significantly less than the listed stadium capacity and usual attendance.

15. Alternative specifications of the model featured positive and negative estimates for the homegrown coefficient for the home team’s starting pitcher; however, this effect was always small in magnitude and was never statistically significant in any specification.

16. Within the full sample, there were 1,012 games started where the home team’s pitcher had six or fewer years of experience and a star power value of 0.3 or above. Homegrown pitchers started 973 of those games (96.1 percent) spanning 39 different hurlers. In contrast, non-homegrown pitchers started just 39 such games (3.9 percent) encompassing four different starters.

17. Those hitters are: Jeff Bagwell, Johnny Bench, Craig Biggio, George Brett, Tony Gwynn, Derek Jeter, Chipper Jones, Barry Larkin, Edgar Martinez, Cal Ripken, Brooks Robinson, Mike Schmidt, Alan Trammell, Lou Whitaker, Carl Yastrzemski, and Robin Yount.
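
The star power score in note 7 and the season-weighted WAR in note 10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's code: the class and function names are hypothetical, each award or accomplishment is assumed to contribute one point to the numerator (as the "linear sum" wording suggests), and the example values are invented for demonstration rather than drawn from any actual pitcher.

```python
from dataclasses import dataclass


@dataclass
class Accolades:
    """Counts of the accomplishments listed in note 7 (hypothetical container)."""
    all_star_selections: int = 0
    cy_young_awards: int = 0
    mvp_awards: int = 0
    no_hitters: int = 0
    all_star_game_mvps: int = 0
    postseason_mvps: int = 0
    rookie_of_the_year: bool = False
    won_300_games: bool = False


def star_power(accolades, age_on_july_1):
    """Sum of accomplishments divided by (age as of July 1 minus 17)."""
    if age_on_july_1 <= 17:
        raise ValueError("age must exceed 17 for the denominator to be positive")
    numerator = (
        accolades.all_star_selections
        + accolades.cy_young_awards
        + accolades.mvp_awards
        + accolades.no_hitters
        + accolades.all_star_game_mvps
        + accolades.postseason_mvps
        + int(accolades.rookie_of_the_year)  # assumed to count as one point
        + int(accolades.won_300_games)       # assumed to count as one point
    )
    return numerator / (age_on_july_1 - 17)


def season_weighted_war(n, war_current, war_previous):
    """Blend current and prior season WAR: n * WAR_t + (1 - n) * WAR_{t-1}."""
    if not 0.0 <= n <= 1.0:
        raise ValueError("n must be a fraction between 0 and 1")
    return n * war_current + (1.0 - n) * war_previous


# Invented example: a 21-year-old with two All-Star selections,
# one Cy Young award, and a Rookie of the Year award.
example = Accolades(all_star_selections=2, cy_young_awards=1, rookie_of_the_year=True)
print(star_power(example, age_on_july_1=21))

# Invented example: one quarter of the season played,
# 4.0 WAR this season and 2.0 WAR last season.
print(season_weighted_war(0.25, war_current=4.0, war_previous=2.0))
```

Because the denominator grows with age, the same set of accomplishments yields a higher score for a younger pitcher, and the weighting on n shifts the WAR measure from last season's performance toward the current season's as the year progresses.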
