This article was written by Joe Gray
This article was published in Spring 2011 Baseball Research Journal
The most striking facet of the Chicago Cubs’ long-term underachievement has been the team’s lack of World Series success. In early 1909, following the Cubs’ back-to-back triumphs of 1907 and 1908, it would have been unthinkable to all but the most pessimistic of fans that 102 years would elapse without a single additional grand prize. After stripping the strike-shortened 1994 season, which featured no World Series, from this run of disappointment, a 101-season sequence remains with no championship for the North Siders.[fn]At the point that the 1994 season was brought to an abrupt halt, the Cubs were propping up the National League’s newly formed five-team Central Division with a 49–64 record and no realistic hope of a postseason berth.[/fn] During this stretch of 101 deflations, the team has fallen at the final hurdle seven times.[fn]I was disappointed to learn that “101 damnations” had already been used as a pun, including by the Chicago Tribune in late 2009 (thus counting the 1994 season), and so I settled on some wordplay that was weaker but at least original.[/fn] The last of these World Series defeats occurred back in 1945: a 4–3 reverse, at the hands of the Detroit Tigers, represents the Cubs’ only post-1908 championship loss in which the series went the distance.[fn]The Cubs had forced Game Seven in dramatic fashion: the winning run in Game Six was plated on a two-out double from Stan Hack in the bottom of the 12th.[/fn]
But just how improbable is the Cubs’ run of failure in the Fall Classic? This brief article has three aims: first, to present numbers that help to put the 101-year drought into context; second, to explore the extent to which league expansion may have hurt the Cubs; and third, to highlight some important overriding considerations in addressing problems of this nature.
To simplify the calculations for the purposes of concentrating on the salient points, it is assumed in the first half of this article that all teams have an equal chance of winning the World Series at the beginning of each season. The more realistic scenario—of teams having gradated probabilities of success that fluctuate from season to season—is discussed in the second half of the piece.
HOW NOT TO ANSWER THE QUESTION
A relatively easy, but incorrect, method of quantifying the probability of a run of failure like the Cubs’ would be to calculate the chance that a given team would fail to win the World Series in 101 straight attempts. In order to do this, it would be necessary to multiply together 101 numbers, each one representing the probability of failure in a particular year.[fn]For each year, this would be calculated as: (number of teams – 1) / number of teams.[/fn] Based on the changing league structure that the Cubs have played in since 1909, the chance of a 101-year string of World Series failures calculated by this method is 0.0046, or just over 1 in 220.[fn]Sixteen teams had a shot at the World Series each year between 1909 and 1960, 18 in 1961, 20 between 1962 and 1968, 24 between 1969 and 1976, 26 between 1977 and 1992, and 28 in 1993 and between 1995 and 1997. Thirty franchises have competed since 1998.[/fn] If this number gave us a true indication of the probability of the Cubs’ run, even an ardent skeptic might consider believing in a curse.
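The naive product described above can be sketched in a few lines of Python. This is only an illustration under the equal-chance assumption, using the league sizes listed in the footnote; the names LEAGUE_SIZES and failure_probabilities are hypothetical and do not come from the original spreadsheet. The result should land in the same region as the 0.0046 quoted above, although small differences are possible because the details of the original calculation are not given.

```python
from math import prod

# Franchise counts by span of seasons, taken from the footnote above.
LEAGUE_SIZES = [
    (range(1909, 1961), 16),
    (range(1961, 1962), 18),
    (range(1962, 1969), 20),
    (range(1969, 1977), 24),
    (range(1977, 1993), 26),
    (range(1993, 1998), 28),
    (range(1998, 2011), 30),
]

def failure_probabilities(first=1909, last=2010, skipped=(1994,)):
    """Per-season chance that one named team does NOT win, assuming equal odds.

    1994 is skipped by default because no World Series was played that year.
    """
    probs = []
    for years, n_teams in LEAGUE_SIZES:
        for year in years:
            if first <= year <= last and year not in skipped:
                probs.append((n_teams - 1) / n_teams)
    return probs

probs = failure_probabilities()
naive = prod(probs)  # multiply the 101 per-season failure probabilities
print(len(probs), "seasons; naive probability:", round(naive, 4))
```

As the next section explains, this figure answers a narrower question than the one actually being asked, because the team and the starting year were chosen after the fact.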
So why is this way of estimating the probability incorrect? The problem with the method is that we have specified the team and the years in question after the event. Our calculations must take into account the possibility of any team failing to win a championship in at least 101 straight attempts, beginning in any season between 1903 (the first year of the Fall Classic) and 1909. This is because, in the context of the probability calculations, there is nothing special about the Chicago Cubs or the year 1909.[fn]This thinking could be extended with the argument that nothing is “special” about baseball and that the calculations should take into account other major sports with a history of crowning teams as champions over a period of at least 101 seasons; this article, however, is comfortably rooted in the context of one sport, baseball, and it thus seems reasonable to restrict the calculations in this way.[/fn] Had this run of failure been experienced by the Detroit Tigers, the Pittsburgh Pirates, or any other current franchise in existence in the first decade of the 20th Century, I would still be writing this article. And had the drought started in 1906, say, I could have been writing about a 101-year run back in early 2008.
WHAT A BETTER APPROACH LOOKS LIKE
If the method of devising the probability described in the previous section could be adjusted in order to account for multiple possible teams and drought-beginning years, the long-hand calculations would become extremely cumbersome. Thus, a computer model was developed for the purpose of this article.[fn]The software used for the model was Microsoft Excel.[/fn] The model was used to “re-run history” 100,000 times and track the number of iterations in which one or more of the 16 franchises in existence since the first decade of the 20th Century had a run of at least 101 World Series failures starting in 1903, 1905, 1906, 1907, 1908, or 1909 (no World Series was contested in 1904).
Among the simulations carried out, 8.7% featured one or more teams with a run at least as bad as that of the real-life Chicago Cubs. So instead of just over 1 in 220, as calculated by the erroneous method first described, the probability of a “Chicago Cubs,” up to the end of the 2010 season, is a little over 1 in 12. Thus, the persistent failure of the North Siders represents an improbable happening, but not an implausible one.
The simulations took into account actual expansions in team numbers over the history of the World Series. Since more teams now contend for the championship at the start of a season, it is less likely that any given one can emerge victorious. It is possible to quantify the effect that MLB’s expansions have had on the chances of seeing a sequence of sustained failure like that of Wrigley Field’s residents by re-running history with a modified pattern of league size.
Among 100,000 simulations run with no league expansion at all, only 3.0% had one or more teams with a sequence of failures running at least 101 seasons. Thus, the growth in team numbers that has taken place in the Major Leagues appears to have made it approximately three times as likely that there would be a “Chicago Cubs.”
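For readers who would like to experiment, the “re-run history” idea can be approximated with a short Monte Carlo sketch. The model for this article was built in Microsoft Excel, so the Python below is not that model; it is a minimal re-implementation under the same equal-chance assumption, and the function names (contested_seasons, has_long_drought, drought_frequency) are hypothetical. Because only 106 World Series seasons were contested from 1903 to 2010, a run of at least 101 failures starting between 1903 and 1909 is the same thing as any drought of 101 or more seasons for one of the 16 original franchises.

```python
import random

def contested_seasons(expansion=True):
    """League size for each World Series season from 1903 to 2010."""
    sizes = []
    for year in range(1903, 2011):
        if year in (1904, 1994):      # no World Series in these years
            continue
        if not expansion or year <= 1960:
            n = 16
        elif year == 1961:
            n = 18
        elif year <= 1968:
            n = 20
        elif year <= 1976:
            n = 24
        elif year <= 1992:
            n = 26
        elif year <= 1997:
            n = 28
        else:
            n = 30
        sizes.append(n)
    return sizes

def has_long_drought(sizes, rng, min_run=101, originals=16):
    """True if any original franchise goes min_run straight seasons without a title."""
    last_win = [-1] * originals       # season index of each original team's latest title
    for i, n in enumerate(sizes):
        champion = rng.randrange(n)   # equal chance for every team that season
        if champion < originals:
            if i - last_win[champion] - 1 >= min_run:
                return True
            last_win[champion] = i
    # Droughts that are still running after the final season.
    return any(len(sizes) - lw - 1 >= min_run for lw in last_win)

def drought_frequency(sizes, trials=20_000, seed=2011):
    """Share of simulated histories containing at least one Cubs-like drought."""
    rng = random.Random(seed)
    hits = sum(has_long_drought(sizes, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    print("with expansion:   ", drought_frequency(contested_seasons(True)))
    print("without expansion:", drought_frequency(contested_seasons(False)))
```

With enough trials, the two frequencies should fall in the neighborhood of the 8.7% and 3.0% reported above, subject to ordinary sampling noise.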
FLUCTUATING AND GRADATED SUCCESS PROBABILITIES AMONG TEAMS
As noted at the start of this brief article, the values calculated above are based on the assumption that all teams had an equal chance of winning the World Series at the start of each season. In order to get a more accurate estimate than the ballpark figure of 8.7% of the improbability of what has unfolded with the Cubs, it would be necessary to build in realistic fluctuations in season-by-season success probabilities across teams. This would include, but not be limited to, periods of relative weakness for expansion teams in their early years and periods of relative strength for one or more “dynasty teams.”
In order to properly incorporate the fluctuations at a season-by-season level, a highly sophisticated model is needed, not least because the probability of success each year is related not only to the probability of success in preceding seasons but also to the actual outcome for each team.[fn]A Markov chain model could be constructed that incorporated these factors in a simulation, but the parameters that guided the fluctuations would need to be carefully researched to ensure that the results were reflective of what we might expect to see in reality.[/fn] (“Success breeds success,” in more concise but hackneyed wording.[fn]The reverse-standings draft order counters this to a certain degree, and further complicates matters.[/fn]) Building such a model would be excessive for the humble purposes of this article, but it is worthwhile to at least test the effect of basic fluctuation patterns and simple periods of sustained weakness and strength.
Dynasty teams. It can be assumed that dynasty teams—most famously, the New York Yankees—are not simply a quirk of random fluctuations, particularly given the relationship between magnitude of financial backing and probability of success. It is thus meaningful to explore the effect of incorporating dynasty effects into the model. One way to do this is to build multipliers, or “Dynasty Factors,” into the success probabilities. For example, Dynasty Factors of 4.0/3.0/2.0 would mean that the best team has four times the success probability of non-dynasty teams, the second-best team has three times the success probability, and the third-best team has double that probability.[fn]In this example, for a 16-team league, the probabilities of winning the championship are approximately 18.2% for the strongest team, 13.6% for the second-strongest team, 9.1% for the third-strongest team, and 4.5% for all other teams.[/fn] In the simplest case, with these example Dynasty Factors of 4.0/3.0/2.0 in effect for the duration of the simulation (i.e. all 106 seasons), the chance of seeing a Cubs-like run grows to 24.3%. Softening the Dynasty Factors to 2.5/2.0/1.5 changes this value to 14.5%. Restoring the Dynasty Factors of 4.0/3.0/2.0 but dividing history into three eras—so that the first three dynasty teams are different from the second trio of dynasty teams, and all six are different from the third trio—yields a value of 15.7%.[fn]In the scenario, the first and last eras were 35 seasons in length and the middle era was 36 seasons.[/fn] Finally, having three-era Dynasty Factors of 2.5/2.0/1.5 gives an output of 10.9%.
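The arithmetic behind the Dynasty Factors can be illustrated with a short Python sketch. It assumes, consistent with the footnote above, that each factor acts as a relative weight and that the weights are normalized so the probabilities sum to one; the function name championship_probabilities is hypothetical.

```python
def championship_probabilities(n_teams, dynasty_factors=(4.0, 3.0, 2.0)):
    """Turn relative weights into title probabilities for one season.

    Dynasty teams get the listed factors; every other team has weight 1.
    """
    weights = list(dynasty_factors) + [1.0] * (n_teams - len(dynasty_factors))
    total = sum(weights)
    return [w / total for w in weights]

probs = championship_probabilities(16)
for label, p in zip(("strongest", "second", "third", "each other team"), probs):
    print(f"{label}: {p:.1%}")
```

For a 16-team league the weights sum to 22, which gives the 18.2%, 13.6%, 9.1%, and 4.5% quoted in the footnote. In a simulation, a champion could then be drawn each season with a weighted choice in place of a uniform one.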
Sustained relative weakness of expansion teams. Another variation to the model worth testing is building in a phase of gradual improvement, up to the level of an average team, for expansion franchises. With a 15-year period for expansion teams to reach the success probability of a typical established franchise, the chance of seeing a run like that of the Cubs works out as 7.0%, which is less than the basic model’s output of 8.7%. This makes sense, because the only teams in the reckoning for a 101-year drought are the original 16 franchises, and they all benefit from the reductions in the expansion teams’ success probabilities.
Of course, in the one iteration of baseball history that has actually played out (namely, real life), some expansion teams have performed notably well within the early years of the franchise. The New York Mets won in 1969, year eight, while the Florida Marlins claimed two championships in their first 11 seasons. We cannot be certain whether this is a quirk of history or whether there exists some underlying reason why we should expect this type of phenomenon. If it is the latter, then that could be incorporated into the model as an alternative to the adjustment described above.
Natural cycles of strength and weakness. Overlaid on any long-term patterns of dynasty-team strength or expansion-team weakness will be the shorter-term cycles of ups and downs experienced by every team. These are a consequence of many factors, including the pulses in farm-system quality that result from the periodic strategy of trading away young talent to gain rapid enhancement of the Major League roster. Setting up the teams in the model with staggered, eight-year cycles of waxing and waning in which there is a 50% increase in success probability (relative to an average team) at the peak of the cycle, and a 50% reduction at the trough of the cycle, yields a chance of seeing a run like that of the Cubs of 8.6%, just fractionally less than the output of the basic model.[fn]For a team beginning the cycle at the peak, the adjustments to success probability, relative to that of an average team, are as follows: in year 1, +50%; year 2, +25%; year 3, no adjustment; year 4, -25%; year 5, -50%; year 6, -25%; year 7, no adjustment; year 8, +25%.[/fn] Making the eight-year cycles more extreme by having peak-year adjustments of +100% and trough-year adjustments of –100% brings the value down to 8.1%.
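The eight-year cycle in the footnote above translates naturally into a lookup table. The Python sketch below is illustrative only, and two details are assumptions because the article does not specify them: the staggering scheme (here, team t simply starts t years into the cycle) and the intermediate steps of the more extreme cycle (here, every adjustment is doubled through the hypothetical scale parameter).

```python
# Adjustments relative to an average team, from the footnote above,
# for a team that begins the cycle at its peak.
CYCLE = [0.50, 0.25, 0.0, -0.25, -0.50, -0.25, 0.0, 0.25]

def cycle_weight(season_index, phase, scale=1.0):
    """Relative strength of a team in a given season.

    phase is the team's starting offset in the cycle; scale=2.0 is an
    assumed way to produce the +100% / -100% variant.
    """
    adjustment = CYCLE[(season_index + phase) % len(CYCLE)] * scale
    return 1.0 + adjustment

def season_probabilities(season_index, n_teams, scale=1.0):
    """Title probabilities for one season with staggered cycles (assumed scheme)."""
    weights = [cycle_weight(season_index, t % len(CYCLE), scale)
               for t in range(n_teams)]
    total = sum(weights)
    return [w / total for w in weights]

probs = season_probabilities(season_index=0, n_teams=16)
print("team at peak:  ", probs[0])   # weight 1.5 out of a total of 16
print("team at trough:", probs[4])   # weight 0.5 out of a total of 16
```

Note that with the doubled scale a team at the trough has a weight of zero, meaning that it cannot win the championship in that season.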
The balance of these effects. Of the various adjustments described in this section, dynasty effects have the greatest potential to modify the output of the model. Therefore, if strong, sustained dynasties are an almost inevitable feature of baseball history, it could well be that the value of 8.7% is something of an underestimate, and that the North Siders’ drought is even less of an anomaly.
Once it is realized that the Chicago Cubs are not a special team, and—to a lesser extent—that 1909 is not a special year, it can be seen that what might at first be considered an implausible happening is merely improbable. I do not know whether this is any consolation for long-suffering Cubs fans, but it might at least offer some reassurance that a curse is not the only possible explanation.[fn]The author does not believe in curses.[/fn]
JOE GRAY, who co-chairs SABR’s Project Cobb Chartered Community, is a British-based baseball statistician with a particular interest in the game’s history. He is author of the 2010 book “What about the Villa? Forgotten figures from Britain’s pro baseball league of 1890” (http://fineleaf.co.uk).