This article was written by Trent McCotter
This article was published in Fall 2010 Baseball Research Journal
Do hitting streaks occur more frequently than they would if hitting were random from game to game?
In a previous article in By the Numbers, Jim Albert found that there was no significant difference in the expected and actual number of hitting streaks over individual seasons. Here the author argues that, when you aggregate all the single-season data, the result is statistically significant and constitutes valid evidence that hitting streaks are indeed more frequent than expected.
In the 2008 edition of The Baseball Research Journal, I published an article showing evidence that hitting streaks in baseball occur significantly more frequently than they would if hitting were random from game to game. I used the random permutation method to determine whether the number of hitting streaks (of lengths 5+, 10+, 15+, and 20+ games) matched what an IID (independent and identically distributed) model would produce. It turned out that it did not. Later, in the November 2008 issue of By the Numbers, Jim Albert analyzed the seasons from 2004 to 2008 using the same method that I used, but taking the seasons individually. Jim found high p-values for most of the streak totals; that is, the number of streaks in real life wasn’t significantly higher than a random permutation would produce.
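The random permutation method mentioned above can be illustrated with a minimal Python sketch. It is not the code used in either study. It assumes each player's season is stored as a list of 0/1 flags (1 for a game with at least one hit), shuffles the order of games within each player's log, and compares the real streak total with the totals from the shuffled logs. The function names, the data layout, and the default of 1,000 permutations are all hypothetical choices made for illustration.

```python
import random

def count_streaks(games, min_len):
    """Count maximal runs of consecutive hit games that are at least min_len long."""
    count, run = 0, 0
    for hit in games:
        if hit:
            run += 1
        else:
            if run >= min_len:
                count += 1
            run = 0
    if run >= min_len:  # a streak still running when the log ends
        count += 1
    return count

def total_streaks(logs, min_len):
    """Sum the streak counts over every player's game log."""
    return sum(count_streaks(games, min_len) for games in logs)

def permutation_test(logs, min_len, n_perm=1000, seed=0):
    """Return (observed total, mean total over permutations, one-sided p-value).

    Assumption: shuffling the games within each player's log stands in for
    the IID model, since it keeps each player's number of hit games fixed
    while destroying any game-to-game ordering.
    """
    rng = random.Random(seed)
    observed = total_streaks(logs, min_len)
    shuffled = [list(games) for games in logs]  # work on copies
    running_total = 0
    at_least_observed = 0
    for _ in range(n_perm):
        for games in shuffled:
            rng.shuffle(games)
        permuted = total_streaks(shuffled, min_len)
        running_total += permuted
        if permuted >= observed:
            at_least_observed += 1
    return observed, running_total / n_perm, at_least_observed / n_perm

# Hypothetical toy logs for two players, not real data.
toy_logs = [
    [1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0],
]
observed, perm_mean, p_value = permutation_test(toy_logs, min_len=3)
print(observed, perm_mean, p_value)
```

A high p-value from such a test means the real streak total is not unusual among the shuffled totals; a low one means real life produced more streaks than the shuffles typically do.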
I have two issues with Jim’s analysis and results. First, his results still show a tendency for there to be more hitting streaks in real life than we’d “expect” using a random-permutation method, even at the single-season level. Out of the 20 matched pairs that Jim generated (five years of data, with four different lengths of hitting streak for each year), 15 had a higher value for the “real-life” streak total than for the average over the permutations. And the other 5 (where the real-life total was less than the average over the permutations) were pretty close to being even. So I’d say that, even at the single-season level, there is evidence that hitting streaks of pretty much every length are more likely to occur in real life than if the games were randomly permuted.
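One rough way to gauge the 15-of-20 split is a one-sided sign test, which asks how often at least 15 of 20 pairs would favor real life if each pair were a fair coin flip. The sketch below is an illustration, not a calculation from either article, and it rests on a strong assumption: that the 20 pairs are independent. They are not fully independent, because the 5+, 10+, 15+, and 20+ totals within a season overlap, so the result overstates the evidence and should be read only as a ballpark figure. The function name is hypothetical.

```python
from math import comb

def sign_test_upper_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    favorable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favorable / 2 ** trials

# 15 of the 20 matched pairs had more streaks in real life.
p_value = sign_test_upper_tail(15, 20)
print(f"{p_value:.4f}")  # prints 0.0207
```

Under the independence assumption, a split this lopsided would arise by chance about 2 percent of the time.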
Second, even if Albert’s results didn’t show a tendency for there to be more streaks in real life than over a random permutation, I’d still have a major qualm with his method of trying to show that there is little difference between streak totals in real life versus the permutations. The qualm is that Albert split the 50 years of data that I used into single seasons and then said that there wasn’t much significance at a single-season level. But that would be the case with almost every study. The entire purpose of aggregating 50 seasons’ worth of data is to find trends that might not be as obvious at a single-season level (although, per point 1 above, I think there actually is evidence that shows some significance at the single-season level).
If we look at each season individually and say that maybe there’s a slight trend toward more hitting streaks, that wouldn’t mean much; but if almost every season showed the exact same trend, then it would be very meaningful. In other words, the entire purpose of larger sample sizes is to smoke out trends that might not be apparent on an individual sample-by-sample basis; but if almost every sample tends to show the same pattern, then we probably have something significant going on. Of course it makes sense that—in any given season—there might not be that much evidence of a trend; the trend only becomes obvious when viewed from afar, when all the seasons are added together and their similar patterns become magnified.
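The arithmetic behind this point can be sketched in a few lines of Python. The numbers are hypothetical, chosen only to show the mechanism, and the pooling formula assumes that seasons are independent and that each season's permutation totals are roughly normal. Neither assumption comes from the studies discussed here. Each season is summarized by its real streak total, the mean over permutations, and the standard deviation over permutations.

```python
from math import sqrt

def pooled_z(seasons):
    """Combine per-season (observed, perm_mean, perm_sd) tuples into one z-score.

    Assumes independent seasons, so excesses add and variances add.
    """
    excess = sum(observed - mean for observed, mean, _ in seasons)
    variance = sum(sd ** 2 for _, _, sd in seasons)
    return excess / sqrt(variance)

# Hypothetical season: 105 real streaks against a permutation mean of 100
# with a standard deviation of 5, which is one standard deviation above expected.
one_season = (105.0, 100.0, 5.0)

print(pooled_z([one_season]))                  # one season on its own
print(round(pooled_z([one_season] * 50), 2))   # fifty seasons with the same modest excess
```

A single season with this excess sits only one standard deviation above expectation, which no one would call significant. Fifty seasons with the same excess pool to about seven standard deviations, because the excess grows in proportion to the number of seasons while the standard deviation grows only with its square root.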
A version of this article appeared in By the Numbers 19, no. 3 (August 2009): 1.
Jim Albert Responds
Sabermetrics research consists of posing a good question, collecting the relevant data, and exploring the data to answer the question. In the study of streakiness, there are different questions one can pose. McCotter asks the question: Can batting results (Hit or Out) be represented by a model where individual outcomes are independent and identically distributed (the IID model)? Another question would be: Is there evidence of significant streaky hitting ability among baseball players? A third question would be: Can we classify hitters into the two types “streaky” and “non-streaky”?
Most baseball fans and statisticians know the answer to McCotter’s question. Batting results for a single player are not independent and identically distributed. So what is the point of McCotter’s analysis that shows, on the basis of 50 years of data, that batting outcomes don’t follow the IID model? Actually, his analysis says little since we already knew that the IID assumption is false.
I think it makes much more sense to ask a more interesting question where the answer is uncertain. If batters possess an ability to be streaky, what is the size of this streaky effect, and can we describe the characteristics of hitters who are “truly streaky”? To begin to answer this question, I believe that one has to check if there is an unusual streaky pattern of performance for individual seasons. If a pattern of unusual streakiness of hitters is not obvious for individual seasons, then it would seem that the size of the streaky effect is small. In my analysis of the seasons 2004 through 2008, I found that the streaky patterns were consistent with the IID model for two of the five seasons. This tells me that the size of the true streaky effects is generally small, and that conclusion is consistent with my earlier research on streakiness. It is difficult to find players who are consistently streaky from season to season, and so it is hard to separate players into the “streaky” and “non-streaky” groups.
Statisticians are not concerned about “exact” models. Instead they wish to find approximate models that are useful in understanding the main features of a dataset. The IID model is wrong, as McCotter finds, but that’s okay. Since the true streaky effects appear to be small in magnitude, one can make excellent predictions about observed streaky behavior from the IID model. For example, I believe the IID model would provide good predictions of the number of hitting streaks that exceed 10 games during the 2010 season. As I have said before, I think it is remarkable how good simple models like the IID model can be in predicting patterns of baseball hitting performance.
Trent McCotter Responds
I’ll briefly respond to Jim Albert’s rebuttal. First, nobody “knew that the IID assumption was false” until my article was published in The Baseball Research Journal. Perhaps some of us suspected it, but it was not something “known” or assumed. In fact, it was the exact opposite: every single article that has tried to calculate probabilities of streaks in baseball relied on the IID assumption being TRUE. If we all knew that it was false, then there sure were a lot of people who decided to write papers based on an assumption they already knew was wrong. Confirming that games are not randomly distributed is a big deal: after reading my paper, Steve Strogatz (a well-known lecturer and ‘stats guru’ at Cornell) canceled a huge simulation project he was running on 56-game hitting streaks. Why? If we can’t use standard probability assumptions, then we just can’t calculate probabilities. At least not meaningful ones.
Second, I agree that the effect of this “false assumption” seems small at a single-season level. But when we look at a 50-year stretch, we see that there have been many more hitting streaks than there should have been. We have seen 43 percent more twenty-game hitting streaks, and 171 percent more thirty-game hitting streaks, than we should have. Surely these differences are not “small in magnitude,” and they cannot be ignored, as Albert proposes. That the effect is small on a single-season level doesn’t mean that the effect is trivial; it just means it’s not monumental. Long hitting streaks are rare. We can’t measure them on a season-by-season basis. We must measure them over decades.
Baseball is a game of inches: Small changes can make a big difference. And the shift caused by this false assumption about how games are distributed is not something we can ignore when we calculate probabilities of streaks in baseball.
TRENT McCOTTER is a law student at the University of North Carolina at Chapel Hill.