Brad Lidge (Trading Card DB)

Pitch Perfect: Re-examining Brad Lidge’s Performance in 2008 Using Win Probabilities Added and Leverage Index

This article was written by Jim Sweetman

This article was published in The National Pastime: From Swampoodle to South Philly (Philadelphia, 2013)

When Brad Lidge announced his retirement late in 2012, some commentators pointed to the home run he gave up to Albert Pujols in the 2005 NLCS as the defining moment of his career. To Phillies fans, though, the image of Lidge on his knees, arms raised in triumph after recording the final out of the 2008 World Series defines that magical season, if not Lidge’s career as a Phillie. The association of this image with fond memories of 2008 is even more appropriate when one considers Lidge’s overall contributions that year: a perfect record in 41 regular-season and seven postseason save opportunities. He became only the second player in major-league history to record at least 40 saves in a single season without blowing an opportunity.1

And yet, while Lidge was perfect in save opportunities, he was not completely perfect. On July 25, he entered the game in the ninth with the Phillies down, 1–0, and gave up five runs to the Braves without recording an out. A month later he gave up two earned runs to the Mets in a game he entered with the score tied. He gave up single runs in six games in which he earned a save, thanks to a cushion provided by the potent Phillies offense. Surely, no full-time reliever has ever had a completely perfect season, and these few blemishes offer only the slightest hint at the decline in performance Lidge would experience in subsequent seasons. However, they do raise the question of how much better Lidge’s performance was than that of other championship Phillies closers.

For the purposes of this exercise, we define a championship closer as a pitcher who earned half or more of the Phillies’ saves in a year in which they won at least a division title since the adoption of the save as an official statistic in 1969: Ron Reed (1978), Tug McGraw (1980), Al Holland (1983), Mitch Williams (1993), Brett Myers (2007), Ryan Madson (2011), and Lidge himself in 2008, 2009 and 2010.2 For additional context we’ve added Steve Bedrosian, who earned a Cy Young Award as Phillies closer in 1987, becoming the first pitcher in team history to record 40 saves in one season.

Using traditional statistics, Lidge’s 2008 season stacks up very well against the comparison group (see Table 1). His ERA is second only to McGraw’s in 1980 and more than a run better than the group as a whole. He won only two games, but that was two more than he lost; the rest of the group averaged roughly even win-loss totals. Also, while Lidge never took a loss or blown save in 2008, on average, the others in the championship closers group failed to close out games left in their hands through a loss or blown save roughly nine times per year.



Player Season W L S BS ERA
Reed 1978 3 4 17 2 2.24
McGraw 1980 5 4 20 5 1.46
Holland 1983 8 4 25 7 2.26
Bedrosian 1987 5 3 40 8 2.83
Williams 1993 3 7 43 6 3.34
Myers* 2007 5 5 21 3 2.87
Lidge 2008 2 0 41 0 1.95
Lidge 2009 0 8 31 11 7.21
Lidge 2010 1 1 27 5 2.96
Madson 2011 4 2 32 2 2.37
Average (ex 2008) 3.8 4.2 28.4 5.4 3.05

* Throughout this paper, Myers’ stats reflect only relief appearances and exclude his 3 starts in 2007.
Source: Analysis of data from


However, traditional statistics have well-known limitations in measuring the performance of relief pitchers. For example, ERA does not capture a pitcher’s ability to keep inherited runners from scoring. Wins can be added and losses avoided when a pitcher is backed by an offense that has the ability to win games with late runs. A save can be awarded to a pitcher facing a single batter with a one-run lead, or one allowing two runs when entering with a three-run lead. This partly explains McGraw’s average number of losses and blown saves despite a significantly better-than-average ERA.

These statistics can also have limited utility in comparing pitchers if there are significant differences in how the pitchers were used, a factor that has varied over time. In the period covered by our sample, Phillies closers were used for progressively shorter appearances. Reed and McGraw averaged almost 1 2/3 innings per appearance and Holland and Bedrosian averaged about an inning and a third, while Williams and most of his twenty-first century counterparts averaged just under an inning per appearance (see Table 2). This trend was not limited to the Phillies: The top 15 closers in saves in 1980 averaged 1.58 innings pitched per appearance, while the top 15 in 2009 averaged 0.98.



Player Season G IP IP/G
Reed 1978 66 108.7 1.65
McGraw 1980 57 92.3 1.62
Holland 1983 68 91.7 1.35
Bedrosian 1987 65 89.0 1.37
Williams 1993 65 62.0 0.95
Myers* 2007 48 53.3 1.11
Lidge 2008 72 69.3 0.96
Lidge 2009 67 58.7 0.88
Lidge 2010 50 45.7 0.91
Madson 2011 62 60.7 0.98

*relief appearances only
Source: Analysis of data from


Tug McGraw (Trading Card DB)

In addition to differences in how closers were used within games, the number of games they appeared in varied, in part due to missed time during the season or changing roles within the pitching staff. McGraw injured himself trying to learn a side-arm pitch and went on the 21-day DL in the middle of the 1980 season [source: TSN, 7-19-80, p. 27]. Holland opened the 1983 season on the DL with a sore shoulder [source: TSN, 4-18-83, p. 14]. In 2007, Myers was the Opening Day starter, but made only two more starts before being moved to the bullpen, first as a set-up man, then as closer to replace the injured Tom Gordon [source: PDN, 4-19-2007, p. 94 & 5-3-2007 p. 88]. He was subsequently injured in late May [source: Phila. Inquirer, 5-26-2007, p. E1] and then returned as closer in July. Lidge missed time with a sprained knee in 2009 and opened the 2010 season on the DL after elbow surgery [source: transactions]. Ryan Madson started 2011 as a set-up man for Jose Contreras, then took over as closer in late April [source: Phila Inquirer 4-25-2011 p E12] before missing a month with a hand injury [source: PDN 7-16-2011 p. 29].

More recently adopted statistics can control for some of these differences by normalizing performance measures on a per-inning or per-batter basis. For example, WHIP (walks plus hits per inning pitched) and OBP (opponents’ on-base percentage) both aim to measure how well pitchers perform their primary task, keeping batters off base. Using these measures, Lidge’s 2008 season looks about average: not as good as McGraw or Holland, but not nearly as bad as the notorious tightrope walkers like Williams or the 2009 version of Lidge (see Table 3). Rather than helping to explain Lidge’s success in 2008, these stats raise a new question: How could he have been so successful at closing out games when he was just average at keeping runners off the basepaths?



Player Season WHIP OBP
Reed 1978 1.01 0.273
McGraw 1980 0.92 0.250
Holland 1983 1.01 0.254
Bedrosian 1987 1.20 0.297
Williams 1993 1.61 0.368
Myers* 2007 1.20 0.294
Lidge 2008 1.23 0.297
Lidge 2009 1.81 0.398
Lidge 2010 1.23 0.300
Madson 2011 1.15 0.296
Average (ex 2008) 1.24 0.303

*relief appearances only
Source: Analysis of data from


One reason for the apparent inconsistency is that, like the traditional pitching statistics, normalized statistics also ignore the reliever’s responsibility for keeping inherited runners from scoring. They are also similar in that they treat all at-bats equally. However, even casual fans recognize that some game situations are more risky than others. A pitcher facing a batter with a one-run lead and the bases loaded with no outs is in a far more critical situation than one facing the same batter with two outs, a three-run lead, and no one on.

A number of attempts have been made over the years to quantify these contextual differences by analyzing the results of actual games to determine the likelihood that a play would affect a game’s final outcome, taking into account the specific game situation, such as score differential, number of outs, and number and position of baserunners.3 Using the results for a specific situation, we can then go a step further by comparing the probability that a team will win in that situation with the same team’s win probability after the next event (at-bat, stolen base, etc.). The result of this comparison, called Win Probability Added (WPA), can then be allocated to the players involved in the play, mainly the pitcher and batter. For example, on June 6, 2008, Lidge entered a game in Atlanta in the bottom of the 10th with a two-run lead and struck out the first batter of the inning, Brian McCann. Before that at-bat, historical data tell us that the Phillies had a 90% probability of winning the game. After the strikeout, that probability rose to 95.1%. As a result, we can credit Lidge with a WPA of 0.051 for recording the out and increasing his team’s chances of winning, and assign McCann a WPA of -0.051 for making the out and decreasing his team’s chances. Tom Tango took this analysis a step further by calculating the likelihood that a positive or negative change would occur in each circumstance to come up with a Leverage Index (LI), an estimate of the criticality of each game situation.4 LIs below 0.8 are considered low leverage, and make up most game situations. An LI of between 1.6 and 2.9 is considered high leverage and LIs of 3.0 or more are classified as very high leverage.5
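The June 6 example can be worked through in a few lines of Python. This is only a minimal sketch of the arithmetic described above; the function name `win_probability_added` is hypothetical, and the two win probabilities are the ones quoted in the text.

```python
def win_probability_added(wp_before: float, wp_after: float) -> float:
    """Change in the pitching team's win probability across one game event."""
    return wp_after - wp_before


# Figures quoted for the McCann at-bat: 90% before, 95.1% after.
pitcher_wpa = win_probability_added(0.900, 0.951)
# The batter is charged the mirror-image amount.
batter_wpa = -pitcher_wpa

print(round(pitcher_wpa, 3), round(batter_wpa, 3))  # 0.051 -0.051
```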

WPA and LI, then, provide additional context about the situations in which a player is used. These measures are particularly relevant for a relief pitcher because of his team’s ability to control the situation in which he is brought into and taken out of a game. Using the data published by,6 we can examine the situational use and performance of our group of closers to better understand their relative contributions, and potentially explain the apparent inconsistency between Lidge’s performance as measured by normalized and traditional statistics.

For example, one possible way a pitcher could appear good at saving games even when he’s only average at keeping runners off base is if he came into games in primarily low-leverage situations. Tango’s Leverage Index can shed some light on whether this is the case. As shown in Table 4, the average LI when Lidge entered a game in 2008 was roughly the same as that of most of his contemporaries, at least since 1993. The differences in first-batter LI are greater with earlier closers, who entered more games earlier than the ninth inning; in fact, half of Reed’s appearances in 1978 began in the seventh inning or earlier. Remember, too, that most of the closers in our sample did not spend the entire season in that role, so they made more appearances in non-save situations. For a more precise comparison, we can look at only those instances where a pitcher entered the game at the start of the ninth inning (which Lidge did in 2008 in all but five of his 72 appearances). In those cases, the differences in LI when they entered are even less pronounced. If anyone could be considered to have pitched in less stressful closing situations, it’s more likely to be Myers or Madson than Lidge in 2008.



Player      Season  Avg LI entering game  Avg LI entering game at start of 9th
Reed        1978    1.07                  1.63
McGraw      1980    1.84                  1.93
Holland     1983    2.02                  1.92
Bedrosian   1987    1.76                  1.62
Williams    1993    1.62                  1.67
Myers*      2007    1.56                  1.53
Lidge       2008    1.61                  1.63
Lidge       2009    1.52                  1.57
Lidge       2010    1.70                  1.68
Madson      2011    1.51                  1.49

*relief appearances only
Source: Analysis of data from


Since Lidge apparently did not have an advantage stemming from the situations at the start of his appearances, we should next look to see if he had any situational advantages during the rest of his time on the mound. To do so, we examine the LI for every game event7 while a pitcher was on the mound and assign each to one of the four leverage categories (low to very high). By comparing the number of events in each category for each pitcher to the total events for that pitcher, we can determine the percentage of events at each of the leverage categories. As shown in Chart 1, the distribution of high- and low-leverage events for Lidge in 2008 was not significantly different from the distribution in the other seasons in the sample. Specifically, 55% of the events while Lidge was pitching in 2008 were either low- or average-leverage situations, and 20.4% were very high-leverage situations. In comparison, an average of 54.6% of all situations in the sample were low or average leverage, while 20.7% were very high-leverage situations. Here, too, there does not appear to be a significant situational advantage to Lidge’s performance.
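The categorization step can be sketched in Python as follows. This is an illustrative sketch, not the procedure actually used for Chart 1. The low, high, and very high cutoffs follow the ranges given earlier in the article; the "average" band (0.8 up to 1.6) is an assumption inferred from the gap between them, and the function names and sample LI values are hypothetical.

```python
from collections import Counter

CATEGORIES = ("low", "average", "high", "very high")


def leverage_category(li: float) -> str:
    """Assign a Leverage Index value to one of four categories."""
    if li < 0.8:
        return "low"
    if li < 1.6:
        return "average"  # assumed band; the article does not spell it out
    if li < 3.0:
        return "high"
    return "very high"


def leverage_distribution(li_values):
    """Percentage of events falling in each leverage category."""
    li_values = list(li_values)
    if not li_values:
        raise ValueError("need at least one event")
    counts = Counter(leverage_category(li) for li in li_values)
    return {cat: 100.0 * counts[cat] / len(li_values) for cat in CATEGORIES}


# Hypothetical LI values for ten events, for illustration only.
sample_li = [0.4, 0.7, 1.2, 1.5, 1.9, 2.4, 3.1, 4.6, 0.9, 2.0]
distribution = leverage_distribution(sample_li)
print(distribution)
```

Applied to every event a pitcher faced in a season, the resulting percentages are the kind of figures compared in the paragraph above.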

Since Lidge’s relative success cannot be significantly explained by the situations in which he found himself, we should next look to his individual contributions. We can gain some insight into a pitcher’s effectiveness by comparing those instances where he increased his team’s chances of winning (resulting in a positive WPA) to those where he decreased its chances of winning (resulting in a negative WPA). As shown in Table 5, as a group, our championship closers were generally successful, averaging roughly two positive contributions to each negative contribution. In fact, Lidge’s 2008 ratio of 1.81:1 is slightly below the group average. Like many of the other measures discussed previously, however, this comparison does not include any assessment of the relative importance of the contributions made, either positive or negative.



Player              Season  Positive WPA events  Negative WPA events  WPA ratio (positive:negative)
Reed                1978    319                  126                  2.53:1
McGraw              1980    266                  96                   2.77:1
Holland             1983    276                  112                  2.46:1
Bedrosian           1987    264                  122                  2.16:1
Williams            1993    176                  126                  1.40:1
Myers*              2007    156                  82                   1.90:1
Lidge               2008    205                  113                  1.81:1
Lidge               2009    174                  134                  1.30:1
Lidge               2010    135                  76                   1.78:1
Madson              2011    177                  77                   2.30:1
Total (ex 2008)             1943                 951                  2.04:1

*relief appearances only
Source: Analysis of data from
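The ratios in Table 5 are simple quotients of the two event counts. A minimal Python check, using the Lidge 2008 and group totals from the table (the function name `wpa_event_ratio` is hypothetical):

```python
def wpa_event_ratio(positive_events: int, negative_events: int) -> float:
    """Number of positive-WPA events per negative-WPA event."""
    return positive_events / negative_events


lidge_2008 = wpa_event_ratio(205, 113)
group_ex_2008 = wpa_event_ratio(1943, 951)

print(f"Lidge 2008: {lidge_2008:.2f}:1, group (ex 2008): {group_ex_2008:.2f}:1")
# Lidge 2008: 1.81:1, group (ex 2008): 2.04:1
```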


To better assess performance in the context of the importance of each situation, we can add the WPA assigned for each individual play instead of just counting them. Because events that are more important to the outcome of a game receive a proportionally higher WPA score, players with a higher WPA can be shown to have had a greater impact on their team’s success. Conversely, players who racked up higher negative WPA totals caused more damage to their team’s chances. Table 6 shows the results of this analysis: a quantitative measure of each pitcher’s relative contributions to his team’s chances of winning. Overall, even successful closers showed a fairly close balance between positive and negative events, with a few more positive WPA points than negative. While this finding initially seems inconsistent with the data in Table 5, the difference between the data sets can be attributed to the fact that squandering a team’s existing advantage generally has a greater impact on the team’s chances of winning than just maintaining that advantage. For example, on June 12, 2011, Ryan Madson entered a home game against the Cubs at the start of the ninth inning with a 4–3 lead. In that situation, the Phillies had a 92.3 percent chance to win. However many batters he let reach base along the way, the greatest net positive impact he could have over the whole inning was a 7.7 percent increase. Conversely, two months later, on August 19, Madson was pitching in Washington with a two-run lead, nobody out, and runners on first and second. He allowed a single to the next batter, driving in a run to tie the game and reducing the Phillies’ chances of winning by 19.4 percent.



Player              Season  Total positive WPA  Total negative WPA  Net WPA per 100 events
Reed                1978    7.633               -5.914              0.539
McGraw              1980    10.656              -6.487              1.567
Holland             1983    11.959              -9.827              0.772
Bedrosian           1987    14.410              -10.935             1.316
Williams            1993    10.925              -11.117             -0.109
Myers*              2007    7.137               -6.691              0.286
Lidge               2008    10.819              -5.452              2.618
Lidge               2009    8.232               -12.774             -2.610
Lidge               2010    7.307               -5.480              1.353
Madson              2011    7.987               -5.654              1.318
Average (ex 2008)           9.583               -8.320              0.585

*relief appearances only
Source: Analysis of data from


These data also reveal that Lidge’s 2008 performance differed significantly from the group’s in that his positive contributions were twice as large as his negative contributions. To control for differences in the number of events for which each pitcher was responsible, we can also average the net impact of the player’s WPA score and normalize it for a set of 100 events, in effect, estimating a percentage that the pitcher increased his team’s chances of winning for each event he was on the mound. By that measure, we see that Lidge increased the 2008 Phillies’ chances of winning by an average of 2.6 percent for every event (at-bat, stolen base, etc.) while he pitched, while the rest of the group contributed roughly 0.6 percent. It would seem on the surface, then, that Lidge’s success was due to his avoidance of negative events (i.e., retiring batters more often than letting them reach base or advance). However, that assumption is inconsistent with the data in Table 5, that show that Lidge was slightly below average in his ratio of positive to negative contributions. To resolve this discrepancy, we’ll have to dig a little deeper.
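The normalization described above can be sketched as follows. This is an illustrative Python sketch rather than the author’s actual procedure, and the function name `net_wpa_per_100` is hypothetical. The article does not state which event count serves as the denominator, so it is left as a parameter. As an observation about the published numbers only, dividing Lidge’s 2008 net WPA by his 205 positive-WPA events from Table 5 gives the 2.618 shown in Table 6, while dividing by all 318 positive and negative events gives a smaller figure.

```python
def net_wpa_per_100(total_positive: float, total_negative: float, n_events: int) -> float:
    """Net WPA scaled to a block of 100 events.

    total_negative is expected to be zero or below, as in Table 6.
    n_events is whichever event count is chosen as the denominator.
    """
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    return 100.0 * (total_positive + total_negative) / n_events


# Lidge 2008 totals from Table 6, with two candidate denominators from Table 5.
per_100_positive_only = net_wpa_per_100(10.819, -5.452, 205)
per_100_all_events = net_wpa_per_100(10.819, -5.452, 205 + 113)

print(round(per_100_positive_only, 3))  # 2.618
print(round(per_100_all_events, 3))     # 1.688
```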

One likely explanation for the apparent disconnect between the number of negative events for a pitcher and their value is an unequal distribution of higher-than-average WPA events. Because we’re particularly interested in the role of closers, we should focus on appearances in the ninth inning or later. This subset of events is also the group most likely to include high-impact events. Within our sample, events in the ninth inning or later average a WPA of 0.006, but with a standard deviation of 0.099. Assuming a normal distribution, roughly two-thirds of the events in this group should range between 0.105 and -0.093. As a result, we can examine very high-impact events by looking at events outside of this range.
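The cutoffs follow directly from the mean and standard deviation: one standard deviation on either side of the mean. A short Python sketch of that rule is below; the function names are hypothetical, and of the three sample WPA values, 0.051 and -0.268 come from games cited in this article while 0.140 is made up for illustration.

```python
def high_impact_bounds(mean_wpa: float, sd_wpa: float, n_sd: float = 1.0):
    """Lower and upper WPA cutoffs at n_sd standard deviations from the mean."""
    return mean_wpa - n_sd * sd_wpa, mean_wpa + n_sd * sd_wpa


def classify_impact(wpa: float, lower: float, upper: float) -> str:
    """Label an event by whether its WPA falls outside the typical range."""
    if wpa > upper:
        return "high-impact positive"
    if wpa < lower:
        return "high-impact negative"
    return "typical"


# Sample mean and standard deviation quoted for ninth-inning-or-later events.
lower, upper = high_impact_bounds(0.006, 0.099)
print(round(lower, 3), round(upper, 3))  # -0.093 0.105

for wpa in (0.051, 0.140, -0.268):
    print(wpa, classify_impact(wpa, lower, upper))
```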

Using these criteria, we can analyze the percentage of events for each pitcher that had a high impact, and their cumulative effect on team winning percentage. As shown in Table 7, Lidge’s distribution of high-impact events was, in fact, uneven.



Player              Season  Total WPA, high-impact positive events  Pct. of events  Total WPA, high-impact negative events  Pct. of events  Total events, 9th inning or later
Reed                1978    1.329                                   6.40%           -3.131                                  9.90%           141
McGraw              1980    2.542                                   8.50%           -4.003                                  9.50%           201
Holland             1983    2.514                                   7.30%           -6.580                                  10.30%          233
Bedrosian           1987    4.723                                   12.50%          -6.372                                  9.40%           256
Williams            1993    4.733                                   8.90%           -6.311                                  11.10%          280
Myers               2007    1.801                                   5.80%           -3.940                                  8.50%           189
Lidge               2008    4.388                                   8.10%           -1.764                                  3.50%           310
Lidge               2009    3.280                                   6.50%           -8.938                                  10.60%          292
Lidge               2010    3.574                                   10.90%          -3.648                                  7.50%           201
Madson              2011    1.824                                   6.00%           -2.823                                  6.00%           216
Average (ex 2008)           2.924                                   8.20%           -5.083                                  9.30%


First, Lidge’s high-impact events were overwhelmingly positive. Where the group averaged roughly 8 percent high-impact positive events and roughly 9 percent high-impact negative events, Lidge recorded more than two high-impact positive events for every high-impact negative event. In other words, where the group as a whole mostly failed in the most critical points of the game, Lidge was overwhelmingly successful.

Second, the cumulative WPA of Lidge’s high-impact events was higher than the average, but the total WPA for his high-impact failures was far and away the lowest in the group. By this measure, too, we can see that whenever Lidge had a major impact on his team’s chances of winning, it was overwhelmingly positive. Compare his performance to the others with higher than average positive WPAs in high-leverage situations: They also have even higher total negative WPAs from negative events.

Alternatively, we can look at Lidge’s performance in only the events with the highest LI (> 3.0), regardless of inning. As shown in Table 8, in the most critical game situations, Lidge made positive contributions in roughly the same percentage of events as the others in the group. However, the effect of those contributions was significantly more positive than the group’s as a whole. Specifically, the rest of the group averaged an 11 percent improvement in their team’s chances of winning every time they recorded an out, but a 16 percent decrease in those chances when they did not. Lidge, in contrast, averaged a 14 percent improvement in winning percentage for each out, but only a nine percent decrease in those chances in non-out events. Thus, while he made roughly the same percentage of mistakes, his mistakes generally did less damage, and his outs were more valuable.
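The per-event averages for very high-leverage situations can be sketched in Python as below. This is a hedged illustration, not the author’s code: the function name, the `(li, wpa)` tuple layout, and the six sample events are all hypothetical.

```python
from statistics import mean


def very_high_leverage_averages(events, li_cutoff: float = 3.0):
    """Average positive and average negative WPA among events with LI above the cutoff.

    events is an iterable of (li, wpa) pairs. Returns (avg_positive, avg_negative),
    with None for a side that has no qualifying events.
    """
    selected = [wpa for li, wpa in events if li > li_cutoff]
    positives = [wpa for wpa in selected if wpa > 0]
    negatives = [wpa for wpa in selected if wpa < 0]
    avg_positive = mean(positives) if positives else None
    avg_negative = mean(negatives) if negatives else None
    return avg_positive, avg_negative


# Hypothetical (LI, WPA) pairs; only the four with LI > 3 are counted.
sample_events = [(3.4, 0.12), (4.1, 0.16), (3.2, -0.09),
                 (1.1, 0.03), (5.0, -0.11), (0.6, -0.01)]
avg_pos, avg_neg = very_high_leverage_averages(sample_events)
print(round(avg_pos, 2), round(avg_neg, 2))  # 0.14 -0.1
```

Counting the positive and negative lists instead of averaging them would give the event counts that accompany these averages.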



Player              Season  Events with LI > 3 and positive WPA  Events with LI > 3 and negative WPA  Avg positive WPA, events with LI > 3  Avg negative WPA, events with LI > 3
Reed                1978    20                                   10                                   10.52%                                -20.92%
McGraw              1980    46                                   19                                   9.45%                                 -15.05%
Holland             1983    56                                   26                                   10.40%                                -17.53%
Bedrosian           1987    60                                   27                                   11.77%                                -15.38%
Williams            1993    51                                   39                                   13.47%                                -14.90%
Myers               2007    28                                   20                                   11.50%                                -18.42%
Lidge               2008    43                                   22                                   13.54%                                -8.95%
Lidge               2009    40                                   41                                   11.95%                                -19.85%
Lidge               2010    31                                   22                                   13.79%                                -12.83%
Madson              2011    46                                   18                                   9.44%                                 -11.27%
Average (ex 2008)           42                                   24.7                                 11.33%                                -16.28%


Some specific game examples can further illustrate the point. In 2008, Lidge’s biggest mistake was during the August 27 game discussed previously. By allowing the Mets to score in a tie game, he reduced the Phillies’ chances of winning by 26.8 percent. The rest of the group made mistakes with a greater negative impact on 43 occasions. The most damaging were:

  • September 5, 2007: With the bases loaded, two out, and a two-run lead, the Phillies have an 82% chance of winning. Brett Myers allows a three-run double to Matt Diaz, resulting in a 9–8 Atlanta win.
  • September 5, 1983: Ninth inning, one out, runners on the corners, and a two-run lead at Shea, the Phillies are nearly 81 percent favorites to win. Al Holland allows a home run to George Foster, and the Mets win.
  • June 5, 2009: At Los Angeles in the bottom of the ninth with two outs, a one-run lead, and the bases loaded, the Phillies are better than 74% favorites. Lidge allows a two-run double to Andre Ethier to drop the game to the Dodgers.


While Lidge’s team-record 41 saves without blowing an opportunity was a rare event, traditional statistics are generally not up to the task of revealing how much of this success was the result of sustained excellence and how much might be attributable to other factors. More recently adopted metrics that measure the criticality of each game situation and an individual’s contribution to his team’s chances of winning can be more illustrative. These metrics reveal that, compared to other Phillies championship closers, Lidge did not gain significant advantages that weren’t related to his own performance. Instead, his accomplishment can be attributed to his own consistent success, mainly in avoiding the big mistakes that result in blown saves. As a result, his 2008 season should be recognized as a truly outstanding individual achievement.

JIM SWEETMAN is founder and host of the Broad & Pattison Phillies website. He resides in Virginia and is Assistant Director at the U.S. Government Accountability Office.



1 Eric Gagne’s 55-for-55 season for Los Angeles in 2003 was the first. Detroit’s Jose Valverde subsequently added his name to this elite list, successfully nailing down all of his 49 save opportunities in 2011.

2 No pitcher recorded at least half of the Phillies’ saves in either the 1976 or 1977 division-winning seasons.

3 For brief discussions of these efforts see Dave Studeman’s articles “The One About Win Probability” at and “WPA in the USA,” Hardball Times Baseball Annual 2007: 122–128.

4 The exact procedures for calculating the Leverage Index are outside of the scope of this paper, but can be found in Tango’s article “Crucial Situations” at

5 See Tango’s tables of LI by situation at


7 Game events are plate appearances plus any other event that can change a game’s situation without a change in batter, such as stolen bases, caught stealing, balks, etc.