Pitch Perfect: Re-examining Brad Lidge’s Performance in 2008 Using Win Probabilities Added and Leverage Index

August 21, 2013/in Expanded E-edition.2013-TNP /by admin

This article was written by Jim Sweetman

This article was published in The National Pastime: From Swampoodle to South Philly (Philadelphia, 2013)

When Brad Lidge announced his retirement late in 2012, some commentators pointed to the home run he gave up to Albert Pujols in the 2005 NLCS as the defining moment of his career. To Phillies fans, though, the image of Lidge on his knees, arms raised in triumph after recording the final out of the 2008 World Series, defines that magical season, if not Lidge’s career as a Phillie. The association of this image with fond memories of 2008 is even more appropriate when one considers Lidge’s overall contributions that year: a perfect record in 41 regular-season and seven postseason save opportunities. He became only the second player in major-league history to record at least 40 saves in a single season without blowing an opportunity.1

And yet, while Lidge was perfect in save opportunities, he was not completely perfect. On July 25, he entered the game in the ninth with the Phillies down, 1–0, and gave up five runs to the Braves without recording an out. A month later he gave up two earned runs to the Mets in a game he entered with the score tied. He gave up single runs in six games in which he earned a save, thanks to a cushion provided by the potent Phillies offense. Surely, no full-time reliever has ever had a completely perfect season, and these few blemishes offer only the slightest hint at the decline in performance Lidge would experience in subsequent seasons. However, they do raise the question of how much better Lidge’s performance was than other championship Phillies closers.

For the purposes of this exercise, we define a championship closer as a pitcher who earned half or more of the Phillies’ saves in a year in which they won at least a division title since the adoption of the save as an official statistic in 1969: Ron Reed (1978), Tug McGraw (1980), Al Holland (1983), Mitch Williams (1993), Brett Myers (2007), Ryan Madson (2011), and Lidge himself in 2008, 2009 and 2010.2 For additional context we’ve added Steve Bedrosian, who earned a Cy Young Award as Phillies closer in 1987, becoming the first pitcher in team history to record 40 saves in one season.

Using traditional statistics, Lidge’s 2008 season stacks up very well against the comparison group (see Table 1). His ERA is second only to McGraw’s in 1980 and more than a run better than the group as a whole. He won only two games, but that was two more than he lost; the rest of the group averaged roughly even win-loss totals. Also, while Lidge never took a loss or blown save in 2008, on average, the others in the championship closers group failed to close out games left in their hands through a loss or blown save roughly nine times per year.

TABLE 1: PHILLIES CHAMPIONSHIP CLOSERS, TRADITIONAL STATISTICS

Player	Season	W	L	S	BS	ERA
Reed	1978	3	4	17	2	2.24
McGraw	1980	5	4	20	5	1.46
Holland	1983	8	4	25	7	2.26
Bedrosian	1987	5	3	40	8	2.83
Williams	1993	3	7	43	6	3.34
Myers*	2007	5	5	21	3	2.87
Lidge	2008	2	0	41	0	1.95
Lidge	2009	0	8	31	11	7.21
Lidge	2010	1	1	27	5	2.96
Madson	2011	4	2	32	2	2.37
Average	(ex 2008)	3.8	4.2	28.4	5.4	3.05

* Throughout this paper, Myers’ stats reflect only relief appearances and exclude his 3 starts in 2007.
Source: Analysis of data from mlb.com

However, traditional statistics have well-known limitations in measuring the performance of relief pitchers. For example, ERA does not capture a pitcher’s ability to keep inherited runners from scoring. Wins can be added and losses avoided when a pitcher is backed by an offense that has the ability to win games with late runs. A save can be awarded to a pitcher facing a single batter with a one-run lead, or one allowing two runs when entering with a three-run lead. This partly explains McGraw’s average number of losses and blown saves despite a significantly better-than-average ERA.

These statistics can also have limited utility in comparing pitchers if there are significant differences in how the pitchers were used, a factor that has varied over time. In the period covered by our sample, Phillies closers were used for increasingly short appearances. Whereas Reed and McGraw averaged almost 1 2/3 innings per appearance, Holland and Bedrosian averaged about an inning and a third, and Williams and most of his twenty-first century counterparts averaged just under an inning per appearance (see Table 2). This trend was not limited to the Phillies: The top 15 closers in saves in 1980 averaged 1.58 innings pitched per appearance, while the top 15 in 2009 averaged 0.98.

TABLE 2: INNINGS PITCHED PER GAME FOR PHILLIES CHAMPIONSHIP CLOSERS

Player	Season	G	IP	IP/G
Reed	1978	66	108.7	1.65
McGraw	1980	57	92.3	1.62
Holland	1983	68	91.7	1.35
Bedrosian	1987	65	89.0	1.37
Williams	1993	65	62.0	0.95
Myers*	2007	48	53.3	1.11
Lidge	2008	72	69.3	0.96
Lidge	2009	67	58.7	0.88
Lidge	2010	50	45.7	0.91
Madson	2011	62	60.7	0.98

*relief appearances only
Source: Analysis of data from mlb.com
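
The IP/G column in Table 2 can be reproduced from the G and IP columns. The sketch below is a minimal Python illustration, assuming that the decimal innings in the table (108.7, 92.3, and so on) are rounded forms of thirds of an inning; the helper name innings_from_decimal is hypothetical, and only four rows are copied in for brevity.

```python
from fractions import Fraction

def innings_from_decimal(ip):
    """Read a rounded decimal IP (e.g. 108.7) as whole innings plus thirds.

    Assumption: .3 stands for 1/3 of an inning and .7 for 2/3.
    """
    whole = int(ip)
    tenths = round((ip - whole) * 10)
    thirds = {0: 0, 3: 1, 7: 2}[tenths]
    return whole + Fraction(thirds, 3)

# (player, season, games, innings pitched) copied from Table 2
table2 = [
    ("Reed", 1978, 66, 108.7),
    ("McGraw", 1980, 57, 92.3),
    ("Williams", 1993, 65, 62.0),
    ("Lidge", 2008, 72, 69.3),
]

# Innings pitched per relief appearance, rounded as in the table
ip_per_game = {
    (player, season): round(float(innings_from_decimal(ip) / games), 2)
    for player, season, games, ip in table2
}

for key, value in ip_per_game.items():
    print(key, value)
```

Working in exact thirds avoids the small error that comes from dividing the rounded decimal directly, although at two decimal places the result is the same for these rows.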

In addition to differences in how closers were used within games, the number of games they appeared in varied, in part due to missed time during the season or changing roles within the pitching staff. McGraw injured himself trying to learn a side-arm pitch and went on the 21-day DL in the middle of the 1980 season [source: TSN, 7-19-80, p. 27]. Holland opened the 1983 season on the DL with a sore shoulder [source: TSN, 4-18-83, p. 14]. In 2007, Myers was the Opening Day starter, but made only two more starts before being moved to the bullpen, first as a set-up man, then as closer to replace the injured Tom Gordon [source: PDN, 4-19-2007, p. 94 & 5-3-2007 p. 88]. He subsequently got injured late in May [source: Phila. Inquirer, 5-26-2007, p. E1] and then returned as closer in July. Lidge missed time with a sprained knee in 2009 and opened the 2010 season on the DL after elbow surgery [source: mlb.com transactions]. Ryan Madson started 2011 as a set-up man for Jose Contreras, then took over as closer in late April [source: Phila Inquirer 4-25-2011 p E12] before missing a month with a hand injury [source: PDN 7-16-2011 p. 29].

More recently adopted statistics can control for some of these differences by normalizing performance measures on a per-inning or per-batter basis. For example, WHIP (walks plus hits per inning pitched) and OBP (opponents’ on-base percentage) both aim to measure how well pitchers perform their primary task, keeping batters off base. Using these measures, Lidge’s 2008 season looks about average: not as good as McGraw’s or Holland’s, but not nearly as bad as those of notorious tightrope walkers like Williams or the 2009 version of Lidge (see Table 3). Rather than helping to explain Lidge’s success in 2008, these stats raise a new question: How could he have been so successful at closing out games when he was just average at keeping runners off the basepaths?

TABLE 3: PHILLIES CHAMPIONSHIP CLOSERS, NORMALIZED STATISTICS (WHIP AND OBP)

Player	Season	WHIP	OBP
Reed	1978	1.01	0.273
McGraw	1980	0.92	0.250
Holland	1983	1.01	0.254
Bedrosian	1987	1.20	0.297
Williams	1993	1.61	0.368
Myers*	2007	1.20	0.294
Lidge	2008	1.23	0.297
Lidge	2009	1.81	0.398
Lidge	2010	1.23	0.300
Madson	2011	1.15	0.296
Average	(ex 2008)	1.24	0.303

*relief appearances only
Source: Analysis of data from mlb.com

One reason for the apparent inconsistency is that, like the traditional pitching statistics, normalized statistics also ignore the reliever’s responsibility for keeping inherited runners from scoring. They are also similar in that they treat all at-bats equally. However, even casual fans recognize that some game situations are more risky than others. A pitcher facing a batter with a one-run lead and the bases loaded with no outs is in a far more critical situation than one facing the same batter with two outs, a three-run lead, and no one on.

A number of attempts have been made over the years to quantify these contextual differences by analyzing the results of actual games to determine the likelihood that a play would affect a game’s final outcome, taking into account the specific game situation, such as score differential, number of outs, and number and position of baserunners.3 Using the results for a specific situation, we can then go a step further by comparing the probability that a team will win in that situation with the same team’s win probability after the next event (at-bat, stolen base, etc.). The result of this comparison, called Win Probability Added (WPA), can then be allocated to the players involved in the play, mainly the pitcher and batter. For example, on June 6, 2008, Lidge entered a game in Atlanta in the bottom of the 10th with a two-run lead and retired the first batter of the inning, Brian McCann. Before that at-bat, historical data tell us that the Phillies had a 90% probability of winning the game. After the strikeout, that probability rose to 95.1%. As a result, we can credit Lidge with a WPA of 0.051 for recording the out and increasing his team’s chances of winning, and assign McCann a WPA of -0.051 for making the out and decreasing his team’s chances. Tom Tango took this analysis a step further by calculating the likelihood that a positive or negative change would occur in each circumstance to come up with a Leverage Index (LI), an estimate of the criticality of each game situation.4 LIs below 0.8 are considered low leverage, and make up most game situations. An LI of between 1.6 and 2.9 is considered high leverage and LIs of 3.0 or more are classified as very high leverage.5 The remaining range, from 0.8 up to 1.6, corresponds to the average-leverage category used in the analysis that follows.
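
The McCann example reduces to a single subtraction. As a minimal sketch in Python (the function name wpa_for_event is hypothetical, and the win probabilities are the two figures quoted above rather than values looked up from a win-expectancy table):

```python
def wpa_for_event(wp_before, wp_after):
    """Return (pitcher_wpa, batter_wpa) for one event.

    Both arguments are the pitching team's win probability on a 0-1 scale,
    measured immediately before and immediately after the event.
    """
    delta = round(wp_after - wp_before, 3)  # rounded only to tidy float noise
    return delta, -delta

# June 6, 2008: Lidge retires McCann; Phillies go from 90% to 95.1%
pitcher_wpa, batter_wpa = wpa_for_event(0.900, 0.951)
print(pitcher_wpa, batter_wpa)  # 0.051 -0.051
```

Because the batter's share is simply the negative of the pitcher's, the two credits for any single event always sum to zero.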

WPA and LI, then, provide additional context about the situations in which a player is used. These measures are particularly relevant for a relief pitcher because of his team’s ability to control the situation in which he is brought into and taken out of a game. Using the data published by FanGraphs.com,6 we can examine the situational use and performance of our group of closers to better understand their relative contributions, and potentially explain the apparent inconsistency between Lidge’s performance as measured by normalized and traditional statistics.

For example, one possible way a pitcher could appear good at saving games even when he’s only average at keeping runners off base is if he came into games in primarily low-leverage situations. Tango’s Leverage Index can shed some light on whether this is the case. As shown in Table 4, the average LI when Lidge entered a game in 2008 was roughly the same as the average for most of his contemporaries, at least since 1993. The differences in first-batter LI are greater for earlier closers, who entered more games earlier than the ninth inning; in fact, half of Reed’s appearances in 1978 began in the seventh inning or earlier. Remember, too, that most of the closers in our sample did not spend the entire season in that role, so they made more appearances in non-save situations. For a more precise comparison, we can look at only those instances where a pitcher entered the game at the start of the ninth inning (which Lidge did in 2008 in all but five of his 72 appearances). In those cases, the differences in LI when they entered are even less pronounced. If anyone could be considered to have pitched in less stressful closing situations, it’s more likely to be Myers or Madson than Lidge in 2008.

TABLE 4: AVERAGE LI UPON ENTERING ALL GAMES VS. ENTERING AT START OF NINTH INNING

Player	Season	Avg LI entering game	Avg LI entering game at start of 9th
Reed	1978	1.07	1.63
McGraw	1980	1.84	1.93
Holland	1983	2.02	1.92
Bedrosian	1987	1.76	1.62
Williams	1993	1.62	1.67
Myers*	2007	1.56	1.53
Lidge	2008	1.61	1.63
Lidge	2009	1.52	1.57
Lidge	2010	1.70	1.68
Madson	2011	1.51	1.49

*relief appearances only
Source: Analysis of data from FanGraphs.com

Since Lidge apparently did not have an advantage stemming from the situations at the start of his appearances, we should next look to see if he had any situational advantages during the rest of his time on the mound. To do so, we examine the LI for every game event7 while a pitcher was on the mound and assign each to one of the four leverage categories (low to very high). By comparing the number of events in each category for each pitcher to the total events for that pitcher, we can determine the percentage of events in each of the leverage categories. As shown in Chart 1, the distribution of high- and low-leverage events for Lidge in 2008 was not significantly different from the distributions in the other seasons in the sample. Specifically, 55% of the events while Lidge was pitching in 2008 were either low- or average-leverage situations, and 20.4% were very high-leverage situations. In comparison, an average of 54.6% of all situations in the sample were low or average leverage, while 20.7% were very high-leverage situations. Here, too, there does not appear to be a significant situational advantage to Lidge’s performance.
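
The bucketing step described above can be sketched in a few lines of Python. The cutoffs are assumptions drawn from the ranges quoted earlier in this article: below 0.8 is low, 3.0 or more is very high, and the sketch treats everything from 1.6 up to 3.0 as high and everything from 0.8 up to 1.6 as average. The LI values in the sample list are made up for illustration and are not FanGraphs data.

```python
from collections import Counter

CATEGORIES = ("low", "average", "high", "very high")

def leverage_category(li):
    """Assign one Leverage Index value to a category (assumed cutoffs)."""
    if li < 0.8:
        return "low"
    if li < 1.6:
        return "average"
    if li < 3.0:
        return "high"
    return "very high"

def leverage_distribution(li_values):
    """Percentage of a pitcher's events that fall in each category."""
    counts = Counter(leverage_category(li) for li in li_values)
    total = len(li_values)
    return {cat: 100 * counts[cat] / total for cat in CATEGORIES}

# Hypothetical LI values for eight events (illustration only)
sample = [0.4, 0.7, 1.1, 1.5, 1.9, 2.4, 3.2, 4.8]
dist = leverage_distribution(sample)
print(dist)
```

Running the same function over every event for each pitcher-season would yield the percentages compared in the paragraph above.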

Since Lidge’s relative success cannot be significantly explained by the situations in which he found himself, we should next look to his individual contributions. We can gain some insight into a pitcher’s effectiveness by comparing those instances where he increased his team’s chances of winning (resulting in a positive WPA) to those where he decreased its chances of winning (resulting in a negative WPA). As shown in Table 5, as a group, our championship closers were generally successful, averaging roughly two positive contributions to each negative contribution. In fact, Lidge’s 2008 ratio of 1.81:1 is slightly below the group average. Like many of the other measures discussed previously, however, this comparison does not include any assessment of the relative importance of the contributions made, either positive or negative.

TABLE 5: POSITIVE AND NEGATIVE CONTRIBUTIONS BY PITCHER (PER-EVENT WPA)

Player	Season	Positive WPA events	Negative WPA events	Positive/negative WPA ratio
Reed	1978	319	126	2.53:1
McGraw	1980	266	96	2.77:1
Holland	1983	276	112	2.46:1
Bedrosian	1987	264	122	2.16:1
Williams	1993	176	126	1.40:1
Myers*	2007	156	82	1.90:1
Lidge	2008	205	113	1.81:1
Lidge	2009	174	134	1.30:1
Lidge	2010	135	76	1.78:1
Madson	2011	177	77	2.30:1
Total	(ex 2008)	1943	951	2.04:1

*relief appearances only
Source: Analysis of data from FanGraphs.com
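
The ratios in Table 5 follow directly from the event counts. A short Python sketch, using only the counts printed in the table (the dictionary layout and function name are illustrative choices):

```python
# (positive WPA events, negative WPA events) copied from Table 5
table5 = {
    ("Reed", 1978): (319, 126),
    ("McGraw", 1980): (266, 96),
    ("Holland", 1983): (276, 112),
    ("Bedrosian", 1987): (264, 122),
    ("Williams", 1993): (176, 126),
    ("Myers", 2007): (156, 82),
    ("Lidge", 2008): (205, 113),
    ("Lidge", 2009): (174, 134),
    ("Lidge", 2010): (135, 76),
    ("Madson", 2011): (177, 77),
}

def pos_neg_ratio(positive, negative):
    """Positive events per negative event."""
    return positive / negative

lidge_2008 = pos_neg_ratio(*table5[("Lidge", 2008)])

# Pool the other nine seasons, as in the table's "Total (ex 2008)" row
others = [counts for key, counts in table5.items() if key != ("Lidge", 2008)]
total_pos = sum(p for p, _ in others)
total_neg = sum(n for _, n in others)
group_ratio = pos_neg_ratio(total_pos, total_neg)

print(f"Lidge 2008: {lidge_2008:.2f}:1")      # Lidge 2008: 1.81:1
print(f"Rest of group: {group_ratio:.2f}:1")  # Rest of group: 2.04:1
```

Note that the group figure is a pooled ratio of totals, not an average of the nine individual ratios.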

To better assess performance in the context of the importance of each situation, we can add up the WPA assigned for each individual play instead of just counting the plays. Because events that are more important to the outcome of a game receive a proportionally higher WPA score, players with a higher WPA can be shown to have had a greater impact on their team’s success. Conversely, players who racked up larger negative WPA totals caused more damage to their team’s chances. Table 6 shows the results of this analysis: a quantitative measure of each pitcher’s relative contributions to his team’s chances of winning. Overall, even successful closers showed a fairly close balance between positive and negative events, with a few more positive WPA points than negative. While this finding initially seems inconsistent with the data in Table 5, the difference between the data sets can be attributed to the fact that squandering a team’s existing advantage generally has a greater impact on the team’s chances of winning than just maintaining that advantage. For example, on June 12, 2011, Ryan Madson entered a home game against the Cubs at the start of the ninth inning with a 4–3 lead. In that situation, the Phillies had a 92.3 percent chance to win. No matter how much he reduced the team’s chances along the way by letting batters reach base, the greatest net positive impact he could have for the whole inning was a 7.7% increase. Conversely, two months later, on August 19, Madson was pitching in Washington with a two-run lead, nobody out, and runners on first and second. He allowed a single to the next batter, driving in a run to tie the game and reducing the Phillies’ chances of winning by 19.4%.

TABLE 6: TOTAL POSITIVE, NEGATIVE, AND NET AVERAGE WPA

Player	Season	Total positive WPA	Total negative WPA	Net WPA per 100 events
Reed	1978	7.633	-5.914	0.539
McGraw	1980	10.656	-6.487	1.567
Holland	1983	11.959	-9.827	0.772
Bedrosian	1987	14.410	-10.935	1.316
Williams	1993	10.925	-11.117	-0.109
Myers*	2007	7.137	-6.691	0.286
Lidge	2008	10.819	-5.452	2.618
Lidge	2009	8.232	-12.774	-2.610
Lidge	2010	7.307	-5.480	1.353
Madson	2011	7.987	-5.654	1.318
Average	(ex 2008)	9.583	-8.320	0.585

*relief appearances only
Source: Analysis of data from FanGraphs.com
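
For readers who want to reproduce the last column of Table 6, the Python sketch below computes a net WPA per 100 events from the WPA totals in Table 6 and the event counts in Table 5. The choice of denominator is an inference, not something the text states: for the rows shown here, the published figures appear to be matched when the net WPA is divided by the number of positive-WPA events, while dividing by all counted events (positive plus negative) gives smaller values. Both are shown so the reader can compare.

```python
# label: (total positive WPA, total negative WPA,
#         positive WPA events, negative WPA events)
# WPA totals are from Table 6; event counts are from Table 5.
rows = {
    "Reed 1978": (7.633, -5.914, 319, 126),
    "Williams 1993": (10.925, -11.117, 176, 126),
    "Lidge 2008": (10.819, -5.452, 205, 113),
    "Lidge 2009": (8.232, -12.774, 174, 134),
}

def net_per_100(pos_wpa, neg_wpa, n_events):
    """Net WPA scaled to a block of 100 events."""
    return 100 * (pos_wpa + neg_wpa) / n_events

# Denominator = positive-WPA events only (appears to match Table 6)
per_100_positive = {
    label: round(net_per_100(p, n, n_pos), 3)
    for label, (p, n, n_pos, n_neg) in rows.items()
}

# Denominator = positive plus negative events (alternative reading)
per_100_all = {
    label: round(net_per_100(p, n, n_pos + n_neg), 3)
    for label, (p, n, n_pos, n_neg) in rows.items()
}

for label in rows:
    print(label, per_100_positive[label], per_100_all[label])
```

Under either denominator, Lidge's 2008 season remains well ahead of the other three seasons in this sketch, so the ranking discussed in the text does not depend on which reading is correct.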

These data also reveal that Lidge’s 2008 performance differed significantly from the group’s in that his positive contributions were twice as large as his negative contributions. To control for differences in the number of events for which each pitcher was responsible, we can also average the net impact of the player’s WPA score and normalize it for a set of 100 events, in effect estimating the percentage by which the pitcher increased his team’s chances of winning for each event while he was on the mound. By that measure, we see that Lidge increased the 2008 Phillies’ chances of winning by an average of 2.6 percent for every event (at-bat, stolen base, etc.) while he pitched, while the rest of the group contributed roughly 0.6 percent. It would seem on the surface, then, that Lidge’s success was due to his avoidance of negative events (i.e., retiring batters more often than letting them reach base or advance). However, that assumption is inconsistent with the data in Table 5, which show that Lidge was slightly below average in his ratio of positive to negative contributions. To resolve this discrepancy, we’ll have to dig a little deeper.

One likely explanation for the apparent disconnect between the number of negative events for a pitcher and their value is an unequal distribution of higher-than-average WPA events. Because we’re particularly interested in the role of closers, we should focus on appearances in the ninth inning or later. This subset of events is also the group most likely to include high-impact events. Within our sample, events in the ninth inning or later average a WPA of 0.006, but with a standard deviation of 0.099. Assuming a normal distribution, roughly two-thirds of the events in this group should fall between -0.093 and 0.105. As a result, we can examine very high-impact events by looking at events outside of this range.
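
The one-standard-deviation band described above can be checked with Python's standard library. This is a sketch under the same normality assumption the text makes; the function name is_high_impact is hypothetical, and the strict inequalities at the band edges are an arbitrary choice.

```python
from statistics import NormalDist

MEAN_WPA = 0.006  # mean WPA for events in the ninth inning or later
SD_WPA = 0.099    # standard deviation of those events

lower = round(MEAN_WPA - SD_WPA, 3)  # -0.093
upper = round(MEAN_WPA + SD_WPA, 3)  # 0.105

# Share of a normal distribution lying within one standard deviation
dist = NormalDist(MEAN_WPA, SD_WPA)
share_inside = dist.cdf(upper) - dist.cdf(lower)

def is_high_impact(wpa):
    """True when an event's WPA falls outside the one-sigma band."""
    return wpa < lower or wpa > upper

print(lower, upper)
print(f"{share_inside:.1%} of events expected inside the band")
```

The share inside the band comes out at a little over 68 percent, which is the "roughly two-thirds" cited in the text; actual WPA values need not be normally distributed, so the real share may differ.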

Using these criteria, we can analyze the percentage of events for each pitcher that had a high impact, and their cumulative effect on team winning percentage. As shown in Table 7, Lidge’s distribution of high-impact events was, in fact, uneven.

TABLE 7: PERCENTAGE OF HIGH-IMPACT EVENTS

Player	Season	Total WPA for events >0.11	Pct.	Total WPA for events <-0.11	Pct.	Total events 9th inning or later
Reed	1978	1.329	6.40%	-3.131	9.90%	141
McGraw	1980	2.542	8.50%	-4.003	9.50%	201
Holland	1983	2.514	7.30%	-6.580	10.30%	233
Bedrosian	1987	4.723	12.50%	-6.372	9.40%	256
Williams	1993	4.733	8.90%	-6.311	11.10%	280
Myers	2007	1.801	5.80%	-3.940	8.50%	189
Lidge	2008	4.388	8.10%	-1.764	3.50%	310
Lidge	2009	3.280	6.50%	-8.938	10.60%	292
Lidge	2010	3.574	10.90%	-3.648	7.50%	201
Madson	2011	1.824	6.00%	-2.823	6.00%	216
average	(ex 2008)	2.924	8.20%	-5.083	9.30%

First, Lidge’s high-impact events were overwhelmingly positive. Where the group averaged roughly 8 percent high-impact positive events and roughly 9 percent high-impact negative events, Lidge recorded more than two high-impact positive events for every high-impact negative event. In other words, where the group as a whole mostly failed at the most critical points of the game, Lidge was overwhelmingly successful.

Second, the cumulative WPA of Lidge’s high-impact positive events was higher than the average, but the total WPA for his high-impact failures was far and away the lowest in the group. By this measure, too, we can see that whenever Lidge had a major impact on his team’s chances of winning, it was overwhelmingly positive. Compare his performance with that of the others who had higher-than-average positive WPA totals in high-impact situations: they also have even larger total negative WPAs from negative events.

Alternatively, we can look at Lidge’s performance in only the events with the highest LI (> 3.0), regardless of inning. As shown in Table 8, in the most critical game situations, Lidge made positive contributions in roughly the same percentage of events as the others in the group. However, the effect of those contributions was significantly more positive than the group’s as a whole. Specifically, the rest of the group averaged an 11 percent improvement in their team’s chances of winning every time they recorded an out, but a 16 percent decrease in those chances when they did not. Lidge, in contrast, averaged a 14 percent improvement in winning percentage for each out, but only a nine percent decrease in those chances in non-out events. Thus, while he made roughly the same percentage of mistakes, his mistakes generally did less damage, and his outs were more valuable.

TABLE 8: MOST CRITICAL GAME SITUATIONS

Player	Season	Events with LI > 3 and Positive WPA	Events with LI > 3 and Negative WPA	Avg Positive WPA for Events with LI > 3	Avg Negative WPA for Events with LI > 3
Reed	1978	20	10	10.52%	-20.92%
McGraw	1980	46	19	9.45%	-15.05%
Holland	1983	56	26	10.40%	-17.53%
Bedrosian	1987	60	27	11.77%	-15.38%
Williams	1993	51	39	13.47%	-14.90%
Myers	2007	28	20	11.50%	-18.42%
Lidge	2008	43	22	13.54%	-8.95%
Lidge	2009	40	41	11.95%	-19.85%
Lidge	2010	31	22	13.79%	-12.83%
Madson	2011	46	18	9.44%	-11.27%
Average	(ex 2008)	42	24.7	11.33%	-16.28%
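
The claim that Lidge made positive contributions in roughly the same percentage of very high-leverage events as the rest of the group can be checked from the two count columns of Table 8. A minimal Python sketch, pooling the other nine seasons (the pooling is a choice made here for illustration; the article does not say how its comparison was computed):

```python
# (events with positive WPA, events with negative WPA) at LI > 3, from Table 8
table8 = {
    ("Reed", 1978): (20, 10),
    ("McGraw", 1980): (46, 19),
    ("Holland", 1983): (56, 26),
    ("Bedrosian", 1987): (60, 27),
    ("Williams", 1993): (51, 39),
    ("Myers", 2007): (28, 20),
    ("Lidge", 2008): (43, 22),
    ("Lidge", 2009): (40, 41),
    ("Lidge", 2010): (31, 22),
    ("Madson", 2011): (46, 18),
}

def positive_share(positive, negative):
    """Percentage of very high-leverage events with a positive WPA."""
    return 100 * positive / (positive + negative)

lidge_share = positive_share(*table8[("Lidge", 2008)])

# Pool every other season in the table
rest = [counts for key, counts in table8.items() if key != ("Lidge", 2008)]
rest_pos = sum(p for p, _ in rest)
rest_neg = sum(n for _, n in rest)
rest_share = positive_share(rest_pos, rest_neg)

print(f"Lidge 2008: {lidge_share:.1f}%")    # Lidge 2008: 66.2%
print(f"Rest of group: {rest_share:.1f}%")  # Rest of group: 63.0%
```

The two shares are within a few points of each other, which is consistent with the text's point that the difference lies in the size of the swings rather than in how often they went Lidge's way.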

Some specific game examples can further illustrate the point. In 2008, Lidge’s biggest mistake was during the August 27 game discussed previously. By allowing the Mets to score in a tie game, he reduced the Phillies’ chances of winning by 26.8 percent. The rest of the group made mistakes with a greater negative impact on 43 occasions. The most damaging were:

September 5, 2007: With the bases loaded, two out, and a two-run lead, the Phillies have an 82% chance of winning. Brett Myers allows a three-run double to Matt Diaz, resulting in a 9–8 Atlanta win.
September 5, 1983: Ninth inning, one out, runners on the corners, and a two-run lead at Shea, the Phillies are nearly 81 percent favorites to win. Al Holland allows a home run to George Foster, and the Mets win.
June 5, 2009: At Los Angeles in the bottom of the ninth with two outs, a one-run lead, and the bases loaded, the Phillies are better than 74% favorites. Lidge allows a two-run double to Andre Ethier to drop the game to the Dodgers.

CONCLUSIONS

While Lidge’s team-record 41 saves without blowing an opportunity was a rare event, traditional statistics are generally not up to the task of revealing how much of this success was the result of sustained excellence and how much might be attributable to other factors. More recently adopted metrics that measure the criticality of each game situation and an individual’s contribution to his team’s chances of winning can be more illustrative. These metrics reveal that, compared to other Phillies championship closers, Lidge did not gain significant advantages that weren’t related to his own performance. Instead, his accomplishment can be attributed to his own consistent success, mainly in avoiding the big mistakes that result in blown saves. As a result, his 2008 season should be recognized as a truly outstanding individual achievement.

JIM SWEETMAN is founder and host of the Broad & Pattison Phillies website. He resides in Virginia and is Assistant Director at the U.S. Government Accountability Office.

Notes

1 Eric Gagne’s 55-for-55 season for Los Angeles in 2003 was the first. Detroit’s Jose Valverde subsequently added his name to this elite list, successfully nailing down all of his 49 save opportunities in 2011. http://www.foxnews.com/sports/2011/10/28/tigers-exercise-2012-option-on-valverde/

2 No pitcher recorded at least half of the Phillies’ saves in either the 1976 or 1977 division-winning seasons.

3 For brief discussions of these efforts see Dave Studeman’s articles “The One About Win Probability” at http://www.hardballtimes.com/main/article/the-one-about-win-probability/ and “WPA in the USA,” Hardball Times Baseball Annual 2007: 122–128.

4 The exact procedures for calculating the Leverage Index are outside of the scope of this paper, but can be found in Tango’s article “Crucial Situations” at http://www.hardballtimes.com/main/article/crucial-situations/.

5 See Tango’s tables of LI by situation at http://www.insidethebook.com/li.shtml.

6 http://www.fangraphs.com/scoreboard.aspx.

7 Game events are plate appearances plus any other event that can change a game’s situation without a change in batter, such as stolen bases, caught stealing, balks, etc.
