Park Neutralized Stats

One thing that gives people like me headaches is having to deal with the fact that all ballparks are different. Don’t get me wrong, the uniqueness of the sport is part of what makes baseball so special. But it’s an imperfect science when attempting to compare players from different teams.

The other day, my friend and baseball statistic guru Ryan Spaeder was asked which park was the toughest on hitters, and this was his response:

I thought it was perfect. Obviously, Coors Field is easily the most “hitter friendly” ballpark, but it is almost a no-win situation for a hitter. The Rockies have zero Hall of Famers in their 24 years of existence and their two best candidates (Larry Walker and Todd Helton) are unlikely to get in any time soon. The argument is their stats are inflated due to Coors Field, although it is impossible to say by exactly how much. We have park neutralized metrics such as OPS+ and wRC+ which include a park adjustment, and while I use them regularly, I admit they aren’t perfect.

The common practice in adjusting for ballpark is to take the ballpark factor, which estimates how a park influences run scoring compared to league average, and apply it to each hitter. However, all players receive the same adjustment, whether they bat left-handed or right-handed, and whether they are fly ball, ground ball, pull or spray hitters. The assumption that all types of hitters should be treated the same is what I’m attempting to correct.

So while I know I’m not completely fixing the problem here, I’m offering an alternative. What I have are park adjusted career totals based off home and road splits. My first attempt was to just multiply road stats by two, but that would completely eliminate the player from playing ANY games in their home park and turn ballpark advantages into disadvantages and vice versa.

Instead, I decided to include home stats, but only at the same rate that a player would visit other parks. For example, Babe Ruth played in an era when his league had eight teams. That means that if he played in all ballparks an equal amount of time, he’d play in his home ballpark 12.5% (1/8) of his games. The other 87.5% would come from his road stats. A player in 2016 would play in their home park 6.67% (1/15) of the time.

Example

The formula is simple. For Babe Ruth’s home runs, we take his home HR/PA (347 / 5150 = .0674) and divide that by the number of teams in his league (.0674 / 8 = .00842). Next we take his road HR/PA (367 / 5473 = .0671) and multiply that by (7 / 8 = .875), which is the number of opponents in the league divided by the number of teams in the league (.0671 * .875 = .0587). Next, we add those two numbers together to get his new HR rate (.00842 + .0587 = .06712). To get his final career HR total, we multiply his rate by his career plate appearances (.06712 * 10623 = 713 HR). Surprisingly, he actually loses a HR, even though he played most of his career with a short right field at Yankee Stadium. We’ll see, however, that many players have a bigger difference in their adjusted career totals.
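The steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this section; the function and parameter names are made up for the example and are not taken from the spreadsheet.

```python
def neutralized_total(home_events, home_pa, road_events, road_pa, league_teams):
    """Estimate a park-neutralized career total for one batting event.

    The home rate gets a weight of 1/league_teams and the road rate gets
    the remaining (league_teams - 1)/league_teams, as described above.
    """
    home_rate = home_events / home_pa
    road_rate = road_events / road_pa
    home_weight = 1 / league_teams
    neutral_rate = home_weight * home_rate + (1 - home_weight) * road_rate
    # Scale the blended rate back up to the player's career plate appearances.
    return neutral_rate * (home_pa + road_pa)


# Babe Ruth's home run splits from the example above, in an eight-team league.
ruth_hr = neutralized_total(347, 5150, 367, 5473, league_teams=8)
print(round(ruth_hr))
```

The same function works for any counting stat (singles, doubles, walks) by swapping in that event's home and road totals.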

500 HR
After neutralizing the stats, there are no new members of the 500 HR club, although we lose six players. The biggest drop is by Mel Ott, who loses over 20% of his career total. Ott played his entire career at the Polo Grounds, where the right field foul pole was just 258 feet from home plate. During his career, left-handed batters hit about 80% more HR there than they did at the other National League parks.
What is interesting about Mel Ott and The Polo Grounds is that while it allowed Ott to hit far more home runs, it came at the expense of other hits. So we take away 104 HR, but also credit him 83 singles, 87 doubles, and 21 triples. Overall, his production was increased at home, but not by as much as his home run total would indicate.

The second biggest drop is a bit surprising in Frank Thomas, who lost 90 home runs. Comiskey Park does favor HR hitters, but it’s far from the most drastic ballpark. Still, over his career, Thomas hit a HR in 6.2% of his plate appearances at home and 4.1% on the road.

David Ortiz is not only the biggest gainer on the list, but also among all players. He went from 525 HR to 589, and we’ll see that Fenway Park wreaks havoc on these neutralized stats.
HR Change

The common theme with both of these lists is that those who saw their totals increase all played in parks that suppressed home runs, while those who saw their totals decrease played in parks that favored them.

Now let’s look at what is probably the second most popular career batting list, the “3,000 Hit Club”. (Note: Since we only have home/away splits going back to 1913, any player who began their career before that season is not included. Thus, no Cobb, Wagner or Speaker.)
3000 Hits
Just as with the “500 HR Club”, the “3000 Hit Club” only lost players. While some players see an increase in their production and some see a decrease from these neutralized stats, players as a whole will lose some production. This is due to home field advantage: players generally hit better at home, and the neutralized stats are weighted mostly toward road stats. This may partly explain why there are no new members of either club.
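A rough way to see this is to compare the neutralized rate with the actual rate. The symbols here are introduced only for illustration: h is a player's home rate, r is the road rate, and n is the number of teams in the league. The derivation also assumes that plate appearances split about evenly between home and road, which is an approximation.

```latex
\begin{align*}
\rho_{\text{neutral}} &= \frac{1}{n}\,h + \frac{n-1}{n}\,r \\
\rho_{\text{actual}}  &\approx \frac{1}{2}\,h + \frac{1}{2}\,r \\
\rho_{\text{neutral}} - \rho_{\text{actual}}
  &\approx \left(\frac{1}{n} - \frac{1}{2}\right)(h - r)
   = -\frac{n-2}{2n}\,(h - r)
\end{align*}
```

For any league with more than two teams, the difference is negative whenever the home rate is higher than the road rate, so a player who hits better at home loses production after neutralizing. In an eight-team league the factor is -3/8 of the home/road gap.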

Already, we see some of Fenway Park’s impact with David Ortiz’s increase in his home run total and both Yaz and Boggs seeing big decreases in their hit totals. Maybe the most telling is the list of players that saw the biggest decrease in their doubles.
Nine of the top 10 played most or all of their career at Fenway Park. You just don’t see this type of thing in other sports.

Players with Big Increases in Production

A fun thought experiment is to imagine how their careers would have turned out had Joe DiMaggio and Ted Williams been traded for each other, with DiMaggio taking advantage of the Green Monster and Williams facing a short RF porch. Instead, DiMaggio had to contend with 451 feet to left-center field at Yankee Stadium. We estimate that in a neutral park, he would have hit 45 more home runs and increased his overall production by 31 points of OPS.

Rick Wilkins may have been the player farthest from your mind when you started reading, yet here he is. What is most amazing about him is just how drastic his home/road splits were, given that he spent the majority of his career at a hitter’s park in Wrigley Field. For his career, Wilkins hit .216/.298/.350 at home and .272/.366/.471 on the road. Who knows? Maybe he just got an incredible night’s sleep in hotel beds.

It’s no secret that AT&T Park favors pitchers, and what makes Buster Posey so special is that his raw stats are impressive, even before a park adjustment. But if we estimate what they would look like at more favorable parks, it becomes even more obvious that he’s on an early path to the Hall of Fame.

In the Willie Davis comment in his New Historical Baseball Abstract, Bill James describes a method for converting a player’s stats from one run environment to another. This has come to be known as the “Willie Davis Method”; it is currently used on this site and is the basis for Baseball-Reference’s neutralized stats (with some additional adjustments). The problem with this method is that it treats all batting events the same, adjusting each by the same proportion. As we have seen with Fenway Park, parks do not affect all events equally. Bill James introduced his method in Davis’s player comment because Davis spent much of his career at a horrible hitter’s park. As we can see from this neutralization method, Davis’s stats improve, and his +29 triples and +210 total bases are the most of any player in history.

As if Mike Piazza didn’t already have the most impressive statistics for any catcher in baseball history, they get even better after they are neutralized. In fact, every single home ballpark during Piazza’s career had a park factor below 1.

Players with Big Decreases in Production

Chuck Klein spent much of his career in the Baker Bowl, which was 280 ft to right field and 300 ft to right-center field. So it’s easy to see why he hit 63% of his home runs at home. When we neutralize his stats, his overall numbers are much less impressive, especially given the hitter’s era in which he played.

As we saw earlier, Wade Boggs takes a big hit, with his OPS dropping 63 points. This is similar to other Red Sox players, such as Bobby Doerr (-.084), Rico Petrocelli (-.069), Dom DiMaggio (-.059), Jim Rice (-.053) and Carl Yastrzemski (-.051).

Barry Larkin is an interesting case. He has a 21-point drop in OPS while his hit total increases by 29. The biggest change was losing 130 walks, decreasing his walk percentage from 10.4% to 8.9%. In fact, Riverfront Stadium increased walks for right-handed batters in every season of Larkin’s career. This is just a reminder that park factors are not limited to balls in play.

As mentioned above, playing at Coors Field can be a no-win situation. Obviously, Larry Walker’s stats have received a boost. But if we neutralize them, he still compares well to these two Hall of Famers.
Throw in 9 Gold Gloves and +94 fielding runs, and this should quell any fears you may have about his Coors-inflated stats.

CarGo loses 102 points in OPS, which is the most of anyone with at least 2000 career plate appearances. This may indicate that his style of play is more affected by Coors Field. It’s also possible that he has a tougher time adjusting to the different approaches opposing pitchers take on the road, as Eno Sarris suggests. This may point to a flaw in the neutralization method, especially for extreme ballparks like Coors Field.

Change in Type of Production

Hank Aaron played his career at two parks, Milwaukee County Stadium and Atlanta-Fulton County Stadium. Milwaukee favored pitchers in terms of the long ball, but Atlanta was known as “The Launching Pad” and had a big effect on home runs. Naturally, he sees a drop in his home run total, but also an increase in singles, doubles and triples. The overall level of production didn’t change much after neutralization, only the form it took.

Jay Buhner’s neutralized OPS is nearly identical to his actual OPS, but his peripherals see some changes. His singles and home runs increase, but his singles, doubles and walks decrease. This is just another example that a ballpark can change how a game is played while having very little impact on the run environment.

Flaws in the System

The unbalanced schedule and interleague play mean that not all teams visit the same ballpark an equal number of times. This means that Rockies players will visit pitcher’s parks such as Petco, AT&T and Dodger Stadium more often than teams in other divisions. It’s possible this can be corrected by equalizing how often a team sees each road park. However, this would complicate the process and I’m not completely comfortable with it.
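One possible way to equalize road parks, sketched here only as an assumption about how such a correction might look, is to compute a rate for each road park separately and then average those rates with equal weight, rather than pooling all road plate appearances together. The park names and numbers below are invented for illustration.

```python
def pooled_rate(park_splits):
    """Road rate as usually computed: total events over total plate appearances."""
    events = sum(e for e, _ in park_splits.values())
    pa = sum(p for _, p in park_splits.values())
    return events / pa


def equal_park_rate(park_splits):
    """Road rate with every visited park counted equally, regardless of visits."""
    rates = [e / p for e, p in park_splits.values() if p > 0]
    return sum(rates) / len(rates)


# Hypothetical road splits: park -> (home runs, plate appearances).
road_splits = {
    "Park A": (10, 200),  # visited often (for example, a division rival)
    "Park B": (2, 100),
    "Park C": (3, 50),    # visited rarely
}

print(round(pooled_rate(road_splits), 4))
print(round(equal_park_rate(road_splits), 4))
```

The drawback is that parks with very few plate appearances get as much weight as parks with many, so the equal-weight rate is noisier. That is one reason to be cautious about this correction.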

As mentioned in the Carlos Gonzalez comment, it is possible that some players’ away stats are affected due to different approaches taken by pitchers based on the ballpark. I suspect this is only the case in the very extreme and unique parks. It is something to keep an eye on.

This method only uses career statistics, which contain large samples when dealing with home/road splits. A single season may not offer a big enough sample to completely trust, especially with part-time players.

Conclusion

This method is admittedly imperfect, but it does fix the problem with applying the same run adjustment to all players. If anything, it is an alternative to other methods of neutralizing ballparks. I’m open to any suggestions on improving this method and I may publish a pitching neutralization shortly.

For those interested, I have included a spreadsheet with neutralized stats that can be viewed here. It includes all players with at least 1000 PA who began their careers after 1912.

Gauging the First Half

Instead of making a post about mid-season awards, which we are sure to see a few of during the All-Star break, I figured I’d try something different. Let’s take a look at how individual plays affect a team’s postseason probabilities.

Top Plays

Earlier in the season, I added a page that shows the top plays of the season in terms of win probability added. While placing a value on the importance of the individual game is interesting, we can take it one step further and look at how each play impacts a team’s playoff probability. A big hit in a game between two teams that are not in contention will have little to no effect. But a walk-off home run in a game between teams tied for the lead in a division will have a much greater impact. So let’s take a look at the biggest plays of the first half.

This list is sorted by championship win probability added (cWPA). Just as in-game win probability added shows the change in win probability in terms of percentage points, cWPA shows the change in World Series win probability. Your first thought upon seeing the cWPA values is probably how small they are. In fact, every play this season has a cWPA of less than 1 percentage point. This shows just how little impact even the most important play of the first three months of the season has on a team’s chances of winning the World Series. Another way to look at these numbers is to multiply them by 8 to see the change in probability of being one of the final 8 playoff teams.
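The multiply-by-8 rule of thumb above is simple enough to write down directly. This is only a sketch of that rule as stated, with a made-up function name, and it is not how the site computes playoff probabilities.

```python
def playoff_probability_change(cwpa_points, final_teams=8):
    """Scale a cWPA value (in percentage points) by the number of teams
    remaining, following the rule of thumb described above."""
    return cwpa_points * final_teams


# The top play of the first half has a cWPA of 0.71 percentage points.
print(round(playoff_probability_change(0.71), 2))
```

So a play worth 0.71 percentage points of World Series probability corresponds to roughly 5.7 percentage points of probability of reaching the final eight under this approximation.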

1) Leonys Martin’s walk-off HR (0.71 cWPA)


With two outs and a runner on 2nd in the bottom of the ninth and his team down by a run, Martin fell behind in the count 1-2 on three Ryan Madson changeups. On the fourth pitch, Madson went changeup again and Martin deposited it into the right field bleachers. The walk-off increased the Mariners’ in-game win probability by 86 percentage points, but more importantly, it increased the Mariners’ probability of winning the World Series by 0.71 percentage points.
On a side note, this play is also 40th on our list, as it decreased the A’s World Series win probability by 0.38 percentage points. The change in percentage points is bigger for the Mariners since the game was of more importance, as they were ahead in the division by 1.5 games, while Oakland was 7 games back.

2) Salvador Perez’s go-ahead 2-R HR (0.63 cWPA)


In the bottom of the 8th inning with two outs, Bryan Shaw was looking to send the game to the 9th with his team up by a run. He was facing Salvador Perez, who was 1 for 12 in his career vs Shaw. But on the 1st pitch, Perez gave the fans in left field a souvenir and his team the lead. This play decreased the Indians’ World Series win probability by 0.63 percentage points. You can actually see the moment when Bryan Shaw realizes that pitcher vs batter stats are too small a sample to trust.
This play is also 6th on our list, as it increased the Royals’ World Series win probability by 0.59 percentage points.

3) Ian Desmond’s go-ahead 2-R HR (0.62 cWPA)


The next two plays on this list are from the same crazy game in Oakland. The A’s were one out away from victory with Ryan Madson on the mound, when Ian Desmond gave Texas the lead with a 2-run HR off a changeup. This play increased the Rangers’ chances of winning the World Series by 0.62 percentage points. As Desmond rounded the bases, Rangers announcer Tom Grieve noted that Madson threw one too many changeups, which seems to be a recurring theme here.
This play is also 24th on this list, as it decreased Oakland’s World Series win probability by 0.41 percentage points.

4) Khris Davis’s walk-off grand slam (0.61 cWPA)


The next half-inning, Texas closer Shawn Tolleson intentionally walked Josh Reddick to load the bases with one out. Next, Danny Valencia flew out to shallow right, which brought up Khris Davis, who had already hit two home runs in the game. Davis then ended the game on a walk-off grand slam, which left Adrian Beltre wondering “what the hell just happened”.
From the A’s perspective, this play is 27th on the list, as it increased their World Series win probability by 0.40 percentage points.

5) Yasiel Puig’s walk-off single and error by Michael Taylor (0.60 cWPA)


Puig’s single would have put runners on 1st and 2nd with one out in the inning, but it was Taylor’s gaffe (and Puig’s hustle) that allowed both runners to score. This was the culmination of Michael Taylor’s horrendous game, where he also struck out in all five of his at bats. If you look closely enough, you can see him calculating the cWPA in his head.

Here are the rest of our top 25 plays of the first half:

| Rk | Date | Play | Team | VS | cWPA |
|---|---|---|---|---|---|
| 6 | 6/14 | Salvador Perez HR | KC | CLE | +0.59 |
| 7 | 5/21 | Matt Wieters HR | BAL | LAA | +0.58 |
| 8 | 7/08 | Luis Valbuena HR | HOU | OAK | +0.56 |
| 9 | 6/22 | Yasiel Puig little league HR | WAS | LAD | -0.56 |
| 10 | 6/12 | Jayson Werth Single | WAS | PHI | +0.56 |
| 11 | 5/14 | Albert Pujols HR | SEA | LAA | -0.52 |
| 12 | 6/05 | Matt Wieters 1B | BAL | NYY | +0.47 |
| 13 | 7/07 | Troy Tulowitzki 1B | TOR | DET | +0.47 |
| 14 | 4/12 | Geovany Soto HR | OAK | LAA | +0.45 |
| 15 | 5/20 | Melvin Upton HR | LAD | SD | -0.44 |
| 16 | 5/10 | Ryan Rua HR | TEX | CHW | +0.43 |
| 17 | 4/12 | Geovany Soto HR | LAA | OAK | -0.43 |
| 18 | 4/08 | Starling Marte grand slam | PIT | CIN | +0.42 |
| 19 | 4/08 | Starling Marte grand slam | CIN | PIT | -0.42 |
| 20 | 6/24 | Adam Lind HR | STL | SEA | -0.42 |
| 21 | 5/21 | Jayson Werth GIDP | WAS | MIA | -0.42 |
| 22 | 5/28 | Drew Butera 2B | CHW | KC | -0.41 |
| 23 | 6/23 | Adonis Garcia HR | NYM | ATL | -0.41 |
| 24 | 5/17 | Ian Desmond HR | OAK | TEX | -0.41 |
| 25 | 6/11 | Prince Fielder HR | SEA | TEX | -0.41 |

Most Critical Moments

We can measure the importance that a particular play has on a game by using leverage index (LI), but this is limited to the situation in the game and it treats all games the same. Just as with WPA and cWPA above, we can take this one step further and measure the importance of the game by including the game’s championship leverage index (CLI). This number shows the importance of the game for each team, where the average game equals 1. If a win or a loss has a significant effect on the team’s playoff probability, the CLI will be greater than 1. By multiplying the LI and CLI, we can measure the importance that a play has on a team’s playoff probability. We’ll call this number pCLI (for championship leverage index by play). This number can be read as “how many times more important this situation was compared to the average play on opening day”.
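As a quick illustration of the definition above, pCLI is just the product of the two leverage values. The function name and the input values below are hypothetical and are not taken from any play in the tables.

```python
def play_championship_leverage(leverage_index, championship_leverage_index):
    """pCLI: in-game leverage (LI) scaled by the importance of the game (CLI)."""
    return leverage_index * championship_leverage_index


# Hypothetical example: a very tense late-inning spot (LI = 10.0)
# in a game that matters 50% more than average (CLI = 1.5).
print(play_championship_leverage(10.0, 1.5))
```

An average play (LI = 1) in an average game (CLI = 1) comes out to exactly 1, which is what makes the "how many times more important" reading work.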

Below are the top 20 most critical situations of the first half. As with the list above, some plays appear twice, since they were important to both teams’ playoff chances.

| Rk | Date | Team | Inning | Outs | Runners | Score | pCLI | Outcome |
|---|---|---|---|---|---|---|---|---|
| 1 | 5/17 | TEX | Bot 9 | 2 | Loaded | 5 – 4 | 15.4 | Khris Davis grand slam |
| 2 | 6/12 | WAS | Bot 9 | 2 | Loaded | 4 – 3 | 14.2 | Jayson Werth 1B |
| 3 | 6/05 | WAS | Bot 9 | 2 | Loaded | 10 – 9 | 14.1 | Ivan de Jesus fly out |
| 4 | 6/24 | TOR | Top 9 | 2 | Loaded | 2 – 3 | 12.7 | Michael Saunders pop out |
| 5 | 6/11 | SEA | Bot 11 | 2 | 1st & 2nd | 2 – 1 | 12.5 | Kyle Seager fly out |
| 6 | 5/17 | TEX | Bot 9 | 1 | Loaded | 5 – 4 | 12.2 | Danny Valencia fly out |
| 7 | 7/07 | TOR | Bot 8 | 2 | Loaded | 4 – 3 | 12.1 | Troy Tulowitzki 1B |
| 8 | 6/11 | TEX | Bot 11 | 2 | 1st & 2nd | 2 – 1 | 12.0 | Kyle Seager fly out |
| 9 | 6/11 | SEA | Bot 10 | 2 | Loaded | 1 – 1 | 11.8 | Ketel Marte fly out |
| 10 | 5/06 | BOS | Top 9 | 2 | Loaded | 2 – 3 | 11.7 | Hanley Ramirez strikeout |
| 11 | 6/10 | SFN | Bot 9 | 2 | 1st & 2nd | 3 – 2 | 11.5 | Brandon Crawford strikeout |
| 12 | 7/06 | HOU | Top 9 | 2 | Loaded | 8 – 9 | 11.5 | Dae-Ho Lee strikeout |
| 13 | 6/10 | LAN | Bot 9 | 2 | 1st & 2nd | 3 – 2 | 11.5 | Brandon Crawford strikeout |
| 14 | 6/11 | TEX | Bot 10 | 2 | Loaded | 1 – 1 | 11.3 | Ketel Marte fly out |
| 15 | 6/05 | WAS | Bot 9 | 1 | Loaded | 10 – 9 | 11.1 | Zack Cozart strikeout |
| 16 | 6/05 | BAL | Bot 8 | 2 | Loaded | 1 – 0 | 10.9 | Matt Wieters 1B |
| 17 | 6/05 | BOS | Bot 9 | 2 | 1st & 2nd | 5 – 4 | 10.9 | Marco Hernandez strikeout |
| 18 | 6/30 | NYN | Top 9 | 2 | Loaded | 3 – 4 | 10.8 | Javier Baez pop out |
| 19 | 6/18 | TOR | Top 9 | 1 | Loaded | 2 – 4 | 10.6 | Josh Donaldson GIDP |
| 20 | 6/21 | BAL | Bot 8 | 2 | Loaded | 7 – 6 | 10.6 | Adam Jones ground out |

If we revisit these lists at the end of the season, there is a good chance they will be dominated by second-half plays. The reason is that, just as the most important plays happen in the later innings of a game, the most important games occur near the end of the season. However, 2016 may be different, since 5 of the 6 division leaders currently have at least a 5-game lead, which may lead to less enjoyable divisional races. For the sake of exciting plays and games, let’s hope some of these leads shrink.