Pittsburgh Pirates: Run Differential and Pythagorean Record

Run differential is a great way to see how well a team scores runs compared to preventing them over a given stretch of time. But how well does that metric hold up as a predictor?

The Pittsburgh Pirates got off to a hot start before cooling down and dropping eight straight games, and they have played like a .500 club since. During that cool-down period the team has seen its run differential drop; on the year they’ve been beaten by scores of 10-0, 12-4, 11-2, 14-1, and 17-4, while their biggest win was 5-0 in game two of the season.

Entering Friday night’s game, the Pittsburgh Pirates have scored 124 runs and allowed 165. Their -41 run differential ranks fourth worst in baseball, and the only National League team with a worse run differential is the Miami Marlins:

MLB Run Differential

American League
West	RD	Central	RD	East	RD
Houston Astros	49	Minnesota Twins	44	Tampa Bay Rays	59
Texas Rangers	17	Cleveland Indians	-9	New York Yankees	34
Seattle Mariners	12	Kansas City Royals	-12	Boston Red Sox	1
Los Angeles Angels	-7	Chicago White Sox	-33	Toronto Blue Jays	-26
Oakland Athletics	-7	Detroit Tigers	-42	Baltimore Orioles	-69

National League
West	RD	Central	RD	East	RD
Los Angeles Dodgers	45	Chicago Cubs	57	Philadelphia Phillies	31
Arizona Diamondbacks	18	St. Louis Cardinals	26	Atlanta Braves	-4
San Diego Padres	1	Cincinnati Reds	23	Washington Nationals	-17
Colorado Rockies	-17	Milwaukee Brewers	2	New York Mets	-27
San Francisco Giants	-29	Pittsburgh Pirates	-41	Miami Marlins	-79

Run differential has long been a staple of baseball analysis. Paul DePodesta took a look at it with the Oakland Athletics, and this section of Moneyball details his work:

“Before the 2002 season, Paul DePodesta had reduced the coming six months to a math problem. He judged how many wins it would take to make the play-offs: 95. He then calculated how many more runs the Oakland A’s would need to score than they allowed to win 95 games: 135. (The idea that there was a stable relationship between season run totals and season wins was another Jamesean discovery.) Then, using the A’s players’ past performance as a guide, he made reasoned arguments about how many runs they would actually score and allow. If they didn’t suffer an abnormally large number of injuries, he said, the team would score between 800 and 820 runs and give up between 650 and 670 runs. From that he predicted the team would win between 93 and 97 games and probably wind up in the play-offs.”

Scoring runs and preventing runs is a simple concept, and it’s the root of the game. You have 27 outs to outscore the other team, and outs are essentially a currency teams spend for the chance to score runs, though there is no such thing as a productive out:

“Do the team’s abilities in making productive outs have any relationship with their projected run totals? The question is slightly confused, in that SH and SF are included in EqR, but there was essentially no correlation between the difference in productive out deltas and EqR deltas – and what little correlation there was slightly negative…. If you get enough hits and homers, it doesn’t matter how often someone moves from second to third on a grounder.”

The Pirates’ offense hasn’t been good so far, and their pitching staff has been roughed up at times. The club has the following distribution in its run differential per game:

When the Pirates win, it’s usually by one or two runs: of the first 34 games, 22 have been decided by two runs or fewer, and the club is 13-9 in such games. The problem comes with their blowout games: they are 1-6 in games decided by five or more runs, having been outscored 69-16. Combine that with the rest of the schedule and you get that -41 run differential through their first 34 games.
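The arithmetic behind that split can be checked in a few lines. This is only a sketch using the totals quoted in this article; the variable names are illustrative.

```python
# Totals quoted in the article, through 34 games.
runs_scored, runs_allowed = 124, 165

# Runs scored and allowed in the seven blowout games (1-6 record).
blowout_scored, blowout_allowed = 16, 69

season_diff = runs_scored - runs_allowed          # full-season run differential
blowout_diff = blowout_scored - blowout_allowed   # differential in blowouts only
other_diff = season_diff - blowout_diff           # differential in the other 27 games

print(season_diff, blowout_diff, other_diff)  # -41 -53 12
```

In other words, the blowouts alone cost the club 53 runs of differential, while the remaining 27 games come out 12 runs in the Pirates’ favor.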

As illustrated above with the early-2000s Oakland Athletics, run differential has long been a staple, and it underpins a metric called Pythagorean Win Percentage. Baseball-Reference writes:

“The rationale behind Pythagorean Winning Percentage is that, while winning as many games as possible is still the ultimate goal of a baseball team, a team’s run differential (once a sufficient number of games have been played) provides a better idea of how well a team is actually playing. Therefore, barring personnel issues (injuries, trades), a team’s actual W-L record will approach the Pythagorean Expected W-L record over time, not the other way around…Nevertheless, given that advocates of the theorem point to teams that exceed their predicted number of wins as having done so due only to random chance, it is questionable whether the theorem provides anything indicative with respect to an individual team during a given season, as opposed to being a construct that shows the general relationship between scoring runs and preventing runs in winning baseball games.”

The formula for the theorem is (Runs Scored)^1.81 / [(Runs Scored)^1.81 + (Runs Allowed)^1.81]. Plugging in the Pirates’ 124 runs scored and 165 runs allowed gives a Pythagorean win percentage of about .374, which works out to an expected record of 13-21 over 34 games. But again, most of their games (65 percent) have been decided by two runs or fewer, so how predictive is this theorem in determining wins?
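For anyone who wants to run the numbers, here is a minimal sketch of the calculation. The function name and the rounding to whole wins are my own choices for illustration; the 1.81 exponent and the run totals come from the text above.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.81):
    """Pythagorean win percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

games = 34
pct = pythagorean_win_pct(124, 165)   # Pirates through 34 games
expected_wins = round(pct * games)    # round to a whole-number record

print(f"{pct:.3f}", expected_wins, games - expected_wins)  # 0.374 13 21
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.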

Looking at the data from 2010-2018 and splitting each season at the All-Star Break (first half compared to second half), how well does first half winning percentage compare with first half Pythagorean win percentage in predicting second half winning percentage?

The Pearson correlation coefficient measures the linear correlation between two variables; in this case, either first half win percentage or first half Pythagorean win percentage against second half winning percentage. The values are as follows:

Correlation to Second Half Winning Percentage
Predictor	Pearson	Margin of error (MOE)	Confidence interval (CI)
WP1H	0.52	0.09	[0.43,0.61]
Pythag1H	0.54	0.09	[0.45,0.62]
*2010-2018

The Pythagorean win percentage does have a slightly higher correlation with second half win percentage, but the first half win percentage falls comfortably within its margin of error. We really can’t say that Pythagorean win percentage is a better predictor of second half win percentage than the winning percentage in the first half.
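Here is a rough sketch of how a correlation and its interval like the ones in the table could be computed. The interval uses the Fisher z-transformation at a 95 percent level, which is an assumption on my part for illustration, not necessarily the method behind the table. The function names are hypothetical and the sample values are made-up toy numbers, not real team data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for r via the Fisher z-transformation.

    z_crit=1.96 corresponds to a 95 percent level (an assumed choice).
    """
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Toy values for illustration only (NOT real team data).
first_half = [0.40, 0.45, 0.50, 0.55, 0.60]
second_half = [0.44, 0.43, 0.52, 0.50, 0.58]

r = pearson_r(first_half, second_half)
low, high = fisher_interval(r, n=len(first_half))
print(f"r = {r:.2f}, interval = [{low:.2f}, {high:.2f}]")
```

If you assume 270 team-seasons (30 teams over nine seasons), a correlation of 0.52 gives an interval of roughly 0.43 to 0.60 with this method, close to the first row of the table. With two intervals that overlap this much, the data can’t separate the two predictors.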

It appears that Pythagorean win percentage, and by extension run differential, really just shows that to win games you have to score runs and prevent them. The Pirates have been beaten badly in six of their losses and have beaten a team badly only once. As a result, their run differential is poor and their Pythagorean win percentage suffers, but it’s no better a predictor going forward than their actual win percentage. Entering Friday’s game, PECOTA pegs the Pirates as going 63-65 the rest of the season and ending 80-82, which intuitively seems about right for a club that looks average on paper.

*Numbers entering Friday, May 10
