Use Pythagorean Wins formulas to estimate the expected number of wins for a team, given their other statistics.
Ultimately, baseball comes down to wins. Winning is what the game is all about, and the only reason to measure anything else (batting, pitching, or fielding) is to measure its effect on wins.
Win estimates are useful for measuring how lucky or unlucky a team was. (A lucky team will exceed its expected number of wins, and an unlucky team will fall short of it.)
Many fans have developed different formulas for estimating the expected number of wins based on different statistics. In this hack, I’ll show a couple of the most popular formulas for estimating the number of wins and losses.
Bill James invented a formula for expected wins that has been nicknamed Pythagorean Wins (because of its resemblance to the Pythagorean theorem). The idea of this formula is that a team's expected ratio of wins to losses equals the square of the ratio of its runs scored to its runs allowed.
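In symbols, the relationship can be written as follows. The names W (wins), L (losses), RS (runs scored), and RA (runs allowed) are shorthand chosen here for illustration and may differ from the notation used elsewhere:

```latex
\frac{W}{L} = \frac{RS^{2}}{RA^{2}}
```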
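How well does this equation do in practice? First, let's solve it for the expected number of wins. Because wins plus losses equal games played, the expected number of wins is games played times the square of runs scored, divided by the sum of the squares of runs scored and runs allowed.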
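The algebra behind that step is short. The sketch below assumes no ties, so that wins plus losses equal games played; G (games played) and the other symbol names are shorthand chosen here. The closing numbers are a made-up team used purely to show the arithmetic, not real data:

```latex
% Assumption: W + L = G (no ties), so L = G - W.
\frac{W}{G - W} = \frac{RS^{2}}{RA^{2}}
\quad\Longrightarrow\quad
W \cdot RA^{2} = (G - W) \cdot RS^{2}
\quad\Longrightarrow\quad
W = G \cdot \frac{RS^{2}}{RS^{2} + RA^{2}}

% Hypothetical team: G = 162, RS = 800, RA = 700.
W = 162 \cdot \frac{800^{2}}{800^{2} + 700^{2}}
  = 162 \cdot \frac{640000}{1130000}
  \approx 91.75
```

A team with that run differential would be expected to win about 92 games. If it actually won 97, the formula would call it lucky, and if it won 87, unlucky.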
Let’s measure the effectiveness of this formula using R. First, let’s load the team data and create a subset of the 2004 data:
> library(RODBC) ...