# Baseball Stats, Anyone?

Most Major League Baseball teams have now played their first twenty games, so there’s no better time to analyze how each team has done so far. If you read my Pythagorean Theorem story on predicting the 2016-17 NFL season, the approach here will be familiar. We’re in the sabermetrics generation of sports (across all four major leagues), and that calculation makes the most sense in baseball. If you don’t believe me, just watch Moneyball (one of my favorite sports analytics movies of all time).

No, seriously. If you haven’t seen that movie, download it or rent it and watch it now. Yes, right now.

The formula most statisticians know as the Pythagorean theorem of baseball is:

Predicted Win Percentage = Runs Scored^2 / (Runs Scored^2 + Runs Allowed^2)

If we go back over the last twenty years of baseball stats, it turns out that 1.81 is actually a better exponent than 2 for predicting the number of wins a team should have. Comparing that prediction with the actual record essentially tells us whether a team is over- or under-performing based on how many runs it scores and allows.
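Here is a minimal Python sketch of that calculation. The function names are mine and purely illustrative. The sample numbers (119 runs scored, 51 allowed, 19 games) are the Cubs’ line from the table below.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.81):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def predicted_wins(games_played, runs_scored, runs_allowed, exponent=1.81):
    """Scale the expected winning percentage by the games played so far."""
    return games_played * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Cubs: 19 games, 119 runs scored, 51 runs allowed
print(round(predicted_wins(19, 119, 51), 2))  # 15.63
```

Passing `exponent=2` gives the classic version of the formula, which makes it easy to see how much the choice of exponent moves the prediction.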

Let’s apply this, using the 1.81 exponent, to roughly the first twenty games of the 2016 MLB season.

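| Team | Games Played | Runs Scored | Runs Allowed | Actual Wins | Actual Losses | Predicted Wins | Predicted Losses |
|------|--------------|-------------|--------------|-------------|---------------|----------------|------------------|
| ARI | 21 | 109 | 105 | 11 | 10 | 10.85513951 | 10.14486049 |
| ATL | 19 | 61 | 93 | 4 | 15 | 6.040578 | 12.959422 |
| BAL | 18 | 81 | 69 | 11 | 7 | 10.29690072 | 7.703099283 |
| BOS | 19 | 92 | 91 | 10 | 9 | 9.59395962 | 9.40604038 |
| CHC | 19 | 119 | 51 | 14 | 5 | 15.62814142 | 3.371858578 |
| CHW | 20 | 68 | 52 | 14 | 6 | 12.38118821 | 7.618811793 |
| CIN | 20 | 80 | 122 | 9 | 11 | 6.356393433 | 13.64360657 |
| CLE | 17 | 74 | 66 | 9 | 8 | 9.37696992 | 7.62303008 |
| COL | 19 | 100 | 115 | 9 | 10 | 8.304763845 | 10.69523615 |
| DET | 18 | 80 | 78 | 9 | 9 | 9.206177467 | 8.793822533 |
| HOU | 20 | 78 | 99 | 6 | 14 | 7.875249691 | 12.12475031 |
| KCR | 19 | 72 | 61 | 12 | 7 | 10.91479789 | 8.085202115 |
| LAA | 20 | 63 | 76 | 9 | 11 | 8.31835887 | 11.68164113 |
| LAD | 20 | 96 | 74 | 12 | 8 | 12.31294029 | 7.687059708 |
| MIA | 18 | 67 | 80 | 7 | 11 | 7.567888619 | 10.43211138 |
| MIL | 19 | 76 | 114 | 8 | 11 | 6.162484137 | 12.83751586 |
| MIN | 20 | 66 | 86 | 6 | 14 | 7.649323987 | 12.35067601 |
| NYM | 18 | 80 | 56 | 11 | 7 | 11.80825394 | 6.191746062 |
| NYY | 18 | 69 | 80 | 8 | 10 | 7.802336278 | 10.19766372 |
| OAK | 20 | 69 | 77 | 10 | 10 | 9.010473597 | 10.9895264 |
| PHI | 19 | 62 | 88 | 9 | 10 | 6.586052871 | 12.41394713 |
| PIT | 20 | 101 | 95 | 11 | 9 | 10.55368795 | 9.446312049 |
| SDP | 20 | 76 | 106 | 7 | 13 | 7.076822421 | 12.92317758 |
| SEA | 19 | 79 | 69 | 10 | 9 | 10.65781306 | 8.342186936 |
| SFG | 21 | 104 | 94 | 10 | 11 | 11.45799429 | 9.542005713 |
| STL | 19 | 118 | 84 | 10 | 9 | 12.33322569 | 6.666774312 |
| TBR | 19 | 68 | 68 | 9 | 10 | 9.5 | 9.5 |
| TEX | 20 | 82 | 83 | 10 | 10 | 9.890306087 | 10.10969391 |
| TOR | 21 | 91 | 83 | 10 | 11 | 11.37239381 | 9.627606189 |
| WSN | 18 | 80 | 45 | 14 | 4 | 13.30419243 | 4.695807568 |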

For a quick example of how the table above works, let’s take the Chicago Cubs. They’ve played nineteen games and have scored 119 runs while giving up 51. With the Pyth formula, they were expected to have won 15.6 of those games. They won 14, which means they are slightly under-performing by 1.6 games. Not bad. I’ll take it. That made me wonder…

Is that a lot? Is that a little? Are there any other teams that are considerably over- or under-performing so far in the 2016 season? That brings me to this graph below.

The graph above shows a couple of things. Here’s a better legend.

Red line: shows the difference between predicted wins and actual wins for each team.

Dark Gray line: shows the average of those differences (whoa, complicated).

Orange/Light Gray line: shows one standard deviation away from that dark gray average.

Yellow/Blue line: shows two standard deviations away from that average in dark gray.

Ok, I completely lost you. But it’s actually not that tough. If you look at the highest points, you’ll see the teams that have under-performed the most so far this season. Essentially, the Pythag value indicates that they should have a better record than they actually have. St. Louis, Atlanta (per usual) and Houston are the three teams with the largest positive difference between the games they were predicted to win and the number they actually won, while Cincy and Philly are the two teams that are over-performing the most based on their runs scored/allowed metrics.
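If you want to reproduce the numbers behind those lines, here is a rough Python sketch. The data is copied from the table above. The variable names are my own, and the use of the population standard deviation (`pstdev`) is an assumption, so the bands may differ slightly from the ones in the graph.

```python
from statistics import mean, pstdev

EXPONENT = 1.81  # the exponent used for the predicted wins in the table

# (team, games played, runs scored, runs allowed, actual wins), from the table
TEAMS = [
    ("ARI", 21, 109, 105, 11), ("ATL", 19, 61, 93, 4),
    ("BAL", 18, 81, 69, 11), ("BOS", 19, 92, 91, 10),
    ("CHC", 19, 119, 51, 14), ("CHW", 20, 68, 52, 14),
    ("CIN", 20, 80, 122, 9), ("CLE", 17, 74, 66, 9),
    ("COL", 19, 100, 115, 9), ("DET", 18, 80, 78, 9),
    ("HOU", 20, 78, 99, 6), ("KCR", 19, 72, 61, 12),
    ("LAA", 20, 63, 76, 9), ("LAD", 20, 96, 74, 12),
    ("MIA", 18, 67, 80, 7), ("MIL", 19, 76, 114, 8),
    ("MIN", 20, 66, 86, 6), ("NYM", 18, 80, 56, 11),
    ("NYY", 18, 69, 80, 8), ("OAK", 20, 69, 77, 10),
    ("PHI", 19, 62, 88, 9), ("PIT", 20, 101, 95, 11),
    ("SDP", 20, 76, 106, 7), ("SEA", 19, 79, 69, 10),
    ("SFG", 21, 104, 94, 10), ("STL", 19, 118, 84, 10),
    ("TBR", 19, 68, 68, 9), ("TEX", 20, 82, 83, 10),
    ("TOR", 21, 91, 83, 10), ("WSN", 18, 80, 45, 14),
]


def predicted_wins(games, runs_scored, runs_allowed, exponent=EXPONENT):
    """Pythagorean expected wins over the games played so far."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return games * rs / (rs + ra)


# Red line: predicted wins minus actual wins.
# Positive means the team has won fewer games than predicted (under-performing).
diffs = {team: predicted_wins(g, rs, ra) - wins for team, g, rs, ra, wins in TEAMS}

values = list(diffs.values())
avg = mean(values)   # dark gray line
sd = pstdev(values)  # assumption: population SD; the bands sit at avg ± sd and avg ± 2*sd

# Rank from most under-performing to most over-performing.
ranked = sorted(diffs, key=diffs.get, reverse=True)
for team in ranked:
    flag = "  <-- more than 1 SD from the average" if abs(diffs[team] - avg) > sd else ""
    print(f"{team}  {diffs[team]:+.2f}{flag}")
```

The top of the printed list holds the under-performers (St. Louis, Atlanta, Houston) and the bottom holds the over-performers (Philly and Cincy).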

I’m interested to hear your feedback on the stats that I compiled.  Let me know what you think in the comments section below.