Most major league baseball teams have now played their first twenty games, so there’s no better time to look at how each team has done so far. If you read my Pythagorean Theorem story predicting the 2016-17 NFL season, the approach here will be familiar. We are in the sabermetrics generation of sports (across all four major leagues), and baseball is where this calculation makes the most sense. If you don’t believe me, just watch Moneyball (one of my favorite sports analytics movies of all time).
No, seriously. If you haven’t seen that movie, download it or rent it and watch it now. Yes, right now.
The Pythagorean formula that most statisticians know, as it applies to baseball stats, is:
Predicted Win Percentage = Runs Scored^2 / (Runs Scored^2 + Runs Allowed^2)
If we go back over the last twenty years of baseball stats, though, 1.81 turns out to be a better exponent than 2 for predicting the number of wins a team should have. Comparing that prediction with the actual record tells us whether a team is over- or under-performing given the runs it has scored and allowed.
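If you want to play with the formula yourself, here is a minimal Python sketch of it. The function names are my own for illustration, the 1.81 exponent is the one quoted above, and the team in the usage lines is hypothetical rather than a row from my table.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.81):
    """Predicted win percentage from runs scored and runs allowed."""
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def predicted_wins(games_played, runs_scored, runs_allowed, exponent=1.81):
    """Predicted wins = games played times predicted win percentage."""
    return games_played * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


def wins_gap(actual_wins, games_played, runs_scored, runs_allowed, exponent=1.81):
    """Predicted minus actual wins. Positive means the team is under-performing."""
    return predicted_wins(games_played, runs_scored, runs_allowed, exponent) - actual_wins


# Hypothetical team (not from the table): 20 games, 100 runs scored, 80 allowed, 11 wins.
expected = predicted_wins(20, 100, 80)
gap = wins_gap(11, 20, 100, 80)
print(f"predicted wins: {expected:.1f}, gap vs. actual: {gap:+.1f}")
```

For that made-up team the sketch predicts roughly 12 wins, so an 11-win record would count as under-performing by about one game. Passing `exponent=2` gives the classic version of the formula if you want to compare the two.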
Let’s apply this to the first twenty games of the 2016 MLB season.
Ready? I sure am.
| Team | Games Played | Runs Scored | Runs Allowed | Actual Wins | Actual Losses | Predicted Wins | Predicted Losses |
For a quick example of how the table above works, let’s take the Chicago Cubs. They’ve played twenty games and have scored 119 runs while giving up 51. With the Pyth formula, they were expected to have won 15.6 of those games. They won 14, which means they are slightly under-performing by 1.6 games. Not bad – I’ll take it. That made me wonder…
Is that a lot? Is that a little? Are there any other teams that are considerably over- or under-performing in the 2016 season so far? That brings me to this graph below.
The graph above shows a couple of things. Here’s a better legend.
Red line: shows the difference between predicted wins and actual wins (predicted minus actual) for each team.
Dark Gray line: shows the average of those differences (whoa, complicated)
Orange/Light Gray line: shows one standard deviation away from that dark gray average.
Yellow/Blue line: shows two standard deviations away from that average in dark gray.
Ok, I completely lost you. But it’s actually not that tough. If you look at the highest points, you’ll see the teams that have under-performed the most so far this season. Essentially, the Pythagorean value indicates that they should have a better record than they actually have. St. Louis, Atlanta (per usual) and Houston are the three teams with the largest positive difference between the wins they were predicted to have and the number they actually won, while Cincy and Philly are the two teams that are over-performing based on their runs scored/allowed metrics.
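For anyone who wants to rebuild those bands, here is a rough Python sketch of the idea. The team names and gaps are made up for illustration, the function names are my own, and I am assuming a sample standard deviation here, which may not be exactly how the graph was built.

```python
from statistics import mean, stdev


def performance_bands(gaps):
    """Average gap plus the one- and two-standard-deviation bands around it."""
    values = list(gaps.values())
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation (an assumption)
    return {
        "mean": avg,
        "one_sd": (avg - sd, avg + sd),
        "two_sd": (avg - 2 * sd, avg + 2 * sd),
    }


def flag_teams(gaps, n_sd=1.0):
    """Split teams that sit more than n_sd standard deviations from the average."""
    values = list(gaps.values())
    avg = mean(values)
    sd = stdev(values)
    under = [team for team, gap in gaps.items() if gap > avg + n_sd * sd]
    over = [team for team, gap in gaps.items() if gap < avg - n_sd * sd]
    return under, over


# Hypothetical gaps (predicted wins minus actual wins), not the real 2016 numbers.
gaps = {"Team A": 2.0, "Team B": -2.0, "Team C": 0.0, "Team D": 0.5, "Team E": -0.5}

bands = performance_bands(gaps)
under_performing, over_performing = flag_teams(gaps)
print("average gap:", bands["mean"])
print("under-performing:", under_performing)
print("over-performing:", over_performing)
```

With these made-up numbers, Team A lands above the upper one-standard-deviation line (under-performing) and Team B below the lower one (over-performing), which is the same way to read the peaks and valleys of the red line.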
I’m interested to hear your feedback on the stats that I compiled. Let me know what you think in the comments section below.
If you liked this article, don’t forget to follow us on Twitter or Facebook, @LebortsReport.
Subscribe for more like it! Thanks for the support!