Saturday, June 13, 2009

Pythagorean Puzzle


Don’t worry about the charts. Read this first. If you don’t go to sleep, then come back and take a look at the charts.

A while ago I read a blog post that was worrying about the Tampa Bay Rays’ actual performance as compared to their “Pythagorean Expectation”. That led me to a more diligent (well, sort of) effort to understand just what the heck he was talking about. I don’t think I succeeded, but I did come up with a puzzle I’ll put out to my vast readership. Maybe you can figure it out.

The Pythagorean Expectation is based on the idea that there ought to be some kind of relationship between runs scored by opposing teams and their won-lost records (other than just “at least one more”). It is one of the many innovations that Bill James and his Sabermetricians have brought to the game and is described in a Wikipedia article here. If you really want to dive into it, try some of the links in that article.
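For anyone who would rather see it than read about it, the basic form of the formula is expected winning percentage = (runs scored)² / ((runs scored)² + (runs allowed)²). Here is a minimal sketch in Python. The function name, the `exponent` parameter, and the run and win totals are made up for illustration; they are not the Bulls' or Rays' real numbers, and this is not necessarily how the charts were computed.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the basic form of the formula; the parameter is
    exposed here only as a convenience for experimenting.
    """
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 300 runs scored, 250 allowed, 38 wins in 61 games.
expected = pythagorean_expectation(300, 250)
actual = 38 / 61

# A positive difference means the team is winning more than its runs suggest.
print(f"expected {expected:.3f}, actual {actual:.3f}, "
      f"difference {actual - expected:+.3f}")
```

The difference between actual and expected is the gap plotted between the two curves in the charts.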

As it happened, I had all the numbers for both the Bulls and the Rays, since I use them for the occasional charts that I put up. So I ran “expectations” for both teams and came up with what looks very weird to me.

The first chart shows the Bulls’ actual won-lost record and the expectation computation. You can see that the “actual” runs well above the “expectation” (.069 as of game 61). In the Rays chart the actual runs well below the expectation (-.086 as of game 63). What’s going on?

I’ve come up with a couple of possibilities.
  • I screwed up the numbers. Possible, but I am sure that at least the last datum in each is correct since it comes from the official records.
  • That it’s early in the season and that the curves will eventually meet. That’s the “in the long run” argument, as in, “In the long run life kills you.”
  • That the expectation formula doesn’t do a good job accounting for one-run wins and losses. That seems plausible since the Bulls have had 23 1-run games and won 15 of them, while the Rays have had 20 1-run games and won only 7 of them.
  • Blame the bullpen (for the Rays) or praise the bullpen (Bulls). I kind of like that, but not sure how to drill down into the numbers to test it. It is, when you think about it, just another way of expressing the 1-run game idea above.
  • Give the managers all the credit (blame). An interesting way to test this idea would be to swap Maddon for Montoyo for a couple of weeks (everyone else has been sent to Tampa Bay, why not the manager?). If the Rays started winning and the Bulls started losing then we’d know.
  • That the whole Pythagorean idea doesn’t pass the “so what?” test and I should get on with my life.
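To put the one-run records from the list above on the same scale as the charts, here is a small sketch that turns them into winning percentages. The win and game counts are the ones quoted above; the dictionary layout and the `win_pct` helper are just my own illustration.

```python
# One-run records quoted in the post: (wins, games played).
one_run = {"Bulls": (15, 23), "Rays": (7, 20)}


def win_pct(wins, games):
    """Plain winning percentage: wins divided by games played."""
    return wins / games


for team, (wins, games) in one_run.items():
    losses = games - wins
    print(f"{team}: {wins}-{losses} in one-run games, "
          f"{win_pct(wins, games):.3f}")
# Bulls: 15-8 in one-run games, 0.652
# Rays: 7-13 in one-run games, 0.350
```

That is a gap of about .300 in close games, which is at least pointing in the same direction as the gaps in the charts.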

At any rate, here it is. What do you think? How come we’re doing so much better and the Rays are doing so much worse?


  1. I think that the point about one-run losses is the most important one here. I think the model expects things like that to even out, and they really aren't even for either team.

    I think the better question is why the Bulls are so much better in those close games than the Rays are. Bullpens? Based on the reports of the Rays' interest in Pedro Martinez and the recent signing of Julio, I would say that the Rays front office is significantly concerned about the pen. I wonder if that will mean more call-ups?

  2. Rays have got Bradford and Shouse coming back (someday), so maybe they'll leave us alone. However, if they decide they need a closer, Abreu is a goner, I think. Will be interesting to see how these numbers play out over the season. I'll update them in a couple of weeks. If I get really ambitious (or bored), I think I still have the numbers for 2008 and I could go back and look at them.

  3. Forgot to add that the Rays are going to have to decide what to do when Kazmir comes off the DL. Assume they'll move a starter to the bullpen then, and probably not as a closer.

  4. According to a twitter from RaysProspects, Abreu got the callup...

  5. Update: Izzy to DL, Abreu to the Rays. Good for him, he earned it. I will miss him, though. I wonder if we get a replacement from Montgomery?

  6. Sounds like you are expecting regression to the mean; the "long run" theory.

    Good for Winston, but my word, we'll miss him.

  7. Regression to the mean it is. What we can hope for is that the trend for both measures is in the direction of .600 or so. Then we're in.

    Have to add, however, that James and his cohorts are dealing with very large data sets, a couple thousand games per year. Here we looked at only 60 or so.

    About Winston ... will have a post up on that soon.