In my Weekly Wrap posts here at Camden Chat, I wrote about how the team did during the previous week. For the majority of the season, I also included a predicted wins total. During the season a few people asked how I got these prediction numbers, and I thought the explanation deserved a full post.
There are a number of systems out there. I chose one that was detailed by Neil Paine at FiveThirtyEight because it made sense to me and I didn’t just want to copy and paste FanGraphs’ numbers.
The basics are this: given a team’s current winning percentage, you can estimate their final win total by:
1. Regressing that winning percentage towards the team’s true talent level, to find a regressed winning percentage. This is done by adding 67 games played at the team’s true talent level to its actual record. Refer to the article for links to proofs.
2. Applying that regressed winning percentage to the team’s remaining games. This gives the number of games we expect the team to win for the rest of the season.
3. Adding the win total found in step 2 to the team’s current win total.
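In symbols, the three steps can be sketched as follows. The notation is an assumption for illustration, not taken from the FiveThirtyEight article: W and L are the team's current wins and losses, p_0 is the estimated true talent level expressed as a winning percentage, and the season is assumed to run 162 games.

```latex
\hat{p} = \frac{67\,p_0 + W}{67 + W + L},
\qquad
\hat{W}_{\text{final}} = W + \hat{p}\,\bigl(162 - (W + L)\bigr)
```

Early in the season, W + L is small next to 67, so the regressed percentage stays close to p_0. As games pile up, the actual record carries more and more of the weight.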
The idea is that at any time, a team’s won/loss record encapsulates not only their true talent, but also luck and measurement error. Stated another way, a team’s won/loss record is an imprecise way of measuring their true talent, in the same way that ERA is an imprecise way of measuring a pitcher's true talent. The concept of regression to the mean states that going forward, a team will more likely play at their true talent level than at talent-level-plus-luck-and-error.
The key factor with this model, as with any model, is its input (also known as a "prior"); in this case, the 2014 Orioles’ true talent level. You have to make this estimate before the season, and you really shouldn’t change it unless there’s a significant reason, like, say, Matt Wieters going on the shelf or Manny Machado returning from an injury.
For my part, I felt in my proverbial gut before the 2014 season that the Orioles were a .500 team. But I didn’t just use that data point; I also found several pre-season estimates of the team’s 2014 record: Joe Sheehan’s newsletter, the team’s 2013 record, PECOTA, FanGraphs, Bleacher Report, David Schoenfield from ESPN, and Clay Davenport’s predictions. Averaging all these predictions together spat out 81.57 wins. I rounded that down to 81 wins, which is a .500 record. Early in the season, I nudged that input up to .510 when it was clear the Orioles were on a roll. I can’t quite remember when I did that, but I think it was when Manny Machado returned from the DL.
For example: after 104 games, the Orioles were 58-46. That’s a .558 winning percentage, but at the time I had them as a .510 team. The regressed winning percentage is ((.510*67)+58) / (67+58+46), or .539. Over the team’s remaining 58 games, .539 ball is about 31 wins. 31 + 58 is 89 wins (and 73 losses). That’s a .551 winning percentage over the course of the whole season, using the unrounded projection of 89.3 wins.
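The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet or script actually used for the Weekly Wrap posts. The function name `project_wins` and its parameter names are made up for this example, and the 162-game season length is an assumption. The 67-game weight and the .510 prior come from the text above.

```python
def project_wins(wins, losses, prior_pct, season_games=162, prior_games=67):
    """Project a final win total by regressing the current record toward a prior.

    prior_pct is the pre-season estimate of true talent (e.g. 0.510).
    prior_games is the number of games of prior-level play mixed into
    the actual record (67 in the method described above).
    """
    games_played = wins + losses

    # Step 1: blend prior_games of prior-level play with the actual record.
    regressed_pct = (prior_pct * prior_games + wins) / (prior_games + games_played)

    # Step 2: apply the regressed percentage to the remaining schedule.
    remaining_games = season_games - games_played
    expected_rest_of_season_wins = regressed_pct * remaining_games

    # Step 3: add the expected wins to the wins already banked.
    projected_wins = wins + expected_rest_of_season_wins
    return regressed_pct, projected_wins


# The 58-46 example from the text, with a .510 prior.
regressed, projected = project_wins(58, 46, 0.510)
print(f"{regressed:.3f}", round(projected))  # prints: 0.539 89
```

The function returns the unrounded projection and leaves rounding to the caller, which is why the whole-season winning percentage can differ slightly depending on whether you round before or after dividing by 162.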
That’s the process I used. For this article, I compiled all my weekly predictions, inserting a few where I hadn’t made any by using the Orioles’ record on that date, and compared them against the final total of 96 wins.
Expressed as the error in number of games:
I think the formula worked well. It certainly isn't perfect; it took until around the All-Star Break for the formula to really begin believing in the Orioles, but that's around the point where two things happened: the team began separating itself from the pack and the team's actual record drowned out their predicted/regressed record.
89 games into the season, the formula was within eight games. In some years, eight games is a big margin of error, but in 2014 the rest of the AL East was playing so poorly that eight games ended up being plenty of room. (I didn't run the predictions for the other teams in the division.) Later, with a month and a half of baseball left, the error was at four games. That is, the formula was predicting a 92-win season when the Orioles were 67-50.
The error dropped to three games with a month of baseball left and was down to one game in the final three weeks of the season. With 13 games left, the final prediction stood at 97 wins, so you could say the Orioles actually underperformed a bit towards the end of the season. By then they were resting their regulars, having clinched the division long before.
I enjoyed this experiment, and I hope you did too. I'm looking forward to seeing how this formula performs in 2015 and comparing the results to 2014's run.