
2022 redux part 2: A 1000-iteration study of the 2022 season

Greetings everyone,

One of the burning questions in every offseason is where each team stands talent-wise, and whether a team is justified in making a push, standing pat, or cashing in whatever chips it has and starting a rebuild. The most obvious answer is the team's record from the concluded season, but that doesn't always tell the whole truth about where a team's talent actually is. Teams can get lucky or unlucky in any number of ways, from an offense that scores more or less than its underlying stats say it should, to winning a bunch of close games while getting blown out in others. There are different ways to evaluate this, and in this post I will be using three.

One is simply applying the Pythagorean expectation, RS^1.83/(RS^1.83+RA^1.83), to a team's actual run totals, which gives us a rough idea of "cluster luck", or how fortunate a team was to get runs when it needed them to win games. Another is applying the Runs Created formula to a team's stats, for and against, and running the results through the Pythagorean equation above. The third is an Excel spreadsheet with some Visual Basic code I developed about a decade ago, which runs 1000 iterations of the 2022 season using the players' stats. The data used in this simulation comes from Retrosheet, which features play-by-play data for almost every MLB game played over the last 100 years. In the interest of brevity, I will spare the details of how exactly the spreadsheet and code work, but will be happy to address any questions about them in the comments.
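As a rough illustration of the first two methods, here is a minimal Python sketch. The function names are hypothetical, and the Runs Created version shown is the basic (H + BB) * TB / (AB + BB) form, which is an assumption: the post does not say which variant the spreadsheet uses.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=1.83):
    """Expected wins from run totals via the Pythagorean expectation."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def basic_runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB).

    Assumption: the post may use a more detailed variant of the formula.
    """
    return (hits + walks) * total_bases / (at_bats + walks)


# Yankees' actual 2022 run totals from the AL East table: 807 scored, 567 allowed.
yankees_pythag = pythagorean_wins(807, 567)
print(round(yankees_pythag, 1))
```

Feeding a team's Runs Created totals (for and against) into `pythagorean_wins` instead of its actual run totals would give the "RC pythag" style of estimate.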

This is part two of the four-part series I'll be writing about the 2022 season. Part one, which reviews miracle teams, can be found here.

AL EAST

Without any further ado, let's jump into the results of these studies. I will start, as always, in the AL East:

Team | Wild cards | Division | Playoffs | Average | Min | Max | Real | "Luck"
NYA | 149.67 | 848.00 | 997.67 | 103.94 | 86 | 121 | 99 | -4.942
TOR | 798.08 | 146.00 | 944.08 | 94.54 | 76 | 113 | 92 | -2.538
TBA | 390.17 | 6.00 | 396.17 | 82.53 | 64 | 102 | 86 | 3.468
BAL | 30.67 | 0.00 | 30.67 | 72.61 | 51 | 91 | 83 | 10.395
BOS | 31.33 | 0.00 | 31.33 | 71.65 | 52 | 95 | 78 | 6.35

Team | Sim RA | RC RA | Actual RA | Sim RS | RC RS | Actual RS | Sim Pythag | RC Pythag | Actual Pythag | Actual wins
NYA | 581.5 | 527.2 | 567 | 816.581 | 789 | 807 | 105.4 | 109.6 | 106.3 | 99
TOR | 663.6 | 665.9 | 679 | 814.071 | 781 | 775 | 96.0 | 92.7 | 90.8 | 92
TBA | 665.5 | 580.4 | 614 | 671.22 | 652 | 666 | 81.6 | 89.6 | 87.0 | 86
BAL | 753.1 | 686.9 | 688 | 679.828 | 670 | 674 | 73.4 | 79.2 | 79.5 | 83
BOS | 825.3 | 748.4 | 787 | 723.067 | 744 | 735 | 71.2 | 80.6 | 75.9 | 78

From the Orioles' perspective, this isn't what you'd want to see: their 10.4 wins above the simulated average is the highest number in this simulation. This suggests some regression will take place next year, but it is the outlier among the other data, which show a 79-win talent level. The rest of the division is a mixed bag. Boston had the second-highest wins above the simulation, but the other methods show the 78 wins they got were about right. The Yankees and Blue Jays both seemed to underperform, suggesting that the gap between the top and bottom of this division is more than meets the eye. Tampa seems to have gotten a lot from their pitchers, and was able to sneak away with the last wild card as a result. In fact, the spreadsheet got 11 of the 12 playoff teams correct; Tampa was the only one it missed.
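For readers who want to check the "luck" column, it appears to be simply actual wins minus the average simulated wins. A small Python sketch using the AL East numbers from the table (the variable and function names are made up for illustration):

```python
# (team, average simulated wins, actual wins) copied from the AL East table.
al_east = [
    ("NYA", 103.94, 99),
    ("TOR", 94.54, 92),
    ("TBA", 82.53, 86),
    ("BAL", 72.61, 83),
    ("BOS", 71.65, 78),
]


def luck(avg_sim_wins, actual_wins):
    """Actual wins minus the average simulated wins."""
    return actual_wins - avg_sim_wins


luck_by_team = {team: luck(avg, actual) for team, avg, actual in al_east}
luckiest = max(luck_by_team, key=luck_by_team.get)

for team, value in luck_by_team.items():
    print(f"{team} {value:+.2f}")
print("Largest overperformance:", luckiest)
```

Small differences from the table's third decimal place come from the averages being rounded to two decimals here.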

Sidebar: Judge's HR chase

One of the things I tracked with this simulation is the number of HRs Judge hit in each iteration. He hit an average of 68.8, with a minimum of 44 and a maximum of 92(!). He cleared 61 HR 84% of the time, and hit more than 73, which would have established a new MLB record, 27% of the time. The odds of him hitting at least 62 HR given his 2022 home run frequency over 696 PA are of course about 50%, so this difference is probably due to more PAs for Judge in the simulation, which give him a couple more HR, and perhaps the way the odds are calculated in the simulation for each plate appearance. It is not due to the simulation producing more HR overall; the HR% in the simulation was exactly the same as it was in the majors this year. If we use his career HR% instead, his odds drop all the way to 3%. I'm not insinuating anything sinister here. I just think he was locked in for a long period of time this year, which propelled him to the record.
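The roughly 50% figure can be sanity-checked with a simple binomial model. This is only a sketch: it assumes every plate appearance is an independent trial with the same home run probability (62 HR in 696 PA), which is a simplification and not how the spreadsheet computes its odds.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


PLATE_APPEARANCES = 696                 # Judge's 2022 PA, from the text
HR_RATE_2022 = 62 / PLATE_APPEARANCES   # 62 HR in 696 PA

p_62_or_more = prob_at_least(62, PLATE_APPEARANCES, HR_RATE_2022)
p_74_or_more = prob_at_least(74, PLATE_APPEARANCES, HR_RATE_2022)
print(f"P(62+ HR) = {p_62_or_more:.3f}")
print(f"P(74+ HR) = {p_74_or_more:.3f}")
```

Swapping in a career home run rate, or a larger number of plate appearances, shows how sensitive these odds are to both inputs.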

AL CENTRAL

Team | Wild cards | Division | Playoffs | Average | Min | Max | Real | "Luck"
CLE | 162.33 | 775.00 | 937.33 | 94.55 | 75 | 117 | 92 | -2.553
CHA | 311.50 | 78.50 | 390.00 | 82.97 | 63 | 103 | 81 | -1.97
MIN | 429.42 | 146.50 | 575.92 | 85.60 | 66 | 108 | 78 | -7.596
DET | 1.00 | 0.00 | 1.00 | 64.28 | 47 | 84 | 66 | 1.719
KCA | 0.00 | 0.00 | 0.00 | 62.88 | 44 | 81 | 65 | 2.117

Team | Sim RA | RC RA | Actual RA | Sim RS | RC RS | Actual RS | Sim Pythag | RC Pythag | Actual Pythag | Actual wins
CLE | 592.1 | 580.0 | 634 | 716.116 | 704 | 698 | 95.0 | 95.2 | 88.1 | 92
CHA | 670.7 | 667.9 | 717 | 707.799 | 682 | 686 | 85.0 | 82.5 | 77.7 | 81
MIN | 659.4 | 662.8 | 684 | 712.541 | 710 | 696 | 86.7 | 86.1 | 82.3 | 78
DET | 724.7 | 668.7 | 713 | 565.794 | 537 | 557 | 63.0 | 65.0 | 63.0 | 66
KCA | 856.8 | 807.7 | 810 | 640.184 | 654 | 640 | 59.9 | 65.5 | 63.8 | 65

In the AL Central, the Guardians reign supreme, although it wasn't a definitive runaway. The Twins had the largest underachievement relative to the simulations of any team in MLB; in the simulations they were able to catch the occasional division title while consistently getting wild cards. One oddity to note is how the pitching staffs of every team in this division except the Royals consistently gave up more runs than they should have. All in all, besides the Twins' low run total, all teams played pretty close to what these methods suggested they would.

AL WEST

Team | Wild cards | Division | Playoffs | Average | Min | Max | Real | "Luck"
HOU | 2.00 | 998.00 | 1000.00 | 108.73 | 89 | 127 | 106 | -2.733
SEA | 522.75 | 2.00 | 524.75 | 84.92 | 63 | 105 | 90 | 5.08
ANA | 156.25 | 0.00 | 156.25 | 78.32 | 60 | 97 | 73 | -5.317
TEX | 14.83 | 0.00 | 14.83 | 70.80 | 52 | 88 | 68 | -2.803
OAK | 0.00 | 0.00 | 0.00 | 57.24 | 41 | 75 | 60 | 2.762

Team | Sim RA | RC RA | Actual RA | Sim RS | RC RS | Actual RS | Sim Pythag | RC Pythag | Actual Pythag | Actual wins
HOU | 535.9 | 477.9 | 518 | 809.972 | 766 | 737 | 110.2 | 113.9 | 106.3 | 106
SEA | 649.4 | 617.3 | 623 | 702.372 | 686 | 690 | 86.8 | 88.8 | 88.5 | 90
ANA | 663.1 | 616.5 | 668 | 638.637 | 666 | 623 | 78.2 | 86.7 | 75.8 | 73
TEX | 793.0 | 706.5 | 743 | 676.917 | 675 | 707 | 69.3 | 77.6 | 77.3 | 68
OAK | 778.7 | 738.2 | 770 | 545.087 | 533 | 568 | 55.5 | 57.6 | 59.0 | 60

As has been the case the last several seasons I've done this study, the Astros are simply on another level than the other teams in this division. Led by an all-world pitching staff and an adequate offense that appears to have underperformed by 50-70 runs, it appears that, if anything, they were a touch better than their record indicates. They are a legit dynasty at this point, and the World Series they won this year was no accident. Seattle put in a solid season, and this study suggests no harsh regression is coming. I'm a bit curious about what the Runs Created formula is saying about the Angels; I had to double-check things to make sure the numbers were right. They were, although the spreadsheet disagrees and says they allowed almost exactly what they should have. Texas seems to be some distance away from competing, and they'll need more than deGrom to get anywhere near Houston. Oakland's rebuild is seen in full force here, and their 60 wins appear to be generous.

NL EAST

Team | Wild cards | Division | Playoffs | Average | Min | Max | Real | "Luck"
ATL | 423.00 | 568.00 | 991.00 | 102.28 | 84 | 121 | 101 | -1.277
NYN | 591.50 | 393.00 | 984.50 | 100.54 | 80 | 122 | 101 | 0.46
PHI | 685.67 | 39.00 | 724.67 | 90.70 | 73 | 110 | 87 | -3.698
FLO | 2.42 | 0.00 | 2.42 | 68.14 | 50 | 88 | 69 | 0.861
WAS | 0.00 | 0.00 | 0.00 | 56.51 | 40 | 76 | 55 | -1.511

Team | Sim RA | RC RA | Actual RA | Sim RS | RC RS | Actual RS | Sim Pythag | RC Pythag | Actual Pythag | Actual wins
ATL | 575.4 | 569.9 | 609 | 801.323 | 794 | 789 | 104.8 | 104.8 | 99.8 | 101
NYN | 603.9 | 592.3 | 606 | 815.023 | 777 | 772 | 102.7 | 100.7 | 98.7 | 101
PHI | 646.5 | 641.7 | 685 | 753.211 | 749 | 747 | 92.2 | 92.4 | 87.4 | 87
FLO | 714.3 | 673.1 | 676 | 587.159 | 590 | 586 | 66.6 | 71.3 | 70.5 | 69
WAS | 884.2 | 844.9 | 855 | 619.563 | 643 | 603 | 55.5 | 61.2 | 56.0 | 55

In this year's simulations, the average wins for each National League team were particularly close to reality: no one was more than 4.5 wins away from their actual total. Although it was very close, it appears the Braves were indeed just a nose better than the Mets. The Phillies also put in a good year, albeit at some distance from the top two, although 87 wins for them may have been a bit light thanks to some inefficient pitching. Washington won the fewest games by any method. Shucks. They did manage to avoid the worst simulated record in the 5.5 seasons I've simulated (1977 AL, 2011, 2015, 2018, 2019 and this year) by a good margin; that record is safely held by the 2018 Tigers, who only got to 47 wins.

NL CENTRAL

Team | Wild cards | Division | Playoffs | Average | Min | Max | Real | "Luck"
SLN | 182.50 | 733.50 | 916.00 | 95.30 | 78 | 114 | 93 | -2.298
MIL | 392.92 | 260.50 | 653.42 | 89.31 | 71 | 110 | 86 | -3.314
CHN | 16.00 | 6.00 | 22.00 | 74.52 | 52 | 97 | 74 | -0.518
PIT | 0.00 | 0.00 | 0.00 | 62.23 | 45 | 85 | 62 | -0.23
CIN | 0.00 | 0.00 | 0.00 | 61.23 | 42 | 82 | 62 | 0.774

Team | Sim RA | RC RA | Actual RA | Sim RS | RC RS | Actual RS | Sim Pythag | RC Pythag | Actual Pythag | Actual wins
SLN | 669.3 | 630.5 | 637 | 829.377 | 787 | 772 | 96.7 | 97.2 | 95.1 | 93
MIL | 673.5 | 629.3 | 688 | 765.48 | 718 | 725 | 90.4 | 90.7 | 84.9 | 86
CHN | 728.9 | 710.6 | 731 | 661.63 | 662 | 657 | 73.8 | 75.8 | 73.1 | 74
PIT | 785.6 | 761.8 | 817 | 592.729 | 588 | 591 | 60.6 | 62.2 | 57.7 | 62
CIN | 854.3 | 794.6 | 815 | 623.134 | 625 | 648 | 58.2 | 63.5 | 64.3 | 62

The NL Central went almost perfectly to form. There's no real outlier here, except that the Cardinals' and Brewers' offenses probably should have scored a couple more runs and therefore won another game or two, and the Brewers gave up a couple more runs than they should have. All in all, no surprises here.

NL WEST

Team | Wild cards | Division | Playoffs | Average | Min | Max | Real | "Luck"
LAN | 13.00 | 987.00 | 1000.00 | 108.70 | 90 | 126 | 111 | 2.301
SDN | 573.75 | 13.00 | 586.75 | 88.74 | 62 | 110 | 89 | 0.26
SFN | 110.25 | 0.00 | 110.25 | 79.77 | 60 | 98 | 81 | 1.227
ARI | 9.00 | 0.00 | 9.00 | 72.95 | 52 | 91 | 74 | 1.046
COL | 0.00 | 0.00 | 0.00 | 63.52 | 45 | 82 | 68 | 4.478

Team | Sim RA | RC RA | Actual RA | Sim RS | RC RS | Actual RS | Sim Pythag | RC Pythag | Actual Pythag | Actual wins
LAN | 530.3 | 480.3 | 513 | 815.682 | 864 | 847 | 111.4 | 120.8 | 115.8 | 111
SDN | 646.1 | 620.6 | 660 | 736.31 | 701 | 705 | 90.6 | 90.0 | 85.9 | 89
SFN | 675.4 | 663.6 | 697 | 674.934 | 695 | 716 | 80.9 | 84.4 | 83.0 | 81
ARI | 759.0 | 699.3 | 740 | 672.957 | 659 | 702 | 72.1 | 76.6 | 77.1 | 74
COL | 896.5 | 829.7 | 873 | 687.692 | 690 | 698 | 61.7 | 67.5 | 64.6 | 68

For the first time, the Dodgers won more games than the spreadsheet predicted they would; in years past they underperformed the spreadsheet's expectations by 10-17 games. However, there is still some suggestion of underperformance by the other methods: they outscored their opponents by 334 runs in real life and "only" won 111 games. These Dodgers broke the record for most wins in a season (currently 116 by the '01 Mariners) in 84 of the 1000 iterations, slightly ahead of the 82 times Houston did. San Diego's good season was legit, as was San Francisco's... averageness. The 107 wins the Giants had last year, as outlined in the first entry of this series, made them one of the ten miracle teams identified since the beginning of the 20th century. The results here show that last year was probably an outlier, and when combined with their age, show that the team needs help to get competitive. The Dodgers, as has been the case for most of the last decade, are just in another class in this division.

Conclusions

From the Orioles' perspective, this study shows that the team overachieved in 2022. However, the amount of progress that was made, real progress, in terms of getting the team back to contention shouldn't be discounted. In 2019, the last year I ran this study, they averaged 60 wins in the simulation and had no real help in the minors. This year they averaged 73, and there's reason to believe the spreadsheet may have been a bit harsh on them. Combine that with the youth of the team and the farm system that figures to be injecting talent into Baltimore next year and beyond, and it's clear they are on a good path. If they wanted to be more serious about their chances for 2023, they would have been more active in free agency.

Looking at the results as a whole, I must say I'm pleased with the simulation; if anything it was too good. The standard deviation for any team is around 6 wins, so across 30 teams I'd expect about 10 teams to land 6 or more wins away from their actual totals. This year there were only 3, and all of them were in the AL. Unlike past years, I can't really single out any team as being an obvious candidate for regression or progression. Everyone was pretty close to where they should have been by any method, except for maybe Baltimore, but I just can't write them off for the reasons mentioned above. If you're looking for a dark horse to perhaps take a step forward, I would say the Twins, but Cleveland is still a distance ahead of them.
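The "around 6 wins" and "about 10 teams" figures line up with a back-of-the-envelope check. The sketch below treats a season as 162 independent coin flips for a true .500 team and uses a normal approximation for the share of teams outside one standard deviation. Both are simplifying assumptions for illustration, not the spreadsheet's player-level method.

```python
import random
import statistics
from math import erf, sqrt

GAMES = 162
TEAMS = 30

# Analytic spread of a true .500 team if every game were an independent coin flip.
binomial_sd = sqrt(GAMES * 0.5 * 0.5)

# Share of a normal distribution lying more than one standard deviation from the mean.
share_outside_one_sd = 1 - erf(1 / sqrt(2))
expected_teams_outside = TEAMS * share_outside_one_sd

# Quick Monte Carlo check: 1000 coin-flip seasons for one .500 team.
rng = random.Random(2022)  # fixed seed so the run is repeatable
season_wins = [sum(rng.random() < 0.5 for _ in range(GAMES)) for _ in range(1000)]
simulated_sd = statistics.pstdev(season_wins)

print(f"binomial SD: {binomial_sd:.2f} wins")
print(f"expected teams outside 1 SD: {expected_teams_outside:.1f}")
print(f"simulated SD: {simulated_sd:.2f}, min {min(season_wins)}, max {max(season_wins)}")
```

Even this crude model produces a min-to-max spread of several dozen wins over 1000 seasons, which is the same kind of range the min and max columns in the tables show.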

This is the first of two 1000-iteration studies I ran over this past weekend. The other one was a 25 players/25 years league, where one player in each season going back to 1997 (except for 2020) was selected from each franchise and the resulting teams then played against each other in a fictional 162-game season. I'll be releasing the results of that in a separate FanPost tomorrow.

Until then, thanks for reading!

tl;dr version:

Well, at least we're better than Boston now
