Tuesday, April 5, 2016

Baseball Mogul 2016: Rating Calibration

Baseball Mogul 2016 has a new feature that adjusts all stats for the statistical environment in which they were accrued before assigning player ratings. The adjustment applies to the current season and to historical seasons, and it also calculates major league equivalencies (MLEs) for the 1.2 million lines of minor league stats included in the game.

The New Rating Scale

The first thing you will notice is that everyone’s ratings have dropped by about 6-8 points. The average rating for a major league player in Baseball Mogul Diamond was about 82. The first version of Baseball Mogul was designed with the average player rating set at 75 (corresponding to a grade of "C" in an academic environment). But the average crept up over the years until more than half of all players were clumped between 81 and 86, making it difficult to differentiate between an average player and a very good player. In last year's game, the distribution of ratings for major league players looked like this:

Player Rating Distribution in Baseball Mogul Diamond
Making league and stadium adjustments for every stat gives us the ability to rate all players on the same scale, in the same way that a stat like OPS+ defines '100' as the league average. In addition (and unlike OPS+) we can specify the distribution of player ratings on this scale. That is, we can specify the degree to which all player ratings are either clumped near the average rating or spread out over the entire scale.

I used this opportunity to re-establish 75 as the major league average for all player ratings and to set the standard deviation at 7. Assuming a normal distribution, ratings for players at the major league level are now distributed on a “bell curve” like this:
Player Rating Distribution in Baseball Mogul 2016
As you can see on the graph, about two thirds of ratings fall between 68 and 82. Any rating of 90 or higher describes an ability in the top 2% of major league talent.
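
For readers who want to check those figures, here is a minimal sketch using Python's standard library. It assumes ratings follow a continuous normal distribution with the mean of 75 and standard deviation of 7 given above, and it ignores the rounding to whole numbers that the game displays. The function names are my own for illustration and are not part of the game.

```python
from statistics import NormalDist

# Mean and standard deviation taken from the post; the continuous
# normal model is an assumption (in-game ratings are whole numbers).
ratings = NormalDist(mu=75, sigma=7)


def share_between(low, high):
    """Fraction of major league ratings expected between low and high."""
    return ratings.cdf(high) - ratings.cdf(low)


def share_at_or_above(rating):
    """Fraction of major league ratings expected at or above rating."""
    return 1.0 - ratings.cdf(rating)


print(f"Between 68 and 82: {share_between(68, 82):.1%}")
print(f"90 or higher:      {share_at_or_above(90):.1%}")
```

Under these assumptions the first figure comes out at roughly 68% (about two thirds) and the second at a little under 2%, which matches the graph.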

Fitting all player ratings to this scale means that any specific number has the same meaning regardless of whether you are playing in 1927 or 2027. 
For example, a pitcher's Power rating is primarily based on their projected strikeout rate. If a pitcher strikes out batters at a rate that is one standard deviation above the league average, he is assigned a Power rating of 82 (75 + 7).
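
The mapping in that example can be sketched as a simple z-score conversion. This is an illustration of the idea rather than the game's actual code: the function name is hypothetical, and the league strikeout numbers below are made up for the example.

```python
RATING_MEAN = 75  # major league average rating, from the post
RATING_SD = 7     # one standard deviation on the rating scale, from the post


def calibrated_rating(value, league_mean, league_sd):
    """Convert a stat to the rating scale via its z-score in its own league.

    Hypothetical helper: the real game also applies projections and
    park adjustments that are not modeled here.
    """
    z = (value - league_mean) / league_sd
    return round(RATING_MEAN + RATING_SD * z)


# Made-up league environments (strikeouts per nine innings).
low_k_era = {"league_mean": 5.0, "league_sd": 2.0}
high_k_era = {"league_mean": 7.0, "league_sd": 2.0}

# The same raw strikeout rate earns different ratings in different eras.
print(calibrated_rating(7.0, **low_k_era))   # one SD above average
print(calibrated_rating(7.0, **high_k_era))  # exactly league average
```

With these made-up numbers, a pitcher striking out 7.0 per nine innings rates 82 in the low-strikeout league and 75 in the high-strikeout league, because the rating measures distance from the league average rather than the raw stat.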

Historical Adjustments

Player ratings now use the same scale regardless of which season you are playing in (and regardless of how many seasons you play into the future). This differs from previous versions of Baseball Mogul which had different average ratings in different seasons. As league averages went up and down, so did player ratings. For example, the strikeout rate in Major League Baseball has risen more than 65% over the last 25 years. This translated to a big jump in Power ratings for all pitchers. If you start a new game in 2015 using Baseball Mogul Diamond, the average pitcher has a Power rating of 83. But if you start in 1981, the average Power rating is only 70. This discrepancy means that the 1981 season has only nine pitchers with an Overall rating of 90 or more, compared to 56 such players in 2015!

Impact on Gameplay

This volatility was more than just a cosmetic problem. It could lead to a serious imbalance between batters and pitchers. If you start a game in 1969 you will see that pitching ratings are noticeably higher than batting ratings -- because run-scoring was at an all-time low in the late 1960s. Of forty players rated 90 or higher in the 1969 database, only seven (17.5%) are batters. Because player ratings are part of the game’s artificial intelligence and salary negotiations, this imbalance made it possible to take advantage of the computer-controlled general managers by trading away below-average pitchers in return for above-average hitters (and then signing those hitters at lower salaries than pitchers with the same win value were asking for).

The Shocker!

Baseball Mogul 2016 re-calibrates all ratings when loading a game saved in a previous version. This means that a team of players rated between 82 and 88 could become a team of players rated in the high 70s. These rating changes may come as a shock, in the same way that you would be stunned if your employer suddenly cut your pay by 8% for no reason. I have noticed that I can rewrite the code for determining team revenue or ticket sales or player salaries without getting a single complaint. If the number of outfield assists or complete games goes down by 8% between versions, no one notices. But if the ratings for the players on your team drop by 8%, it seems like the game has undone all the hard work you put into building your team. It's worth remembering that the underlying player talent level is completely unaffected by this adjustment, which applies equally to every player in the database. This is a significant change, but it is a long-awaited fix to the problems of inexact player ratings and unpredictable talent distribution, problems that leaked over into everything from player evaluation to managerial decisions.
