Wednesday, May 30, 2012

Baseball Mogul's Simulation Engine

If you're a fan of the Baseball Mogul series, you might be wondering why, after 17 years, I chose this year to fully rewrite the simulation engine.

Well, first, some history:

Baseball Mogul was first written in 1995 as a program that simulated each plate appearance in a game. I created a table containing the batter's chance of each possible result (strikeout, home run, single, walk, etc.), and then created a similar table for the pitcher, based on the pitcher's stats.

You can write a pretty good baseball simulation like this. If Manny Ramirez hits a home run in 5.68% of his plate appearances, and Bartolo Colon allows 17% fewer home runs than the league average, then Manny has a 4.71% chance of hitting a home run against Bartolo Colon (5.68 minus 17% equals 4.71).
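The arithmetic above can be written as a one-line function. This is only a sketch of the calculation described in the paragraph, not the game's actual code; the name `matchup_rate` and the idea of expressing the pitcher as a multiplier relative to league average are assumptions made for illustration.

```python
def matchup_rate(batter_rate: float, pitcher_factor: float) -> float:
    """Scale a batter's per-plate-appearance rate by the pitcher's
    rate relative to league average (1.0 = league average)."""
    return batter_rate * pitcher_factor

# Manny Ramirez homers in 5.68% of his plate appearances.
# Bartolo Colon allows 17% fewer home runs than average: factor 0.83.
hr_chance = matchup_rate(0.0568, 1 - 0.17)
print(f"{hr_chance:.2%}")  # 4.71%
```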

There are other adjustments. For example, we know that the average platoon differential is about 27 points. So, if Manny Ramirez is facing a lefty, his batting average should be about .027 higher than when he faces a righty. If we have actual lefty-righty splits for Manny's career stats, we can use those instead of just using the league average. If Manny is batting in a stadium where right-handers hit 22% more homers, then we also make that adjustment.
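Here is a rough sketch of those two adjustments. The function names are hypothetical, the .300 average is made up for the example, and centering the league differential on the player's overall average (half above, half below) is a simplifying assumption rather than a description of how Baseball Mogul does it.

```python
from typing import Optional

LEAGUE_PLATOON_DIFF = 0.027  # average platoon differential, from the text

def batting_avg_vs(overall_avg: float, same_hand: bool,
                   split_avg: Optional[float] = None) -> float:
    """Estimate batting average against one pitcher handedness.

    If the player's actual split is known, use it. Otherwise apply the
    league differential, centered on the overall average (an assumption).
    """
    if split_avg is not None:
        return split_avg
    half = LEAGUE_PLATOON_DIFF / 2
    return overall_avg - half if same_hand else overall_avg + half

def park_adjusted(rate: float, park_factor: float) -> float:
    """Scale a rate by a park factor (1.22 = 22% more homers)."""
    return rate * park_factor

# A right-handed batter with a hypothetical .300 overall average.
vs_lefty = batting_avg_vs(0.300, same_hand=False)
vs_righty = batting_avg_vs(0.300, same_hand=True)
print(f"{vs_lefty:.4f} vs LHP, {vs_righty:.4f} vs RHP")

# The 5.68% home run rate in a park that boosts right-handed homers by 22%.
print(f"{park_adjusted(0.0568, 1.22):.2%}")
```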

There is some other math involved, like when your table of results doesn't add up to 100%. But you get the idea. The results look realistic. But they are just numbers on a spreadsheet.
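To make the table idea concrete, here is a minimal sketch of rescaling a results table so it sums to 100% and then rolling once against it. The outcome names and percentages are invented for the example, and `normalize` and `simulate_plate_appearance` are hypothetical helpers, not functions from the game.

```python
import random

def normalize(table):
    """Rescale outcome chances so they sum to exactly 1."""
    total = sum(table.values())
    if total <= 0:
        raise ValueError("table must contain positive probabilities")
    return {outcome: p / total for outcome, p in table.items()}

def simulate_plate_appearance(table, rng):
    """Roll once and walk the cumulative table to pick a result."""
    roll = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    result = None
    for result, p in table.items():
        cumulative += p
        if roll < cumulative:
            break
    return result  # falls through to the last outcome on round-off

# Hypothetical matchup table after adjustments; it sums to 1.02, not 1.
raw = {"strikeout": 0.18, "walk": 0.10, "single": 0.16,
       "home run": 0.05, "other out": 0.53}
table = normalize(raw)
rng = random.Random(2012)  # seeded so a run can be repeated
print(simulate_plate_appearance(table, rng))
```

Dividing every entry by the total keeps the relative odds intact, which is the simplest way to handle a table that drifts away from 100% after several adjustments.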

This is the most obvious way to build a simulation game. Baseball box scores show the result of each at-bat. So if you build a simulation that generates results for each at-bat, it looks realistic to the average baseball fan. But this is qualitatively the same way we've been simulating baseball for 50 years, going all the way back to Strat-O-Matic: roll some dice for each at-bat, and look up the results on a table.

[Image: Strat-O-Matic player cards]

However, baseball isn't played one at-bat at a time. To get a truly realistic simulation of a baseball game, you need to simulate each pitch. Imagine if you built a basketball simulation that simulated each possession instead of each play. The output would look like this:

  • Boston scores 2 points.
  • Los Angeles scores 2 points.
  • Boston scores 3 points.
  • Los Angeles scores 0 points.
  • Los Angeles scores 2 points.
  • Boston scores 0 points.
  • Los Angeles scores 2 points.
  • etc.

I can't tell from this output whether the underlying simulation is realistic, or whether it's just a random number generator hooked up to a table of possible results.

Early versions of Baseball Mogul had player ratings like "Avoid Strikeouts". That clearly has nothing to do with the actual game of baseball. It was just statistical shorthand for saying "this player has a low number of strikeouts". In real life, a player avoids strikeouts through a combination of actual skills, such as reading the pitch, making contact, and "shortening up" with two strikes. The game also needs to take into account other factors, such as the ability of a power hitter to force a pitcher to nibble at the corners, thus leading to more walks and fewer strikeouts. Dustin Pedroia might get a fastball down the middle of the plate on a full count; with Jose Bautista, this is far less likely.

Any truly realistic baseball simulation needs to simulate each pitch. So, when we added "Player Mode" (aka "Pitch-by-pitch mode") in Baseball Mogul 2007, I switched from using random numbers to using physics to determine the result of each pitch. Each pitch had a velocity and spin. Each pitcher had an ability to hit his spots in the strike zone. Each batter had an ability to guess whether or not each pitch would be a strike, and had differing abilities to make contact, put the ball in play, and drive the ball with authority.
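As a very rough illustration of two of those ingredients (a pitcher's ability to hit his spots and a batter's ability to read a strike), here is a toy sketch. It does not model velocity, spin, or trajectory, and it is not the game's engine. The strike zone heights, the `control_sigma` and `eye` parameters, and all names are assumptions made for this example; only the 17-inch plate width is a real measurement.

```python
import random
from dataclasses import dataclass

ZONE_HALF_WIDTH = 17 / 12 / 2      # home plate is 17 inches wide (feet)
ZONE_BOTTOM, ZONE_TOP = 1.5, 3.5   # assumed zone heights (feet)

@dataclass
class Pitch:
    x: float  # horizontal location at the plate, 0 = center (feet)
    z: float  # height at the plate (feet)

    @property
    def is_strike(self) -> bool:
        return (abs(self.x) <= ZONE_HALF_WIDTH
                and ZONE_BOTTOM <= self.z <= ZONE_TOP)

def throw_pitch(target_x, target_z, control_sigma, rng):
    """Scatter the pitch around the target; smaller sigma = better control."""
    return Pitch(rng.gauss(target_x, control_sigma),
                 rng.gauss(target_z, control_sigma))

def batter_swings(pitch, eye, rng):
    """Batter reads strike/ball correctly with probability `eye`,
    then swings only at what he believes is a strike."""
    reads_correctly = rng.random() < eye
    return pitch.is_strike if reads_correctly else not pitch.is_strike

rng = random.Random(2007)
pitch = throw_pitch(0.5, 2.0, control_sigma=0.4, rng=rng)
print(pitch.is_strike, batter_swings(pitch, eye=0.8, rng=rng))
```

Even in this toy form, walks and strikeouts would emerge from where the pitch goes and what the batter thinks of it, rather than from a rating that sets the strikeout rate directly.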

This resulted in a program that used different simulation engines for different parts of the game. Player Mode (aka "Pitch-by-pitch mode") used physics and all other modes used a lookup table. This solution created some bugs, like this one.

But it also doesn't make sense that a "realistic" baseball simulation would have two different ways to simulate each in-game event. So, in order to fix these problems, I had to choose one of the following options:

1) Convert Pitch-by-pitch mode to somehow use the same "one die roll per plate appearance" that I used when I wrote a baseball simulator in 1985 (and that the hobby has been using since the days of dice-based simulation games).

2) Convert the entire game to use the same true-to-life 3-D physical simulator that the pitch-by-pitch mode was using.

In other words, do we go "backwards" to the simulation technology of the 1950s? Or do we go forward, creating a baseball game that isn't just a random number generator linked to a big table of results?

I chose option #2 and the results are dramatic. As I continue to improve all aspects of the game, it's important to know that the underlying simulation is truly realistic, and not just some computer code that combines player ratings with random numbers and spits out a result.


Anonymous said...

I've found lots of unrealistic details in Baseball Mogul and I've commented about them in various Amazon reviews that I've written. Baserunning has always been a disaster. Take stolen bases, for example: runners got caught stealing much too often, and runners who got caught stealing too often also attempted to steal too often, so by season's end they'd have a ridiculous number of times caught stealing. Then there was the notorious problem with bunting players from second base to third: it NEVER worked, even though this is not a rare play in certain late-game situations. While I applaud your efforts to make each plate appearance more realistic, there is much more to baseball than just the encounter between batter and pitcher.

Clay Dreslough said...

Thanks for the comment. The good news about changing to a pitch-by-pitch engine is that we had to change all of the artificial intelligence too. For example, the decision to steal is now calculated on each pitch. So if a runner is given the green light, he can wait until he gets the best jump.

Sacrificing a runner from 2B to 3B now works about 85% of the time, depending on the bunter.

Download the demo if you'd like to see the results yourself:

Anonymous said...

My major problem is the lack of ability to throw a pitchout. The AI does it all the time (and catches my runners stealing all the time with it), so why can't we do it as well?

RancerDS said...

Your effort to keep Baseball Mogul as realistic as possible without over-complicating it is what makes it the best baseball sim (in my book). So when I suggest a baseball sim to someone, this is the one I recommend. Just as there are many options for baseball sims, an option like Strat-O-Matic may still appeal more to certain people than computer-based simulations. But I'm really glad you were willing to take another long, hard look at how it functions, even after all this time.