Thursday, March 1, 2012

Baseball Mogul 2013: Under The Hood, Part 1

When I first wrote Baseball Mogul, it simulated each game by simulating the result of each plate appearance. This isn't unusual. This is how Strat-O-Matic works. This is how other computer games work. This is even the method I used when writing my very first baseball simulation, using paper and dice, back in 1976.

But the thing is, baseball isn't played with paper and dice. It's played inside televisions. And the game on television isn't determined by comparing player stats and generating a random number. It's determined pitch-by-pitch. Each pitch has a velocity, a spin direction (and magnitude), and the location where it crosses the plate.


Photo by Wall Street Journal

So, for Baseball Mogul 2013, I rewrote the entire simulation engine to calculate:
  1. The velocity and path of each pitch (similar to that recorded by PITCHf/x).
  2. The timing and velocity of the bat swing.
  3. The plane of the bat swing (and the location of its sweet spot).
  4. The angle and velocity of the hit that results from the above.
Needless to say, this wasn't an easy task. Determining the outcome of an at-bat from player stats is pretty basic math. You do need some Bayesian analysis if you want maximum realism, but other than that, you are just multiplying some numbers.
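To make the "multiplying some numbers" approach concrete, here is a minimal sketch of a plate-appearance-level calculation. It uses the odds-ratio ("log5") formula, a common way to combine a batter's rate, a pitcher's rate, and the league rate. That choice is an assumption for illustration, not necessarily the formula Baseball Mogul used, and the function names and sample rates are hypothetical.

```python
import random


def log5(batter_rate, pitcher_rate, league_rate):
    """Odds-ratio ("log5") estimate of an event's probability in one matchup."""
    event = (batter_rate * pitcher_rate) / league_rate
    no_event = ((1 - batter_rate) * (1 - pitcher_rate)) / (1 - league_rate)
    return event / (event + no_event)


def simulate_hit(batter_avg, pitcher_avg_against, league_avg, rng=random):
    """Return True if one simulated at-bat ends in a hit."""
    return rng.random() < log5(batter_avg, pitcher_avg_against, league_avg)


# Hypothetical rates: a .300 hitter facing a pitcher who allows a .240
# average, in a league that hits .260.
p_hit = log5(0.300, 0.240, 0.260)
print(round(p_hit, 3))
```

A useful sanity check on the formula: against a league-average pitcher, the result collapses back to the batter's own rate.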

But to simulate each pitch, I had to break down each player's talent into abilities that influence the outcome of each pitch. In real life, pitchers don't have the ability to "cause strikeouts". Instead, they have the ability to throw the ball with a certain velocity and spin, to a certain spot in the strike zone, with a certain chance of successfully executing the pitch.

Similarly, batters don't have the ability to "prevent strikeouts". What they have is some level of ability to recognize the pitch and decide whether to swing, and a different ability to actually make contact. And these abilities change with the count, as the batter gets more aggressive in hitter's counts, and more defensive in pitcher's counts. You can derive these abilities from each player's historical stats. But it's not easy -- and it's something I've been working on over the course of several years.
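The split between deciding to swing and making contact can be sketched as a pitch-by-pitch loop. This is a toy model under stated assumptions, not the Baseball Mogul engine: the Batter and Pitcher fields, the 0.03 count adjustment, and all the sample rates are hypothetical, and the count effect is reduced to a single shift in contact chance.

```python
import random
from dataclasses import dataclass


@dataclass
class Batter:
    swing_at_strike: float     # chance of swinging at a pitch in the zone
    swing_at_ball: float       # chance of chasing a pitch out of the zone
    contact: float             # baseline chance of contact on a swing
    fair_given_contact: float  # chance that contact goes in play, not foul


@dataclass
class Pitcher:
    zone_rate: float  # chance that a pitch ends up in the strike zone


def contact_chance(batter, balls, strikes):
    """Assumed count effect: bigger swings (less contact) when ahead,
    shorter protective swings (more contact) when behind."""
    adjusted = batter.contact - 0.03 * (balls - strikes)
    return min(1.0, max(0.0, adjusted))


def simulate_plate_appearance(batter, pitcher, rng):
    """Play one plate appearance pitch by pitch; return its outcome."""
    balls = strikes = 0
    while True:
        in_zone = rng.random() < pitcher.zone_rate
        swing_rate = batter.swing_at_strike if in_zone else batter.swing_at_ball
        if rng.random() < swing_rate:
            if rng.random() < contact_chance(batter, balls, strikes):
                if rng.random() < batter.fair_given_contact:
                    return "in_play"
                if strikes < 2:      # a foul cannot be strike three
                    strikes += 1
            else:
                strikes += 1         # swing and miss
        elif in_zone:
            strikes += 1             # called strike
        else:
            balls += 1               # taken ball
        if strikes == 3:
            return "strikeout"
        if balls == 4:
            return "walk"


rng = random.Random(2013)
batter = Batter(swing_at_strike=0.65, swing_at_ball=0.30,
                contact=0.80, fair_given_contact=0.50)
pitcher = Pitcher(zone_rate=0.50)
outcomes = [simulate_plate_appearance(batter, pitcher, rng) for _ in range(1000)]
print({kind: outcomes.count(kind) for kind in ("in_play", "strikeout", "walk")})
```

Even in this stripped-down form, strikeouts and walks are not inputs. They fall out of the swing and contact abilities interacting with the count, which is the point of simulating each pitch.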

The good news is that once you get this right, you see an incredible amount of realism emerge from calculating the results of each pitch. For example, we know from real-life pitch data that batting average varies by count. It's highest on 3-0 pitches, and lowest on 0-2 pitches.

Here's a side-by-side comparison of Baseball Mogul 2013 results with the MLB data from 2006-2010:

Count  Source          BABIP  AVG   OBP   SLG
3-0    Baseball Mogul  .349   .402  .943  .782
3-0    MLB             .343   .401  .946  .789
2-0    Baseball Mogul  .297   .335  .337  .613
2-0    MLB             .299   .343  .343  .622
1-0    Baseball Mogul  .304   .325  .327  .555
1-0    MLB             .310   .342  .342  .574
0-0    Baseball Mogul  .309   .333  .335  .536
0-0    MLB             .311   .341  .341  .555
3-1    Baseball Mogul  .305   .339  .666  .597
3-1    MLB             .307   .347  .681  .604
2-1    Baseball Mogul  .293   .320  .322  .538
2-1    MLB             .301   .332  .332  .551
1-1    Baseball Mogul  .300   .324  .326  .509
1-1    MLB             .301   .328  .328  .522
0-1    Baseball Mogul  .291   .311  .314  .472
0-1    MLB             .299   .321  .321  .487
3-2    Baseball Mogul  .303   .219  .467  .368
3-2    MLB             .303   .230  .470  .380
2-2    Baseball Mogul  .291   .186  .179  .297
2-2    MLB             .288   .195  .195  .308
1-2    Baseball Mogul  .274   .177  .160  .246
1-2    MLB             .282   .177  .177  .260
0-2    Baseball Mogul  .271   .163  .145  .218
0-2    MLB             .282   .167  .167  .243

As you can see, it's pretty darn accurate. The biggest difference is that the slugging numbers are lower across the board in the simulation, because we expect offense in 2012 to continue to trend lower than it was from 2006 to 2010.
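The "lower across the board" claim is easy to check against the table. The short script below uses only the SLG values listed above; the variable names are just for illustration.

```python
# (count, Baseball Mogul SLG, MLB 2006-2010 SLG), copied from the table above
SLG_BY_COUNT = [
    ("3-0", 0.782, 0.789), ("2-0", 0.613, 0.622), ("1-0", 0.555, 0.574),
    ("0-0", 0.536, 0.555), ("3-1", 0.597, 0.604), ("2-1", 0.538, 0.551),
    ("1-1", 0.509, 0.522), ("0-1", 0.472, 0.487), ("3-2", 0.368, 0.380),
    ("2-2", 0.297, 0.308), ("1-2", 0.246, 0.260), ("0-2", 0.218, 0.243),
]

# Gap = MLB minus simulation, so a positive gap means the simulation is lower.
gaps = {count: round(mlb - sim, 3) for count, sim, mlb in SLG_BY_COUNT}
mean_gap = sum(gaps.values()) / len(gaps)
widest = max(gaps, key=gaps.get)

print("simulation lower on every count:", all(g > 0 for g in gaps.values()))
print("mean SLG gap:", round(mean_gap, 4))
print("widest gap:", widest, gaps[widest])
```

The simulation's SLG is lower on all twelve counts, by about 14 points on average, with the widest gap (25 points) on 0-2.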

4 comments:

Jesse said...

Awesome... glad to see you're making some big changes here!

Anonymous said...

Great. This game is half way there. The REALISTIC flight of a ball that's been hit is critical as well. The HomeRun is the most exciting play in baseball..... yet no game duplicates accurate flight of the ball accurately 'clearing the fence'---or a triple or banging off the wall. Meh, maybe it's just me.

Anonymous said...

I disagree with the above comment. I feel that in game mode a homerun is almost like in real life "that special sound" but instead, the game has the "magic speed". What I mean is that in the game simulation mode, I know immediately which hit is a homerun. They are "No-doubters". Just the same I do like how I still lean to the edge of my seat when a player hits a moon shot 397 ft. to the warning track, knowing that it will not be out of the park, but still just hoping. HA I love baseball!

Alex said...

Awesome you said Baseball Games are played on television. Not played in stadiums or ballparks lol.