Showing posts with label Baseball. Show all posts

Tuesday, July 18, 2017

Baseball Mogul: Overall Ratings

As mentioned before, Baseball Mogul used to have the problem that player ratings weren’t well-defined. For example, an 85 Contact rating might put a batter among the top 25 players in the league in this category, while an 85 Power rating might not be enough to break the top 100.

There was also a problem of rating drift. When you start a new game in 2011, the MLB average for Contact is 78; ten years later, it has risen to 82.

These problems were fixed in Baseball Mogul 2016 and further improved in Baseball Mogul 2017. Player ratings are now well-defined in terms of what each number means relative to the league. And ratings are constantly adjusted to ensure that these definitions remain meaningful over multiple seasons.

However, I’ve gotten some questions about how the ratings are actually defined. So, I pulled a couple of tables from my design documents to help clear this up. This one shows how Overall ratings are grouped into 10 separate categories, each with its own general definition:


This table shows a typical distribution of players in a 30-team league:


Players in Database:  The number of players that should fall within this range in a modern database with 30 teams.

Roster Spot(s): The roster spots that players of this caliber generally fill on a major league roster. For example, players rated 77-78 will occupy roster spots #16 to #20 on an average team; they aren’t good for the starting lineup or pitching rotation (a total of 13-14 players) but they have value off the bench and out of the bullpen.






Wednesday, July 29, 2015

Offense vs. Defense

Dave Cameron at FanGraphs writes about the Blue Jays adding Tulo to a lineup that already scores the most runs: "There are no diminishing returns to scoring more runs; there is no point on offense to where the marginal value of a run scored is worth less than preventing a run from being allowed on defense."

This is an excellent article with lots of research, so I hate to nitpick. But the above statement isn't completely true. For any team that scores more runs than it allows (which includes pretty much every team that makes the playoffs), preventing a run is more valuable than scoring an additional run.

It's because of the Pythagorean Theorem of Baseball, which states that the ratio of a team's wins to losses corresponds to the ratio of their runs scored to runs allowed (actually, to the SQUARE of these numbers, but that's not essential to this analysis).

Take a team that's on pace to score 600 runs and allow 550 runs. Their ratio of wins to losses should be (600 x 600) : (550 x 550) or 3600:3025. Or, to put it another way, their winning percentage should be:

(600 x 600) / ((600 x 600) + (550 x 550)) = .543
(that's a record of 88-74 over a 162-game season)

Let's imagine they have a choice: add a hitter who will give them 50 extra runs, or add a pitcher who will prevent 50 runs.

After adding the hitter, they score 650 runs but still allow 550. Their new projected winning percentage is:

(650 x 650) / ((650 x 650) + (550 x 550)) = .583 (94 wins)

After adding the pitcher, they still score 600 runs but now they allow only 500. Their new projected winning percentage is:

(600 x 600) / ((600 x 600) + (500 x 500)) = .590 (96 wins).

That's a difference of two wins. This may not sound like a lot, but the difference between making the playoffs and going home has averaged just 1.5 games in the American League over the last 4 seasons.

The interesting thing about this fact is that it doesn't matter if you are a great offensive team or a mediocre one. As long as you are a good team, one that scores more runs than it allows, it's always more valuable to prevent a run than to score a run.

And ... for fans of the Red Sox or Phillies ... the reverse is also true. If you are allowing more runs than you are scoring, you will improve your team more by adding offense than by adding the equivalent amount of defense.
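The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the squared-runs version of the formula used in this post; the function names are mine, and the losing team at the end (500 runs scored, 600 allowed) is a made-up example, not a real club.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162):
    """Expected wins over a season, rounded to the nearest whole game."""
    return round(games * pythagorean_win_pct(runs_scored, runs_allowed))


# The winning team from the post: 600 runs scored, 550 allowed.
baseline = pythagorean_win_pct(600, 550)
add_hitter = pythagorean_win_pct(650, 550)   # +50 runs scored
add_pitcher = pythagorean_win_pct(600, 500)  # -50 runs allowed

# Hypothetical losing team (assumed numbers): 500 scored, 600 allowed.
losing_add_hitter = pythagorean_win_pct(550, 600)
losing_add_pitcher = pythagorean_win_pct(500, 550)

print(f"baseline:    {baseline:.3f} ({projected_wins(600, 550)} wins)")
print(f"add hitter:  {add_hitter:.3f} ({projected_wins(650, 550)} wins)")
print(f"add pitcher: {add_pitcher:.3f} ({projected_wins(600, 500)} wins)")
print(f"losing team, add hitter:  {losing_add_hitter:.3f}")
print(f"losing team, add pitcher: {losing_add_pitcher:.3f}")
```

For the winning team the pitcher is worth more; for the made-up losing team the hitter is.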

Thursday, October 2, 2014

"Failed Fielder's Choice"

Looking for some feedback on official scoring. Imagine the following:

Play #1:
No outs. Lorenzo Cain on 1B. Eric Hosmer batting.
Jed Lowrie fields a ground ball in the hole and appears to have enough time to get Hosmer out at 1B.
Instead, Lowrie throws to 2B to successfully force out the lead runner.

The above is a "fielder's choice". Section 10.00 of the MLB rules is pretty clear about how the above situation is scored.

Play #2:
No outs. Lorenzo Cain on 1B. Eric Hosmer batting.
Jed Lowrie fields a ground ball in the hole and appears to have enough time to get Hosmer out at 1B.
Instead, Lowrie throws to 2B in an attempt to keep the runner out of scoring position. But the throw is late and everybody is safe.

I often refer to this as a "failed fielder's choice" to avoid confusion with the result of Play #1.

Play #2 is also a fielder's choice because Section 2.00 states: "FIELDER'S CHOICE is the act of a fielder who handles a fair grounder and, instead of throwing to first base to put out the batter-runner, throws to another base in an attempt to put out a preceding runner."

It also can't be recorded as a hit. Rule 10.05(b)(4): "The official scorer shall not credit a base hit when a ... fielder fails in an attempt to put out a preceding runner and, in the scorer's judgment, the batter-runner could have been put out at first base"

Therefore, I believe the following to be true:

1) The shortstop is NOT charged with an error.
2) The batter is credited with an at-bat.
3) The batter is NOT credited with a hit.
4) The pitcher is credited with a batter faced (and an "opponent at bat", such as for calculating "opponent batting average").
5) The pitcher is not credited with a "hit allowed".
6) The pitcher IS credited with a "ground ball out" (as used in the calculation of "GO/AO").

(I realize some of these aren't official stats, but I'm hoping to find some agreement about non-official stats.)

However, imagine the following:

Play #3:
No outs. Bases empty. Jeff Samardzija walks Lorenzo Cain.
Fernando Abad relieves Samardzija.
Eric Hosmer batting.
Jed Lowrie fields a ground ball in the hole and appears to have enough time to get Hosmer out at 1B.
Lowrie throws to 2B in an attempt to keep the runner out of scoring position. But the throw is late and everybody is safe.
Billy Butler hits a 3-run homer.

My interpretation:
Cain is charged to Samardzija as an earned run.
Hosmer and Butler are charged to Abad as earned runs.
(If Cain had been successfully forced out, Hosmer would be charged to Samardzija.)

So ... Abad gets credit for a ground out, and then gets tagged with an earned run for the guy who "grounded out".

Is this correct?

Thanks!

Clay

Thursday, May 29, 2014

DICE: Defense-Independent Component ERA

I'm reposting an article from July 2000, because Baseball Mogul players keep asking me what 'DICE' stands for on the pitcher Scouting Report (and because the text in the original article is tiny and hard to read).

Defense Independent Component ERA

July 19, 2000

If you play Baseball Mogul, you have already encountered Defense Independent Component ERA ("DICE"), even though you don't realize it. This is because the artificial intelligence in Baseball Mogul uses DICE to evaluate pitching talent.


We also use it at Sports Mogul to create our annual player projections.

DICE starts with the concept of "Component ERA" invented by Bill James. The concept is pretty simple -- use the components of a pitcher's statistical performance (such as hits allowed and hit batters) to predict a pitcher's ERA. Because there is a strong correlation between these individual events and the pitcher's ERA, you can actually estimate a pitcher's ERA in a season by just looking at the components. In other words, you can predict earned runs allowed by looking at the individual events (such as walks and home runs) that led to the runs themselves.

ERA is a somewhat luck-based stat. One season is a relatively small sample size, and earned runs given up in one season may not be a true indicator of the pitcher's overall ability level. The pitcher might have given up several home runs with the bases loaded, causing his ERA to be higher than it would have been if the home runs had been distributed randomly throughout the season.

By deriving a value from hits, walks, hit batters and home runs, Component ERA attempts to be a better evaluator of a pitcher's true ability to prevent runs.

Here is James' formula for Component ERA (CERA):

CERA=(((H+BB+HBP)*(.89*(1.255*H+2.745*HR)+.56*(BB+HBP-IBB)))/(BFP*IP))*9-.56
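For anyone who would rather not punch that into a calculator, here is a direct Python transcription of the formula exactly as written above. The function name and the sample stat line are hypothetical (the numbers are invented, not a real pitcher's season), and IP is assumed to be a decimal value (so 200 1/3 innings is 200.333, not 200.1).

```python
def component_era(h, bb, hbp, hr, ibb, bfp, ip):
    """CERA as written in this article.

    h = hits, bb = walks, hbp = hit batters, hr = home runs,
    ibb = intentional walks, bfp = batters faced, ip = innings pitched.
    """
    baserunners = h + bb + hbp
    advancement = 0.89 * (1.255 * h + 2.745 * hr) + 0.56 * (bb + hbp - ibb)
    return (baserunners * advancement) / (bfp * ip) * 9 - 0.56


# Invented stat line, for illustration only.
sample_cera = component_era(h=200, bb=50, hbp=5, hr=20, ibb=3, bfp=850, ip=200.0)
print(f"CERA = {sample_cera:.2f}")
```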

But there are a few problems with CERA:

The biggest is that it includes hits. Hits aren't a great indicator of a pitcher's true pitching ability. With the exception of home runs, the number of hits allowed by any pitcher is largely affected by the quality of the defense behind him. This makes sense, but it also stands up to statistical analysis. A pitcher's Strikeout Ratio (strikeouts per 9 innings pitched) is relatively consistent from year to year. However, a pitcher's Hit-Out Ratio (ratio of hits to outs, after removing strikeouts and home runs) doesn't have the same consistency.

The second problem I have with CERA is that it's tough to calculate. Although they aren't perfect, I like measures such as Slugging Percentage and Total Average with formulae that are pretty easy to remember.

So, I created a slightly different form of Component ERA called "Defense-Independent Component ERA" (or DICE) that uses the variables in Component ERA but removes hits. It keeps home runs, because these are almost never affected by defense.

At first, it looked something like this:

DICE = x + (y*(BB + HBP) + z*HR) / IP

Using all active pitchers in 1999 with 500 or more career Innings Pitched, I performed a regression on the above function to determine the constants x, y and z such that DICE best predicted their career average ERA. (There were 229 pitchers in this data set).

But after some experimenting, I noticed that ERAs were also strongly correlated with strikeouts, even when the other stats (walks, hit batters, and home runs) were already taken into account. As strikeouts are also defense-independent, it makes sense to add them to the formula. This is somewhat counter-intuitive. After all, a ground out can be just as good as a strikeout to end an inning. But the regression doesn't lie -- strikeouts are more effective than other types of outs at reducing earned runs. Or more accurately, strikeout numbers are useful in predicting a pitcher's ERA.

So I added strikeouts to the formula and performed another regression to determine the correct coefficients to use in the formula. Finally, I found the integer coefficients that best matched the data (because integers make the math easier than that required for CERA):

DICE = 3 + (3*(BB + HBP) + 13*HR - 2*K) / IP

(The Mean Squared Error for this formula, across all 229 pitchers, is .100697. The Square Root of the Mean Squared Error is about .317 -- meaning that about 2/3 of all actual ERA values should fall within .317 runs of a pitcher's DICE value.)

So there you have it:
1. Start with a value of 3 times the number of walks and hit batters
2. Add 13 for every home run allowed
3. Subtract 2 for every strikeout
4. Divide this total by the number of innings pitched
5. Finally, add this result to 3.00 to get the pitcher's Defense-Independent Component ERA (aka DICE).
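The five steps above fit in a one-line function. Here is a minimal Python sketch; the function name and the sample stat line are made up for illustration, and innings pitched is assumed to be a decimal number (thirds of an inning as .333 and .667).

```python
def dice(walks, hit_batters, home_runs, strikeouts, innings_pitched):
    """Defense-Independent Component ERA, using the integer coefficients above."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    # Steps 1-3: weight the defense-independent events.
    total = 3 * (walks + hit_batters) + 13 * home_runs - 2 * strikeouts
    # Steps 4-5: scale by innings pitched and add the constant 3.00.
    return 3.0 + total / innings_pitched


# Invented stat line: 50 BB, 5 HBP, 20 HR, 180 K in 200 IP.
sample_dice = dice(walks=50, hit_batters=5, home_runs=20,
                   strikeouts=180, innings_pitched=200.0)
print(f"DICE = {sample_dice:.2f}")
```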

Here's an example using Roger Clemens' 1997 season (one of his Cy Young Award seasons):

DICE = 3.00 + (3 * (68 BB + 12 HB) + 13 * 9 HR - 2 * 292 K) / 264 IP = 2.14
Roger's actual ERA in 1997 was 2.05

Anyway, I first developed this stat to help me predict how a pitcher would perform in my rotisserie league. DICE is a better predictor of a pitcher's ERA in the upcoming year than any other stat I could find (such as his previous year's actual ERA). Using these predictions, I was able to win the league for 4 years out of 6 (and I'm currently in 1st place in year 7). And of course DICE is one of many tools we use inside the Baseball Mogul game engine.

Thursday, April 17, 2014

Creating an Expansion Team (Baseball Mogul)

This year's version of Baseball Mogul has a new feature: the ability to create a new expansion team and build it from players on current teams, using the current MLB rules for conducting expansion drafts.


When you choose "Expansion" on the New Game Dialog (above), there is a new button in the lower right with a question mark on it:


Clicking this button takes you to the Create Expansion Team dialog, where you select the city for your expansion team, and specify a team name, stadium name and division.

Note: The team name automatically includes your city name. For example, the above team will be called the "Las Vegas Dealers". You can use the League Editor to change this later (such as to the "Nevada Dealers").
After you click "Play", Baseball Mogul will automatically create a 2nd expansion team to ensure that there are an even number of teams. The computer analyzes the current city data and picks a city at random from among the best options.

Is it fair to assume that the "Oregon Cavemen" play in "GEICO Park"?
Baseball Mogul then hands you control of your team, at the beginning of the expansion draft.


Each of the existing teams is allowed to protect 15 players in the first round of the expansion draft, and 3 additional players in each additional round (and players drafted in the last 3 seasons, such as Manny Machado, do not need to be protected). For example, in the following list of third basemen, we see that the Yankees left A-Rod unprotected, because they would love an expansion team to take his contract off their hands:


At any point, you can use the Play Menu to let the computer complete the expansion draft for you. And, after the expansion draft is complete, you can still grab unsigned players to fill out your major-league and minor-league rosters.

Thursday, February 13, 2014

Baseball Mogul 2015 Pre-Order



We are now accepting pre-orders for Baseball Mogul 2015, which will go on sale on April 11th.

This pre-order option is limited to only 20 customers, and can be locked down by pledging $25 at the Kickstarter campaign for Masters of the Gridiron.

I admit that it's weird to use Kickstarter for Baseball Mogul, a product that is now in its 17th version. We have always wanted to allow pre-orders of Baseball Mogul through our normal ordering system, but we aren't allowed to process a payment unless we immediately ship the product.

So, here we are, selling Baseball Mogul 2015 on Kickstarter at a whopping 28% discount, and you won't even be billed until after the Kickstarter campaign ends.

(Note also that we have added an option to pre-order both Baseball Mogul 2015 and Masters of the Gridiron for $47, including free shipping inside the United States.)

Tuesday, March 5, 2013

Baseball Mogul 2014: Minor League Park Factors

Park factors used for calculating 2013 Major League Equivalencies (MLEs) for minor league and major league players included in Baseball Mogul 2014. (.pdf version)

Five-year park factors generated using 2008-2012 minor league player data.

California League (Class A Advanced)

Team                      State  Farm System  H     2B    3B    HR    BB    K     R
Bakersfield Blaze         CA     CIN          0.98  1.00  0.93  1.05  1.03  0.99  0.98
High Desert Mavericks     CA     SEA          1.05  1.03  0.85  1.23  1.03  0.94  1.13
Inland Empire 66ers       CA     LAA          0.98  0.96  1.15  0.79  0.99  1.03  0.93
Lake Elsinore Storm       CA     SD           0.96  1.01  1.13  0.84  1.02  0.97  0.95
Lancaster JetHawks        CA     HOU          1.08  1.04  0.91  1.21  1.01  0.96  1.14
Modesto Nuts              CA     COL          1.00  1.06  1.26  0.80  0.99  0.99  0.97
Rancho Cucamonga Quakes   CA     LAD          0.99  0.99  1.02  0.96  0.97  0.99  0.96
San Jose Giants           CA     SF           0.94  0.97  1.04  0.87  0.97  1.08  0.89
Stockton Ports            CA     OAK          0.98  0.95  0.71  1.22  1.02  1.04  1.00
Visalia Rawhide           CA     ARI          1.03  1.04  0.88  1.19  1.01  0.98  1.06

Thursday, February 28, 2013

Real Minor League Stats (Baseball Mogul)

For Baseball Mogul 2014, we have added over 1 million lines of minor league batting, pitching, fielding and catching data, going back to the 1880s.

The above screen shot shows Jurickson Profar, the Rangers' #1 prospect. The Baseball Mogul AI thinks he should be in their starting lineup on Opening Day.

Sunday, February 24, 2013

Baseball Mogul 2014: Fixing The Game (Part 2 of 2)

As mentioned last week, we have been reworking the General Manager AI so that computer-controlled teams are much more intelligent when it comes to trades and roster management. This also means that the computer-controlled teams won't have to "cheat" on high difficulty levels just to provide a challenge. But the other weakness that gets abused is the player rating system.

Player Ratings


One solution to the problem of accurate player ratings is to make them less accurate. This works, but leads to some really dumb results, like your scouts telling you that Ubaldo Jimenez has a "95" Control rating.

The other option is to simply turn ratings off. Baseball Mogul lets you do this, and it's actually a good option in my opinion. After all, the game is much more realistic when you turn off the ability to see player ratings. Nobody in Major League Baseball has a crystal ball regarding player abilities. You can't just give your scouts more money and magically gain access to the "true" ability level of every player in baseball. If a GM could view the Strat-O-Matic cards for every single player in his organization, the job of talent assessment would be pretty boring.

Strat-O-Matic Cards

Thursday, February 21, 2013

Baseball Mogul 2014: Team Colors


Just posting a screenshot of the new Uniform Designer for Baseball Mogul 2014 (Release Date: March 22nd, 2013).

As you can see, you can specify the color of various parts of the home and away uniform. Each of those buttons is a "color picker" that lets you choose any color in the spectrum. You can even assign pinstripe colors (or choose no pinstripes).

Each team has 4 alternate uniforms, and the option to choose which days of the week (if any) they are used. Mogul currently loads the 2013 uniforms (and any official alternate uniforms) automatically. A database of uniforms for the last 110 years will be included with the final version.

These uniforms are used for all the batter and pitcher animations, and combined with the correct skin color for the player in question.

Saturday, February 9, 2013

Baseball Mogul 2014: New Uniforms

A very cool feature this year is new animations for the batter and pitcher. I realize that Baseball Mogul isn't about cutting-edge graphics. But it was pretty annoying that every single batter and pitcher was a white guy with bad posture and a dark blue helmet.

Old Animations
In case you were wondering, the old batter animation (and pitcher animation) was me. Filmed in my backyard, with a giant green tarp used as a green screen. There was snow on the ground.

So, this year features a new animation system, with the ability to change uniform colors and skin colors.

Examples of New Animations
The cool thing is that this feature is being added by an artist and another programmer, with very little effort on my part. So I get to keep working on the meat of the game.

Friday, February 1, 2013

The Skin Color Project

Over the years, I have heard baseball researchers ask if there was a "race database" somewhere.

For example, the National League was better than the American League throughout the 1960s and 1970s. From 1960 to 1982, the National League won 23 All-Star Games. The American League only won 2.

I would surmise that the National League dominated because they integrated more quickly, adding the best black players to their teams while a number of American League teams (most notably, the Red Sox) continued to stay all-white.

But I haven't seen data supporting that theory. That's because we have stats going back to the 1870s, but we don't have a database of racial/ethnic categorization. One reason is that every time it comes up, it becomes clear that "race" is a cultural phenomenon, not a scientific one. For example, President Obama is (at least) half-white and yet we call him the first "black" president.

So, forget "race". I'm working on a skin color database, ranking everyone from 1-9 like so:


Tuesday, November 27, 2012

Marvin Miller Dies at 95

Marvin Miller (1917-2012)

Marvin Miller, the father of free agency in baseball, died today at the age of 95. For those unfamiliar with baseball's labor history, I'm including below the history of baseball labor relations that we published 7 years ago (with some updates as part of our efforts to provide more realistic historical labor relations in Baseball Mogul 2014).

Friday, September 14, 2012

The Worst Front Office Move Ever?

Babe Ruth (1918)
It's not hard for most people to come up with a list of "dumb front office moves" in the history of baseball. Here are 10 off the top of my head:
  • #10. Philadelphia Phillies trade Larry Bowa and Ryne Sandberg to the Chicago Cubs for Ivan DeJesus. (1982)
  • #9. The Cincinnati Reds trade Frank Robinson to the Baltimore Orioles for Milt Pappas. (1965)