Baseball Simulations
How likely are the world records we hold dear? Should they have happened? Should they have been set by the people who set them? There's a New York Times opinion piece written by some researchers who examined Joe DiMaggio's 56-game hitting streak to determine how likely it was to happen again. Turns out it's very likely.
In the 10,000 simulations the researchers ran on the entire history of baseball:
More than half the time, or in 5,295 baseball universes, the record for the longest hitting streak exceeded 53 games. Two-thirds of the time, the best streak was between 50 and 64 games.
In other words, streaks of 56 games or longer are not at all an unusual occurrence. Forty-two percent of the simulated baseball histories have a streak of DiMaggio’s length or longer. You shouldn’t be too surprised that someone, at some time in the history of the game, accomplished what DiMaggio did.
The real surprise is when the record was set. Our analysis reveals that 1941 was one of the least likely seasons for such an epic streak to occur.
In the rest of the article they discuss the other players who, based on the simulations, were more likely to have put together such a streak.
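To get a feel for how this kind of simulation works, here is a minimal Monte Carlo sketch in Python. It is not the researchers' model: they worked from the full historical record, while this toy version follows one hypothetical player through one season at a time. It assumes every at-bat is independent, a fixed number of at-bats per game, and a fixed batting average. The function names and default values are made up for illustration.

```python
import random


def game_hit_probability(batting_avg, at_bats_per_game=4):
    """Chance of at least one hit in a game, assuming independent at-bats."""
    return 1.0 - (1.0 - batting_avg) ** at_bats_per_game


def longest_streak(p_hit, games, rng):
    """Longest run of consecutive games with a hit in one simulated season."""
    best = current = 0
    for _ in range(games):
        if rng.random() < p_hit:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def streak_frequency(batting_avg, target, games=154, trials=10_000, seed=0):
    """Fraction of simulated seasons whose longest streak reaches `target`."""
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    p_hit = game_hit_probability(batting_avg)
    reached = sum(
        longest_streak(p_hit, games, rng) >= target for _ in range(trials)
    )
    return reached / trials
```

Calling `streak_frequency(0.350, 56)` estimates how often a hypothetical .350 hitter strings together 56 games in a single season. For any one player-season that number is tiny; the article's point is that across every player and every season in baseball history, a streak that long becomes likely to show up somewhere.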
Comments: 6
[03.31.08 02:41 PM]
Perhaps we celebrate DiMaggio for a different reason.
Statistics capture some of the dynamics of the situation but they do not tell the story. Ok, some record was inevitable but.... why was it Joe?
Perhaps he knew a thing or two that helped him do that.
Nothing is accidental.
-t
[03.31.08 02:53 PM]
A must-read on the subject is Part III of Stephan Jay Gould's Full House.
[03.31.08 03:02 PM]
Sorry: STEPHEN Jay Gould.
[03.31.08 03:38 PM]
If it's so common, why hasn't it happened in over 50 years????
[04.01.08 10:51 PM]
I can't imagine that a simple statistical model would be all that helpful; certainly there're a lot of other factors (didn't Yogi Berra say that baseball is 90% mental, and the other half is physical?), and I could guarantee, as a manager, that I could break any streak: just have the pitcher(s) walk the guy at every at bat.
[04.02.08 07:18 AM]
The problem with statistical simulation models is that they start by assuming that x = 1, then go about the whole algorithm to end up with the learning that in most cases, x = 1 or thereabouts. And I say this from professional experience in financial modelling.












