Understanding randomness is a double-edged sword

A review of "The Drunkard's Walk: How Randomness Rules Our Lives."

Leonard Mlodinow’s “The Drunkard’s Walk: How Randomness Rules Our Lives” is a great book on an important subject. As data scientists know, random phenomena are everywhere, and humans don’t understand them well. We’re not wired to understand them well. This book is a huge help, and will be a relief to anyone who’s heard people say “I don’t believe in global warming because last winter we got a lot of snow,” or some load of crap like that. The book is well written, there’s a lot of storytelling, and the storytelling is fun and interesting. Along the way Mlodinow gives coherent explanations of Bayes’ theorem, the Monty Hall problem (offering the simplest correct explanation I’ve ever seen), the origins of statistics and more. If you want an excellent non-mathematical introduction to probabilistic thinking, this is the book to get. (If you want the mathematics, look elsewhere: this book studiously avoids equations. Get William Feller’s “An Introduction to Probability Theory and Its Applications” for the deeper material.)

But there’s always a but. But, but, but …

I have two problems with “The Drunkard’s Walk.” They’ve been nagging me ever since I finished.

First, Mlodinow spends a lot of time debunking the notion of “hot streaks.” He’s right, and that’s important: most hot streaks in sports and elsewhere can be adequately explained by randomness. Randomness is inherently streaky and clumpy; it’s not just a smooth gray. In fact, if you get something that looks smooth and “random,” it’s almost certainly not random. So far, so good. But — when he moves from Roger Maris’ record-breaking season to portfolio managers picking hot stocks, there’s a fundamental asymmetry.
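The claim that randomness is streaky is easy to check for yourself. What follows is a minimal sketch, not anything taken from the book: it flips a simulated fair coin 100 times, records the longest streak of identical outcomes, and repeats that many times. The function names, the 100-flip sequence length, the trial count, and the fixed seed are all my own assumptions, chosen only for illustration.

```python
import random


def longest_run(flips):
    """Return the length of the longest run of identical consecutive outcomes."""
    best = 0
    current = 0
    previous = None
    for flip in flips:
        # Extend the current streak if the outcome repeats, else start a new one.
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best


def simulate_longest_runs(n_flips=100, n_trials=2000, seed=0):
    """Longest streak (heads or tails) in each of n_trials sequences of fair flips."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the sketch is repeatable
    return [
        longest_run([rng.random() < 0.5 for _ in range(n_flips)])
        for _ in range(n_trials)
    ]


if __name__ == "__main__":
    runs = simulate_longest_runs()
    print("mean longest streak:", sum(runs) / len(runs))
    print("share of trials with a streak of 5+:", sum(r >= 5 for r in runs) / len(runs))
```

In runs of this kind, a streak of five or more identical flips shows up in the large majority of 100-flip sequences. A sequence with no such streak, the kind that "looks random" to most people, is the unusual one.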

With Maris, the author starts with the long-term batting average. We’re not just “flipping coins”; we’re flipping a weighted coin, a coin that happens to land with the “home run” side facing up a lot more frequently than it would if I were in the batter’s box. That’s all well and good. If I faced a season’s worth of professional baseball pitching, I daresay I wouldn’t get a single hit, let alone any home runs. But, and this is important, he doesn’t do the same for the stock pickers, book acquisition editors, or Hollywood movie execs that he talks about. For them, it’s just flipping coins. And it’s one thing to say that, if you just flipped coins for 10 years, you’d have a 75% chance of duplicating a great financial manager’s performance over some five-year period. It’s another thing to imply that the manager’s performance is just a matter of luck, not skill. Yes, there is a lot of luck involved, but where’s the notion of baseline performance, of long-term success or failure, that was the starting point for analyzing Maris’ hot year? Maris’ hot year may have been a random phenomenon, but it was a random phenomenon in the context of five years of hitting more than 20 home runs per season, during which his cumulative batting average was somewhere around .271. What’s the stock picker’s cumulative batting average? Who are the other financial analysts working at the same level? We never find out. And that’s a big part of the story to omit.

Second, Mlodinow frequently forgets one of the most important aspects of the mathematical study of random processes. When we’re talking probability and statistics, we’re talking about interchangeable events. It’s easy to forget this, but as Mlodinow himself points out, there are many, many ways to make important mistakes when you’re talking about probability. The important thing about urns with black and white balls is that the balls are the same. (If you don’t know about urns, take a probability course or read the book; they’re baked into the history of probability theory.) If some of the balls were ovals and some were star-shaped, these probability experiments wouldn’t work.

So, back again to the stock pickers, the acquisitions editors, and the Hollywood execs. We agree at some level that all at-bats in baseball are equivalent. This is, of course, an idealization, but it’s one we’re fairly comfortable with. But all stocks are not the same, all books are not the same, and all movies are not the same. They may be the same within a certain class (energy stocks, cheap romance novels, spy movies). A stock analyst who’s good with financials may have nothing to say about manufacturing. But at the high end of the spectrum (literary novels, fine wines, art movies), everything is unique, precisely in a way that Harlequin romances aren’t. Probability and statistics are still powerful tools, but you have to be very careful about how you apply them.

Since I’m in the publishing business, I’m particularly annoyed by the story of an editor who, in an experiment, was given a typewritten chapter of a V. S. Naipaul novel that had won a major award. She rejected it. I’m not a fan of Naipaul, so I’m sympathetic. But is that evidence of her editorial skill (or lack thereof), or of random processes? Since we’re now in a world where every event is unique, we have to ask more questions: What publisher was she working for? Grove Press, which publishes top-drawer literary fiction with a tendency toward the avant-garde (for whom Naipaul might have been too stodgy)? Or Bantam, which specializes in lightweight beach-side reading? In either case, a rejection would have been perfectly appropriate. Probability aside, it’s a cheap shot to say: “Because this book won a major award, we’d expect editors at a publishing company to accept it. If they don’t, that’s evidence that publishing is a random process.”

Publishing (and movies, and wines, and maybe even stocks) are a different world, and the disagreements are precisely what is important. Modeling disagreement as random fluctuation isn’t doing anyone a service. I may dislike Naipaul’s fiction, but I hardly see that as a random result. We could ask about the conditional probability that an English major will dislike Naipaul, given that the English major plays piano, has a strong background in electrical engineering and mathematics, and likes Salman Rushdie, and use that to come up with some sort of number. But I’d have no idea what that number means. We’re not picking black and white balls out of urns here — or if we are, the balls are of different shapes and sizes.

Am I just going back to the human tendency to build stories where there is nothing but randomness? Am I just refusing to deal with the stark realities of the random phenomena that surround us everywhere? Perhaps. Then again, that’s what makes us human. And in the many situations where probability and statistics aren’t appropriate tools, such as picking books or movies, all we have to fall back on is our ability to make stories, our ability to make sense. Where “make” is precisely the most important word in that last sentence.

Postscript

There’s an important, but subtle, distinction to be made between events that can be modelled by random processes and events that are actually random processes. Mlodinow makes this same distinction in his discussion of Maris. At one point, he grants that a hot or cold streak could be the result of changes in exercise, eating habits, personal stress, or any number of non-random factors; but since we can’t account for these, he rolls them up into randomness. So, essentially he’s saying that a hot streak can be modelled as a random process, though it may have an underlying cause that isn’t random at all. Say Maris signed a contract to appear on the front of Wheaties boxes, and decided that he might as well eat the stuff. And say that eating Wheaties actually did increase his slugging percentage significantly. If so, betting heavily on Maris during a hot streak might not be such a bad idea, since he’s not just hitting well because he happens to be lucky. And if so, I would still bet heavily that Maris’ record-breaking year could be modelled as a random process. After all, probability and statistics are very blunt instruments.

Rolling up potentially non-random factors that can’t be measured into “randomness” is a common trick, and reasonably acceptable. You can’t analyze what you don’t know. But it’s a trick that worries me. Let’s take a situation that I think is similar, but with much more profound consequences. A decade or so ago, it was well-known that Tamoxifen was a useful drug against breast cancer, effective in roughly 80% of all cases. That’s equivalent to saying that Tamoxifen has an .800 batting average. You could model Tamoxifen’s success by flipping a coin that came up heads 80% of the time.

But more recent research has revealed that Tamoxifen’s story isn’t random; at least, not random in that way. It’s successful almost 100% of the time on patients with certain genetics and almost 0% of the time on other patients. In other words, the randomness is in the stream of incoming patients, not the effects of the drug. That discovery has a huge practical effect on breast cancer treatment. You can do tests to figure out whether treating a patient with Tamoxifen is likely to be successful, or a waste of time. You can also look in a more focused way for treatments that will be effective on the remaining 20%. Even more important: It’s my belief that the next generation of medicine will be “personalized.” Rather than using drugs that have been successful in broad clinical trials involving thousands of patients, we’ll be focusing on drugs that are tuned to an individual’s genetic makeup. Is it possible that the drug that would be effective on the 20% of women who don’t respond to Tamoxifen has already been discovered and discarded, because its success rate wasn’t statistically significant? Is it possible that there’s a drug that’s 100% effective on only 5%? Or 1%? What methods will we use to evaluate the performance of these drugs?
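To make the distinction concrete, here is a rough simulation, offered as a sketch under stated assumptions rather than a model of any real clinical data. It compares two hypothetical worlds: a "weighted coin" world in which every patient has the same 80% chance of success, and a "genetic" world in which 80% of patients respond every time and the rest never do. I've idealized "almost 100%" and "almost 0%" to exactly 1.0 and 0.0, and the function names, patient count, and seeds are my own inventions.

```python
import random


def simulate_trial(n_patients, responder_share, p_responder, p_other, seed=0):
    """Return a list of (is_responder, treatment_succeeded) pairs.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_patients):
        # The randomness in the stream of incoming patients.
        is_responder = rng.random() < responder_share
        # The randomness (if any) in the drug's effect on this patient.
        p_success = p_responder if is_responder else p_other
        results.append((is_responder, rng.random() < p_success))
    return results


def success_rate(results, group=None):
    """Success rate overall (group=None) or within one genetic group (True/False)."""
    outcomes = [ok for is_responder, ok in results
                if group is None or is_responder == group]
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # "Weighted coin" world: genetics are irrelevant, everyone has an 80% chance.
    coin = simulate_trial(20000, 0.8, 0.8, 0.8)
    # "Genetic" world: idealized to certain success or certain failure.
    genetic = simulate_trial(20000, 0.8, 1.0, 0.0)
    for name, results in (("coin", coin), ("genetic", genetic)):
        print(name,
              "overall:", round(success_rate(results), 3),
              "responders:", round(success_rate(results, True), 3),
              "others:", round(success_rate(results, False), 3))
```

Both worlds produce an overall success rate near 80%, so the aggregate number alone can't tell them apart. Only splitting the results by genetic group reveals the difference. Setting `responder_share` to 0.05 in the genetic world gives an overall success rate of about 5%, which is what a drug that works perfectly on a small subgroup would look like in a broad trial.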

Understanding randomness is a double-edged sword. Humans are built to create patterns, even when there’s nothing going on but random phenomena. Granted, that’s an extremely important story, and Mlodinow does an excellent job of telling it. At the same time, we are wired to create stories, and can’t afford to let randomness stop us from doing so, particularly when a story that gives a richer understanding of the data is just beyond our grasp. Understanding what is random and what is not (or, more precisely stated, understanding which parts of any process are really random) is the key. While humans are all too willing to grasp at the straws of a story when there’s no story there (just go to any casino), we can also throw out the stories we haven’t yet finished because we’re convinced there’s nothing there. And that’s a tragedy.
