Think about learning Bayes using Python

An interview with Allen Downey, the author of Think Bayes

Allen Downey

When Mike first discussed Allen Downey’s Think Bayes book project with me, I remember nodding a lot. As the data editor, I spend a lot of time thinking about the different people within our Strata audience and how we can provide what I refer to as “bridge resources.” We need to understand the environments our users are most comfortable in and provide them with the appropriate bridges to learn a new technique, language, tool, or…even math. I’ve also been very clear that almost everyone will need to improve their math skills should they decide to pursue a career in data science. So when Mike mentioned that Allen’s approach was to teach math not using math…but using Python, I immediately indicated my support for the project. Once the book was written, I contacted Allen about an interview, and he graciously took some time away from the start of the semester to answer a few questions about his approach, teaching, and writing.

How did the “Think” series come about? What led you to start the series?

Allen Downey: A lot of it comes from my experience teaching at Olin College. All of our students take a basic programming class in the first semester, and I discovered that I could use their programming skills as a pedagogic wedge. What I mean is if you know how to program, you can use that skill to learn everything else.

I started with Think Stats because statistics is an area that has really suffered from the mathematical approach. At a lot of colleges, students take a mathematical statistics class that really doesn’t prepare them to work with real data. By taking a computational approach I was able to explain things more clearly (at least I think so). And more importantly, the computational approach lets students dive in and work with real data right away.

At this point there are four books in the series and I’m working on the fifth. Think Python covers Python programming–it’s the prerequisite for all the other books. But once you’ve got basic Python skills, you can read the others in any order.

How does your most recent book, “Think Bayes”, fit into the series?


Image Provided Courtesy of Olin College

Allen Downey: While I was working on Think Stats, I realized that there was an opportunity to present Bayesian statistics clearly and simply by using Python instead of the usual mathematics.

I started writing about Bayesian statistics in my blog, and the response was huge. My most popular article, called “All your Bayes are belong to us,” has more than 25,000 page views.

Then in 2012 I taught a tutorial at PyCon called “Bayesian Statistics Made Simple.” It sold out very quickly, and it sold out again in 2013, so I knew there was a lot of interest.

By the time I started writing Think Bayes, I had been writing and teaching about Bayesian statistics for a few years. As soon as I wrote a chapter, I made it available on the web, so I got a lot of feedback from readers. I remember there was an example in Chapter 1, the “Girl Named Florida problem,” that no one liked. Half of the readers found it confusing; the other half thought I was wrong. I realized that it wasn’t necessary for the point I was trying to make, so I just took it out.

I think this process, which is basically the “release early, release often” process from open source software, works very well for books. Think Bayes is a better book because of it.
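To give a sense of what presenting Bayesian statistics in Python rather than in mathematical notation can look like, here is a minimal sketch. It is an illustration written for this article, not code from Think Bayes; the function name `bayes_update` and the two-bowl cookie scenario with its counts are assumptions chosen for the example.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses.

    prior:      dict mapping hypothesis -> prior probability
    likelihood: dict mapping hypothesis -> P(data | hypothesis)
    """
    # Multiply each prior by the likelihood of the observed data.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    # Divide by the total so the posterior probabilities sum to 1.
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical setup: two bowls of cookies, each equally likely to be chosen.
prior = {"bowl_1": Fraction(1, 2), "bowl_2": Fraction(1, 2)}

# Assumed contents: bowl 1 holds 30 vanilla cookies out of 40,
# bowl 2 holds 20 vanilla cookies out of 40.
likelihood_vanilla = {"bowl_1": Fraction(30, 40), "bowl_2": Fraction(20, 40)}

# We draw one vanilla cookie; which bowl did it probably come from?
posterior = bayes_update(prior, likelihood_vanilla)
print(posterior["bowl_1"])  # 3/5
```

A reader can change the counts, rerun the code, and pass `posterior` back in as the next `prior` to handle a second observation.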

Do you use the exercises from “Think Bayes” in your courses at Olin College? How have students connected with the material?

Image Provided Courtesy of Olin College

Allen Downey: Yes, whenever I am working on a book, I get students involved. One of the great things about Olin College is that we are supposed to innovate, so any time I propose a crazy new class, the college says “yes,” and students are willing to sign up for it. Most of my colleagues at other schools don’t have those kinds of opportunities.

This past semester (Spring 2013) I taught a class based on a draft of Think Bayes. I had 10 students who read the book, worked on exercises, and then developed their own case studies. There was a project on tracking whales, another on Bayesian analysis of poker, and one on estimating the size of a fire. I ended up using one of the case studies, predicting the arrival time of a subway train, as a chapter in the book.

This semester I am working on a new book about Digital Signal Processing, and I am recruiting a group of students to work on it with me.

As you teach, write books, and speak at events such as PyCon and Strata, do you have any advice for those who are looking to pursue a similar career path?

Allen Downey: I think the most important thing is to know who you are writing for. As early as possible I try to connect with an audience and get feedback. When I start a new book, I post chapters in my blog, I get students involved, and I give talks. The Boston Python User Group has been a great resource for me. I go to project nights there, I’ve done some talks, and when I was preparing my PyCon Tutorial, I did a trial run for them.

The other thing I think about all the time is how students can apply new knowledge: what they can do with it. Sometimes teachers focus too much on covering material and not enough on what the students can do. I like to start with an application and work backward. Maybe I have an experiment I want the students to run or a problem they should solve. I ask myself, “What’s the minimum knowledge a student needs to solve this problem, and what’s the sequence of steps that presents it most clearly?”

Again, the computational approach makes it easy because if I present a new function or a new class, students can run the code themselves, modify it, experiment with it–and then they can build on it. The solution to each problem becomes a building block for the solution to the next problem.

My goal is that readers should be able to apply what they learn in new contexts to solve real-world problems.

Editor’s Note: This interview has been edited.

