Free Downloads vs. Sales: A Publishing Case Study

As part of our continued effort to understand the impact on book sales of the availability of free downloads, I wanted to share some data on downloads versus sales of the book Asterisk: The Future of Telephony, by Leif Madsen, Jared Smith, and Jim Van Meggelen, which was released for free download under a Creative Commons license.

Jeremy McNamara, who operates one of the mirrors, provided us with download stats, which we were then able to compare with book sales. Our goal, of course, is to help publishers understand whether free downloads help or hurt sales. The quick answer from this experiment is that we saw no definitive correlation, but there is little sign that the free downloads hurt sales. More than 180,000 copies were downloaded from Jeremy’s mirror (which is one of five!), yet the book has still been quite successful, selling almost 19,000 copies in a year and a half. This is quite good for a technical book these days: the book comes in at #23 on our lifetime-to-date sales list for the “class of 2005” (books published in 2005) despite being released at the end of September. You might argue that the book would have done even better without the downloads, especially given the success of Asterisk and the importance of VoIP. But it’s also the case that the book is far and away the bestseller in the category, far outperforming books on the same subject from other publishers.

Meanwhile, we saw a huge spike in downloads starting at the beginning of this year, but didn’t see a corresponding drop in print book sales, other than the continued slow erosion that’s typical of books in print (especially one that’s heading towards a second edition). However, we did see the book’s first fall from grace back in March 2006, when it dropped from an average run rate of about a thousand copies a month to about six hundred. That drop comes at about the same time that our download data starts, but we’re not sure whether the timing means anything or is just a result of our not having earlier download data. We believe that the book was available online soon after publication, even though Jeremy didn’t start his mirror till March. (Next time we do a book available for free download, we’ll be careful to collect accurate data from the start of the project.)

In any case, this kind of sales drop is not inconsistent with the sales pattern of many other books. And for authors who want to reach the widest audience, it’s certainly possible that even if free downloads did shave a percentage from sales, the tradeoff is worth it (see Piracy is Progressive Taxation).

Here’s the graph comparing downloads to print book sales:


Because the scale of the free downloads is so much greater than the scale of the book sales, the data is plotted on a two-axis graph. The y-axis on the left shows the book sales numbers, while the y-axis on the right shows the download numbers. As you can see, the book peaked in its sales shortly after its release and then started on a gradual downward trend. Unfortunately, we don’t have download data from the very start of the project, only from when Jeremy’s mirror started seven months later. But what’s most striking (apart from the huge scale mismatch in the number of people accessing the content through the free online version) is that when the downloads spiked in January of this year from about 8,000 a month to nearly 30,000, after the book’s free availability was noted on Digg, we didn’t see a correspondingly sharp decline in sales. Of course, neither did we see any evidence that free availability of the book spurred sales. And as noted above, there is a sharp drop at about the time the download data starts that is likely unrelated to the downloads, even though we can’t entirely rule out the possibility that downloads had some effect.

Keep reading for a few more details, plus graphs showing the relationship between book sell-in and sell-through, and the sales pattern for a comparable non-free book.

A few notes on the data:

  • Book sales data is taken from Bookscan. Bookscan reports data weekly, and as you all know, months don’t end on neat weekly boundaries, while Jeremy’s download data is on a monthly calendar. I’ve chosen the closest week end, so some months have five weeks, while others have four — that is one reason why the book data spikes up and down.
  • Bookscan claims to report about 70% of US book sales. We estimate that this represents about 50% of worldwide English-language sales. As a result, I doubled the reported Bookscan numbers for purposes of this graph. The result is consistent with the inventory data. We’ve sold in about 19,000 copies, net of returns. Doubling Bookscan sell-through yields about 17,000 sold through, which would suggest that there are a few thousand left in inventory in bookstores. (Our actual data shows fewer than a thousand, so the right multiplier might be 2.1, but it’s close enough.)
  • In the note above, I referred to sell-in and sell-through. A reminder: Bookstores typically stock up when a book is first released, and then sell down that inventory over time. The publisher’s sales to the bookstores are “sell-in”; the numbers reported by Bookscan are the bookstores’ sales to end customers, or “sell-through”. Here’s a graph comparing sell-in of the book to its sell-through:

As you can see, the initial sell-in was around 5000 copies. Once the initial sales pattern was established, bookstores ordered just enough to keep up with demand. However, this can sometimes be a self-fulfilling prophecy. If there aren’t enough copies on the shelf, the book can’t be discovered by potential readers browsing the store. That may well be one reason for the sales decline: bookstores aren’t carrying many copies of this book for people to discover. It’s got a negligible return rate, yet there are very few copies in bookstores. We show only 200 for all of Barnes & Noble’s hundreds of stores, and 400 for all Borders stores. (Amazon is pretty much just-in-time, and doesn’t need to carry much inventory.)
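
For readers who want to check the arithmetic behind the data notes above, here is a rough sketch in Python. It is not the spreadsheet we actually used. It covers two things: why lining weekly reports up against calendar months gives some months five weeks and others four, and how the 2x Bookscan multiplier compares with what the inventory figures imply. Several inputs are assumptions made for illustration: that reporting weeks end on Sundays, that each week is assigned to the month containing its week-ending date, that 2006 is a fair example year, and that "fewer than a thousand" can be stood in for by its upper bound of 1,000. The raw sell-through figure is simply the doubled number divided by two, and the function names are made up for this example.

```python
from collections import Counter
from datetime import date, timedelta


def weeks_per_month(first_week_end: date, last_week_end: date) -> Counter:
    """Count weekly reports per calendar month, keyed by (year, month).

    Assumption: each weekly report is assigned to the month that
    contains its week-ending date.
    """
    counts = Counter()
    week_end = first_week_end
    while week_end <= last_week_end:
        counts[(week_end.year, week_end.month)] += 1
        week_end += timedelta(days=7)
    return counts


def calibrated_multiplier(net_sell_in: float, store_inventory: float,
                          raw_sell_through: float) -> float:
    """Multiplier at which scaled sell-through plus inventory equals sell-in."""
    return (net_sell_in - store_inventory) / raw_sell_through


# Figures taken from the notes above.
NET_SELL_IN = 19_000           # copies sold in, net of returns
DOUBLED_SELL_THROUGH = 17_000  # Bookscan sell-through after doubling
# Derived by halving the doubled figure; not a directly reported number.
RAW_SELL_THROUGH = DOUBLED_SELL_THROUGH / 2
# Assumption: upper bound standing in for "fewer than a thousand".
INVENTORY_UPPER_BOUND = 1_000

# Inventory implied if the 2x multiplier were exactly right.
implied_inventory = NET_SELL_IN - DOUBLED_SELL_THROUGH
multiplier = calibrated_multiplier(
    NET_SELL_IN, INVENTORY_UPPER_BOUND, RAW_SELL_THROUGH
)

# Assumption: weeks end on Sundays; 2006 is used only as an example year.
counts_2006 = weeks_per_month(date(2006, 1, 1), date(2006, 12, 31))

print(f"implied bookstore inventory at 2x: {implied_inventory}")
print(f"calibrated multiplier: {multiplier:.2f}")
print(f"distinct weeks-per-month values: {sorted(set(counts_2006.values()))}")
```

Under these assumptions, a five-week month collects about 25% more weekly reports than a four-week month, which is enough on its own to make a monthly sales line look jagged. Dividing each month's total by its week count would be one way to smooth that out before comparing against the monthly download numbers.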

Finally, I wanted to show the Bookscan trend graph for another O’Reilly book released about the same time, Understanding the Linux Kernel. You can see the same early spike in sales, and the same long, gradual decline. (I will admit that the decline in this book has been more gradual, and it’s achieved a bit more of a steady state than the Asterisk book.)


P.S. If this kind of information floats your boat, TOC is the place to be. We’re obviously very involved in the changes the internet is bringing to publishing, and are bringing together people who are driving those changes, rather than just waiting for them.