Gentlemen Prefer PDFs

One of the interesting outcomes of our Rough Cuts early access program is some great data on the strong preference of our customers for downloadable PDFs over print books. Based on a little less than 3 months of data, we see that of the customers who’ve bought Rough Cuts, 60% chose the PDF-only option; 36% chose the bundle of PDF plus print book, and only 4% chose to pre-order the print book only.


These numbers closely match those reported by the Pragmatic Programmers. Dave Thomas told me in email (and gave permission to post) that in the first quarter, more than 60% of their direct sales were PDFs rather than print books, another 35% were bundles of print and PDF, and 5% were print only. In other words, when given the choice, 95% of customers want the PDF!

This information is also consistent with anecdotal data from alpha geeks like our own Marc Hedlund, who wrote recently on an O’Reilly back channel list:

“I’m basically up to my nostrils in code for my new company, and I’ve noticed a very significant change in the way I’m learning about new technologies. There are three tools I’m using almost exclusively to learn (listed in the order I try them): (1) PDF copies of books on the topics I’m learning; (2) mailing lists, and their archives, on those topics; and (3) source code, either from a source repository or Google. (A distant fourth would be project wikis covering the same topics — the wikis I see are a total mess, out of date, and often seemingly were never right in the first place.) With the exception of system administration topics, where step-by-step instructions matter more to me than real understanding, and Head First/Head Rush books, I’m not using print books at all.

PDFs give me the following benefits which print books lack:

  • They are searchable. I don’t have to rely on the index put together by the publisher — and that’s good, because when I do fall back to the index, it’s not useful to me, no matter who publishes the book. Searching a PDF is a huge speed-up over finding something by TOC or index.
  • They are portable. I check all of my PDFs into source control, and without even trying I have them on all the machines where I develop, whether those machines are online at the moment or not. I don’t carry anything to or from work anymore — if it isn’t in svn, I don’t need it.
  • They are more timely, and often, I can get them in the same hour I find out about them. If the publisher is revising the PDF, either through a beta program or through a new release, I can often get a new copy of the book very quickly and sometimes for free. With downloads, I can get a cheaper copy of the book immediately, rather than paying Amazon a bunch for overnight delivery of a more expensive print book.”

The message is loud and clear. There are a number of important implications, however:

  1. As readers choose PDF only, the per-copy cost of printed books goes up, because unit costs rise sharply at low volumes. As a result, there will be a strong impetus for many mid-list books to be made available only as PDFs, because only top sellers will have the volume to justify printing. Either that, or print book prices will go up to make up for the increased costs.
  2. The results vary somewhat by the type of book. See my previous posting on What Job Does a Book Do? Consistent with what I wrote there, reference-oriented books have the highest percentage of PDF-only, and those that provide “fun” are still bought somewhat more in print. Ajax Design Patterns sold 67% PDF-only, while Flickr Hacks sold only 45% in PDF-only format.
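
The print-cost point in the list above comes down to simple arithmetic: a fixed setup cost is spread over fewer copies as print runs shrink. Here is a minimal sketch of that effect. The cost figures and the `cost_per_copy` helper are hypothetical illustrations, not actual printing costs.

```python
def cost_per_copy(setup_cost, variable_cost, copies):
    """Average cost of one printed copy for a print run of `copies`.

    setup_cost:    fixed cost paid once per print run (hypothetical)
    variable_cost: paper and binding cost per copy (hypothetical)
    """
    if copies <= 0:
        raise ValueError("copies must be positive")
    return setup_cost / copies + variable_cost


# Hypothetical figures, chosen only to show the shape of the curve.
SETUP_COST = 2000.0
VARIABLE_COST = 3.0

for copies in (5000, 1000, 500):
    per_copy = cost_per_copy(SETUP_COST, VARIABLE_COST, copies)
    print(f"{copies:>5} copies: {per_copy:.2f} per copy")
```

Under these assumed numbers, cutting the run by a factor of ten roughly doubles the cost of each copy, which is the pressure toward PDF-only mid-list titles or higher print prices.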

There is definitely a publishing opportunity here, and we’re trying to figure out how best to seize it. The old gardening advice to “grasp the nettle firmly” seems apposite. We will definitely be offering PDF downloads soon, and would love to hear from you about what features are most important to you, and whether you would tend to buy the PDF only, or whether you’d buy a book-PDF bundle.

P.S. Obviously, our Safari Books Online service offers many of the same benefits that Marc touts — searchability and immediate access in particular — but he’s right that it doesn’t offer disconnected operation. I should point out that it does offer an additional benefit that standalone PDFs don’t offer, namely the ability to search books you don’t already own.

I’ll also point out that one of the things that’s missing for PDFs is a well-developed distribution system. As I wrote back in 1995 in an essay entitled Publishing Models for Internet Commerce, and in 2000 in a talk entitled The Ecology of EBook Publishing, “distribution systems exist for the same reason that we have alveoli in our lungs. They create surface area…. there are two classes of customers. There are the people who already know that they want your product, who can come to you directly, and then there [are] the people who are going to encounter your product by chance.” Web search engines, plus specialized book search engines like Google Book Search and Amazon and maybe even iTunes will eventually offer that kind of serendipitous discovery for eBooks that bookstores provide today, but until there is a rich distribution ecology, downloadable eBooks of any flavor will not reach their full potential.

Part of what we’re building with Safari is a channel. We have resellers in libraries, universities, and corporate settings, as well as a strong base of direct subscribers. And that channel is becoming more and more significant. In the most recent 12 months, sales of O’Reilly books through Safari exceeded sales from Borders, making it our #3 reseller behind Amazon and Barnes & Noble, with about 5 times the revenue of our direct sales from Assuming that we had the same results as Pragmatic, adding PDFs to would more than double our direct sales, but that would still leave it far behind the level of sales that we get through the distribution channel that we’ve built with Safari.
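
One possible reading of the "more than double" estimate, sketched as arithmetic. It assumes (and this is only an assumption) that orders including a print book stay at today's level and that PDF-only orders are entirely incremental. It counts orders, not revenue, and uses the Pragmatic mix quoted earlier.

```python
# Pragmatic's reported order mix from the figures quoted above (percent of orders).
PDF_ONLY, BUNDLE, PRINT_ONLY = 60, 35, 5


def order_multiplier(pdf_only, bundle, print_only):
    """Total orders relative to orders that include a print book.

    Assumption (hypothetical): print-containing orders equal the current
    print-only volume, and PDF-only orders are purely additional.
    """
    print_orders = bundle + print_only
    if print_orders <= 0:
        raise ValueError("need at least some print-containing orders")
    return (pdf_only + bundle + print_only) / print_orders


print(order_multiplier(PDF_ONLY, BUNDLE, PRINT_ONLY))  # prints 2.5
```

If some PDF-only buyers would otherwise have bought the print book, the real multiplier would be lower than this sketch suggests.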

When we do offer PDFs, we will probably offer them both direct and through Safari. After all, the essence of my thinking on distribution is that more is better. (We’ve long made the same argument to bookstores about direct marketing by publishers. Lack of awareness is the biggest problem any book faces.) 51% of our Rough Cuts sales were to existing Safari subscribers, meaning that 49% came direct. However, I would love your thoughts on whether you’d prefer your PDFs through or Safari.

  • I tend to want the hardcopy *and* the PDF. I want something I know isn’t going to disappear next time my hard drive crashes, but I also want the convenience of being able to carry my library on my laptop (or even palmtop) when I’m travelling.

  • Blaine

    I may be the minority, but I’d still rather have a hard cover. I find it much easier to sit down and read.

    That said, I’d prefer both. When I sit down to read, I’d rather have a book. When I’m working and need a reference, I’d much rather have a PDF.

  • Very interesting (once again). I would argue that if O’Reilly books that are on Safari were also available as PDFs, then Safari itself would become more valuable to developers.

    I tried Safari, and I still track what’s there, and the “Rough Cuts” idea really sounds great. But, I found using Safari somewhat cumbersome. The amount of information on each page is relatively small, and there’s a lot of going from page to page to see if the section actually has what I’m looking for.

    Safari is great for searching across multiple books I don’t own (or that aren’t even on my Safari bookshelf) for information on a topic. I find reading a PDF better than “reading” a book on Safari.

    Realistically, I like printed books in many cases, but that’s really only truly useful if you’re going to read the book from cover to cover (which I do if it’s a topic I want to learn from scratch). For example, I recently bought “Ajax In Action”. But I do have to say, that’s the first computer/software book I’ve bought in a long time. And, I’ll admit that I bought it in part to see what a well-written “best-seller” computer book is like, since I’m trying to move into full-time writing myself…

    As I said above, I think Safari & PDFs is an ideal combination. If I search for something, and find 5 books, I may browse them using Safari. Then if I found one that looks ideal for the specific thing I need at that moment, and in browsing more of the book it looks like a book I’d like to keep with me on my laptop, a “Buy this book (PDF version)” right on Safari would seem ideal to me.

    But, it’s not a big effort to put PDFs up for sale on, right?

    With this new trend gaining momentum, I think it may make sense to offer most books that have anything to do with computers or the Internet in book form and downloadable form. The lower sales of book form may be a boon to Print-on-Demand. But, POD has print quality issues that would have to be addressed before it could be used for books that show detailed images of computer screens.

    The cost for the printed books will be higher, unless a new POD-based model can also eliminate the “returnable” feature (I mean, where bookstores can return books to the publisher). That “feature” has to account for 30% or more of the cost of a book, right? Because you can’t resell the returned books, you printed them, then you pay to dispose of them…

    For computer books sold online directly to the customer, printed via POD, perhaps the cost could still be reasonable for those who want a printed book…

  • I’m a Safari subscriber, and I just got my print/PDF final versions of Perl Hacks. I really liked the whole process, and although I’m currently a Safari subscriber, I’d like to be able to continue to order PDFs even if I wasn’t. I’ll admit that I’ve downloaded many illicit PDFs of books I’ve bought, just to be able to take a huge library with me when I travel.

    Keep the PDFs coming, I think it is great and a nice supplement to Safari access for books you really want to keep/support the authors of.

  • Tom Brown

    isn’t this like saying customers prefer hardback books to softback books because during the time the hardback books are available and softback books are not, 100% choose hardbacks? i purchase pdfs on rough cuts because i don’t want to wait. however, i do not prefer pdfs. but, then again, i’m not a gentleman. :)

  • There’s another advantage to PDFs I neglected to mention when I wrote that email: PDFs integrate with other “personal search” software I use, such as Spotlight. The contrast to Safari is that I have one search for all of my documents (code, PDF, email, etc.) rather than having to try several searches (Safari, then Google, then maybe Amazon) to find what I want.

    The major disadvantage to PDFs is trying to use them on Windows. Acrobat is such bad software that I get trained to wait until I’m back on my Mac before opening a PDF — how badly do I really need that information? Apple’s Preview app is so much better and so much less obnoxious it’s amazing. Definitely a great case of less being far, far more.

  • hi tim,

    I come to regularly — been doing that for a while now.

    “However, I would love your thoughts on whether you’d prefer your PDFs through or Safari” — what difference does it make ? I recently became a safari subscriber for getting the AJAX roughcuts book which I discovered through As long as the price is the same, it usually doesn’t matter. I was forced to go to safari even though I already had a account — which might have annoyed someone else :-)



  • Marc, are you using Adobe Reader 7.07, or an earlier generation? (Reader 6.0 and previous used to load all modules at startup… 7.0 is much faster.)

    If your document renders well within Apple Preview, that’s great. If it doesn’t, then the reference renderer can help.


  • Aleks Totic

    What’s the DRM like on the latest PDFs? I was frustrated by DRM that came with my first commercial PDF purchase so much, I’ve been sticking to paper. The book I bought contacted the mothership every time I read it, and did not allow more than 10 copy operations in a week. And, you could not copy large amounts of text.

  • John, Acrobat Reader 6.01. Things I like better about Preview:

    • Never asks me to upgrade while I’m trying to read a document. (If I’m asked to upgrade, which I don’t remember happening, then it must be through Software Update, which doesn’t try to get me to install while I’m reading.) The “upgrade” prompts are extremely annoying when I’m trying to get work done (for all applications that use that system).
    • Much faster startup. (I’ll take your word for it that the new version is faster.)
    • No ads. This one is pretty much a killer for me. Every time the ad banner in the upper right changes, I miss my Mac. I’ll assume the ads are making Adobe money, but given the choice between ads and no ads, no ads are far less annoying. As a related effect, the last upgrade I did saddled me with the ads, so I’ve refused every subsequent upgrade. That’s why I have no idea the new version is faster. (Fool me once, etc.)
    • Faster, better search results. See above on how useful I find this.

    I’m sure there are a few others, but I don’t have my Mac so I’m forgetting them. My friend Nelson wrote up a post on this a while ago that I thought was right on the money.

    I don’t know what you’re referring to with PDFs not rendering well in Preview. Every PDF I’ve tried looks great in it.

  • Jack Straw

    I have to agree with Brian…I’ll take the Rough Cuts PDF’s because I get them now, but I still prefer the hard copy. There’s nothing like cracking open a new O’Reilly book and breaking the spine. Also, I commute 45 minutes each way on the NYC subway…so pdfs don’t really help there.

  • Tim,

    I find this data fascinating -and, frankly, a bit surprising. Would you mind sharing the size of your customer base data pool from the 3-months of Rough Cut sales? Do you feel your Rough Cut demographic is representative of the rest of oreilly’s customers, or is it possible these “can’t-wait-till-it’s-done” preview addicts are the model early adopters whose preferences might not match the model of your larger base?

    Maybe it’s old-school thinking but I tend to believe that for the time being that consumer behavior still assigns greater value and shows a greater demand for something physical –especially when it comes to products/possessions you *hold* while experiencing them (like books). With music, the CD (or -gasp!- vinyl) itself never played a role in the consumer experience, so since it’s not something you enjoy any differently by holding it in your hand (popping it into the CD player is not substantially different than dialing up your favorite single on your iPod list), it makes sense to me intuitively that music downloads took off.

    There was an interesting thread on this idea (of physicality and consumer value props) taking an approach from social psychology and citing different global software trends. I believe it was on a site called “the end of free”, but I see it’s down now. Will let you know if I can dig it up.

    By the way, as a business man are you at all troubled by the $0.99 ceiling that companies like Apple (and as of today Microsoft and MTV) have set for the price-standard for a download? What are your thoughts on pricing/value proposition?

    And to answer your question (finally!), my preference is for the book if I anticipate sharing it, and for a download if I’m working alone on a long-term project. Every time I’ve purchased the PDF (on marketing sherpa, for example, where they’ve offered their own customer publications in both PDF/hard copy options for years) I first consider how I plan to use the information it contains.

  • I think the market for PDF distribution of books will explode once devices like the Sony Reader are widely available. I am personally looking forward to getting one of these devices (or another e-ink product) ASAP, presuming that their PDF rendering is decent. If the technology delivers on the promise, and if publishers come to the party with decently priced PDF editions of their books, I see no reason why I would ever want to buy a traditional book again.

  • I have to agree with James Webster, once I get something that has the same benefits as eInk-stuff and be able to “print” any PDF/document I want to it for easy reading in a train or plane or something, out my deadtree-versions go. Or maybe I keep them just to impress visitors of my soon-to-be-built house with library.

    Get me a device that reads as easily as printed books and has a lot of memory to store at least my favourites and reference books, and I’m hooked to PDF.

    For the time being though, I’d prefer PDF and deadtree. PDF for easy access and searchability, deadtree because reading from paper is so much easier.

    Just my 2 eurocents, though.

  • Mark Taber

    Very interesting and useful post — though I’m a bit surprised nobody has yet complained about the apparent sexism of the title. How about “Programmers Prefer PDFs”?

  • I’m one of the 5%. I would never buy a PDF of a book I expect to read cover-to-cover.

    – I can’t read more than a page or two on screen without going blind. Paper is still *way* more readable.

    – I can easily take a book anywhere – bed, beach, bathroom, doctor’s office…

    – It’s easy to skip around in a book – I can hold my place with a finger while I read another page, then go back to where I was.

    As far as reference books, I don’t buy too many of them any more. Good references are available on-line for free for most of what I do. I buy books to learn something new.

  • Tarek

    Second the “PDFs are hard to read on a screen” comments above.

    However, they don’t have to be, at all… Blind Shrike, written by Richard Kadrey, and typeset by John D. Berry, is _so_ nice to read onscreen, and Berry even wrote a column on dot-font about setting it. Granted, this is a piece of fiction with very different needs than a technical book, but it’s a step in the right direction.

  • I concur with Mark – a better title might be “Programmers prefer online”. Rough Cuts titles are also (and primarily) available online, even if the vast majority downloads the PDF, no?
    I’m a strong believer in digital libraries, versus electronic versions of individual books. Managing my own collection of PDF files would be a nightmare… a service like Safari does it all for me, and much more. One exception: offline reading. But is there still a future for offline? E-ink with free Wifi may be the way to go.

  • Just my rather late 2c worth (excuse the length but I hope this is of some use):

    I’ve made a similar shift from textbook-only to more use of PDFs, discussion groups and the WWW over the years.

    PDFs could have an impact on off-shore sales. Shipping and handling is a large part of the cost of acquiring a text from overseas and PDFs should have neither (?). Might be worth doing the numbers on that (I’d be curious to know what you find if you do).

    Saying that PDFs will increase the costs of the printed books assumes that PDFs are competing against the sales of the printed editions, rather than being in addition to them either as a new market or as bundled with the printed edition. For example, if PDFs were cheap enough, people may purchase the PDF of a text that they would never have bought as the (more expensive) printed version–?

    Your print runs are already fairly small aren’t they? (I know your texts go through frequent revisions.) Could one compromise be to slow the rate you revise the printed books for lower volume texts (for error corrections, etc.) compared to the PDFs, aiming for fewer, but larger print runs if PDFs prove to compete with the printed editions–? Or maybe this is too much for fans? ;-)

    I definitely need access off the internet: I get thoroughly annoyed with software vendors who assume all clients’ computers will be on the ‘net. (I keep my workstations off the ‘net for simpler security and business reasons.) Many (bio)technology companies have their workstation on a local network with varying degrees of isolation from the internet.

    I’m OK with an ID licensing the purchase to me and I can imagine simple schemes that aren’t too intrusive.

    I still prefer a good index as Acrobat’s searching is painfully slow and Spotlight’s searching only gives you the document, not the page of the match(es). Linked indices, of course.

    PDFs are excellent for Cookbooks FWIW–you can cut’n’paste the code from the PDF without having to do a separate download.

    A negative to PDFs is that it’s often hard to have both the PDF and the editor, IDE, etc., visible at the same time. With that in mind, I can imagine using an eBook of some kind in conjunction with my desktop machine. I can get around this by using more than one (adjacent) machine, but not everyone has that option (dual screens or wide screen formats are other options).

    I’m happy to organise large collections of PDFs, as I find the organising helps me think about the overall nature of the content. (I’m a scientist, so I suppose I have a natural organising and cataloging instinct ;-) )

  • Abhi Beckert

    Have you seen Their print-on-demand system seems to have a reasonably low price per book without having any issues when an author only sells two hard copies and 500 PDFs.

  • Firoz

    I think the type of book also affects whether you’re willing to purchase it in PDF or printed form. O’Reilly’s titles are largely computer-related, and I certainly wouldn’t mind reading these titles in PDF. One obvious convenience is having the electronic version open on your computer while you work. But some books (e.g. a novel) you might want to read away from the computer or on a more compact device than a laptop, something like an ebook reader.

    But would I want, say, a history of typography book or a book about interior design in PDF? Would I prefer titles like these to be sitting on my bookshelf rather than my computer? Yes, I’d prefer a printed copy. Technology books date more quickly – they feel more transient, which is why I don’t mind purchasing an electronic version.

    One other point about indexes. They are often better than a simple search facility. Here’s an example taken from another website I read years ago (unfortunately I can’t remember who provided the example). Let’s say, I have a book about economics. I want to find out about monopolies. I look up the word in the index. I see something like:

    I can see straight away that pages 51-54 are where I’ll find the subject discussed in the greatest detail. What happens if I do a search for “monopoly” in a PDF (using Acrobat)? Every single occurrence of the word monopoly is found regardless of its relevance or context. The human indexer doesn’t record every occurrence of a subject word in the index – they record only the relevant instances.

    What happens if I look up the phrase “supply and demand”?

    Well, that’s helpful isn’t it. It also shows me related topics – can a simple search manage that? What happens if I look up “labour unions”?

    Excellent. It anticipates synonyms or alternative words and points me in the direction of the preferred term. Can search manage that too? If I had simply searched for “labour unions” I probably wouldn’t have found anything.

    Okay, I better stop because I’ve rambled on too long. But I think you can see that search is not a substitute for a good index. And there are some things that even a contextual search may struggle with.
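
The contrast drawn in the comment above, plain text search versus a hand-built index, can be sketched in a few lines. Everything here is hypothetical illustration: the page texts, the index entries, the choice of "trade unions" as the preferred term, and the function names are all invented. Only the 51-54 page range for monopolies comes from the comment itself.

```python
# Hypothetical page texts for a small economics book (illustrative data only).
PAGES = {
    12: "Early critics feared that any monopoly would raise prices.",
    51: "Monopoly: a market with a single seller.",
    52: "How a monopoly sets price above marginal cost.",
    87: "Trade unions bargain over wages on behalf of workers.",
    120: "Some called the postal service a monopoly in passing.",
}

# Hypothetical hand-built index: main page ranges, related topics ("see_also"),
# and redirects from a synonym to the preferred term ("see").
INDEX = {
    "monopoly": {"pages": [(51, 54)], "see_also": ["competition"]},
    "trade unions": {"pages": [(87, 87)], "see_also": []},
    "labour unions": {"see": "trade unions"},
}


def full_text_search(term):
    """Return every page whose text contains the term, relevant or not."""
    term = term.lower()
    return sorted(page for page, text in PAGES.items() if term in text.lower())


def index_lookup(term):
    """Look the term up in the curated index, following 'see' redirects."""
    term = term.lower()
    entry = INDEX.get(term)
    if entry is None:
        return None
    while "see" in entry:  # follow synonym -> preferred term
        term = entry["see"]
        entry = INDEX[term]
    return {"term": term, "pages": entry["pages"], "see_also": entry["see_also"]}


print(full_text_search("monopoly"))       # every mention, including passing ones
print(index_lookup("monopoly"))           # only the main discussion
print(full_text_search("labour unions"))  # the synonym never appears in the text
print(index_lookup("labour unions"))      # the index redirects to the preferred term
```

In this toy data the plain search returns four pages for "monopoly" and nothing for "labour unions", while the index points straight at the main discussion and resolves the synonym. A real search engine with stemming or synonym expansion would narrow that gap, so treat this only as an illustration of the commenter's point.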