The Long Snout

Chris Anderson famously named the long tail: the idea that in the internet era, success belongs to companies that can address the end of the demand curve that is populated by millions of low-volume products, rather than a small number of high-volume products. Last year, noodling on the long tail concept, Rael Dornfest somewhat waggishly pointed out that there’s an analogous phenomenon on the front end of product creation, which he called “the long snout.” That is, there are millions of emergent products and technologies that may or may not catch on (consider the fact that there are over 100,000 projects on SourceForge alone). He argued that we needed a lightweight way to document and present information about those projects, so we could start publishing about them early on, and track them up the development curve as well as the demand curve.

At O’Reilly, we’ve always said that a key part of our business is watching the alpha geeks, and then building products to bring their knowledge and insights to a wider audience. We have a product pipeline that begins with O’Reilly Network articles, which can be produced relatively quickly, and can test demand for products like books and conferences that take more time and resources to produce, and we end with Safari, as a searchable, remixable database of technical content. In addition to the O’Reilly Network, over the past dozen years, we’ve been working to develop formats that allow us to address smaller units of content. Our first effort in this direction was Unix Power Tools, developed in 1993 as a book that would emulate the link-heavy reading style of the emerging World Wide Web, collecting nearly a thousand cross-referenced tips, tricks, and tools from hundreds of individual authors into a single volume. In 1998, Nat Torkington developed The Perl Cookbook, a collection of hundreds of individual programming recipes. This was followed by dozens of other programming cookbooks — including some, like The Python Cookbook, which we developed with ActiveState as a collaborative online book development project. And in 2003, Dale Dougherty and Rael Dornfest launched the Hacks series, each a collection of a hundred key tips, tricks and hacks culled from the O’Reilly Network and other online sources, or commissioned by the lead author and editor.

However, the tools and workflows that we had to build those products were distinctly old-fashioned. Each of our business units had developed its own systems for developing content, and while we’d made big strides towards integration (starting with the development of DocBook as an SGML repository format back in the late 1980s, and then migrating that format to XML), getting data into the ultimate XML content repository that drives Safari required a lot of convoluted format conversions. And the development process was always designed to be tool-agnostic, which created a lot of complication. We’ve let authors develop books in troff, TeX, Microsoft Word, Quark, FrameMaker, InDesign, POD, WordPerfect, and even directly in DocBook, providing complex tools to convert back and forth from our ultimate XML repository format.

Rael decided that we needed a lightweight “Web 2.0” toolset for building the Hacks books: something that would allow for collaborative, web-based authoring and editing, but would also eliminate the complex conversions that were now required by our siloed production process. He christened the tool Aardvark, after an animal noted for having both a long snout and a long tail, and put it to work building some of the latest Hacks books. It’s a wiki-based collaborative editing front end that emits DocBook as its back-end repository format. Sweet.

But that’s not all. First off — why stop with Hacks books? Aardvark is also ideal for other types of books. And because we’ve built a whole suite of XML-based publishing tools, a rich e-commerce engine, and a huge customer base with Safari, we realized that if we integrated Aardvark into the Safari production pipeline, we could offer Safari subscribers paid early access to the books under development — not just giving them periodic PDF builds but even daily builds, if the progress of the book merits that frequency. For readers new to Safari, we’d have one more great reason for them to try it out. And by bringing our Safari partners into the program, we’d also help to popularize and spread the concept to other publishers, making it standard practice, and helping readers and publishers alike. John Chodacki, who runs our Safari conversion pipeline, and Laurie Petrycki, our VP of Publishing Operations, stepped in to build on Rael’s idea and bring it to market.

This morning, we announced one of the first fruits of this development effort: a new program for early access to books under development, called Rough Cuts. The first titles to be available include Flickr Hacks, Ajax Hacks, The Ruby Cookbook, and Ruby on Rails: Up and Running. We’ll soon be adding more titles. In addition, because we implemented this program using Safari, it’s also being used by other Safari publishers, with a number of Pearson titles to appear in the next few months, and other partners in discussions.

As to the business model, we were inspired by the success of the Pragmatic Programmers’ recent beta books program, which sells early access at a discount from the print book price, and a combination of early access and the print book at a higher price than the print book alone. This is exactly the business model I’d worked out with Chris McAskill of FatBrain for a program we were going to roll out in 2000. Unfortunately, it was sidelined when Chris sold FatBrain to Barnes & Noble, and they shut the company down shortly thereafter. We didn’t follow up with an O’Reilly-only offering at that point because the retail computer book business was growing like gangbusters and we didn’t want to roll out a program that competed directly with our retail partners. (It was a different matter if we could partner with a pioneering retailer.) In retrospect, that was a flawed decision, as the chains and online discounters eventually destroyed the independent bookstores, and then themselves turned fickle once the dotcom bust took the growth out of computer book sales. We’re all back to having to build direct business models (which, after all, is where O’Reilly started as a publisher).

In short, you can buy online-only access to the Rough Cuts directly from us for about 50% off the expected list price of the final book; you can pre-order the print book for about 35% off the list price; and you can buy both together for only about 10% more than the list price.
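As a rough illustration of that arithmetic, here is a minimal sketch that turns an expected list price into the three offers. The ratios are the approximate figures quoted above; the function name, the offer labels, and the $40.00 list price are made up for the example and are not actual Rough Cuts prices.

```python
from decimal import Decimal, ROUND_HALF_UP

# Approximate ratios from the text; real prices vary title by title.
ONLINE_ONLY_RATIO = Decimal("0.50")     # about 50% off the expected list price
PRINT_PREORDER_RATIO = Decimal("0.65")  # about 35% off the list price
BUNDLE_RATIO = Decimal("1.10")          # about 10% more than the list price

CENT = Decimal("0.01")


def rough_cuts_prices(list_price):
    """Return approximate prices for the three offers (hypothetical helper)."""
    price = Decimal(list_price)
    ratios = {
        "online_only": ONLINE_ONLY_RATIO,
        "print_preorder": PRINT_PREORDER_RATIO,
        "bundle": BUNDLE_RATIO,
    }
    # Round each offer to whole cents.
    return {
        offer: (price * ratio).quantize(CENT, rounding=ROUND_HALF_UP)
        for offer, ratio in ratios.items()
    }


if __name__ == "__main__":
    # Hypothetical expected list price of $40.00.
    for offer, amount in rough_cuts_prices("40.00").items():
        print(f"{offer}: ${amount}")
```

With a $40.00 list price, that works out to $20.00 for online-only access, $26.00 for the print pre-order, and $44.00 for both together.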

What’s so important about this business model is that it puts a significant price on online access. One of our biggest concerns as we move to an online information economy is the development of business models that cover the cost of content development. Models that treat print as primary and assign little value to an online copy, or give it away for free (as some publishers have done over the years), will end up on the trash heap once online access becomes the preferred mode (as it already is for many people). While we also need to reduce the cost of developing content, quality doesn’t come for free.

One of the things that make the situation much more complex is that print publishing is very much a “tipping point” business. While customers naively assume that printing is a large part of the retail cost of a book, for a successful, high-volume computer book, it typically represents less than 10% of the list price. Distribution costs (including retailer discount, warehousing, and physical distribution), by contrast, represent more than 60% of the list price! However, manufacturing costs skyrocket at lower sales volumes, with single-copy print-on-demand costs running as high as 5 times manufacturing at scale. And while print-on-demand removes the capital investment required to hold thousands of copies of inventory, as well as the risk of returns, this saving comes at the cost of reduced volume, since you no longer have access to indirect sales channels, which for most published products represent 80% or more of sales. While some products will be high demand even though they are available from a single source, others will have much lower sales because of the lack of exposure.
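To make the tradeoff concrete, here is a back-of-the-envelope sketch comparing what is left over to cover development costs under the two approaches. It treats the bounds quoted above (10%, 60%, 5 times, 80%) as if they were exact figures. The $40.00 list price, the 10,000-copy trade volume, and the simplification that a direct print-on-demand sale is made at full list price with no distribution cost are all assumptions for the sake of the example.

```python
def contribution_cents(list_price_cents, copies, manufacturing_pct, distribution_pct):
    """Money left toward development costs after manufacturing and distribution.

    Hypothetical helper: percentages are whole numbers, expressed relative
    to the list price.
    """
    per_copy = list_price_cents * (100 - manufacturing_pct - distribution_pct) // 100
    return per_copy * copies


LIST_PRICE_CENTS = 4000  # assumed $40.00 list price
TRADE_COPIES = 10_000    # assumed sales through all channels

if __name__ == "__main__":
    # Trade printing: manufacturing about 10% of list, distribution about 60%.
    trade = contribution_cents(LIST_PRICE_CENTS, TRADE_COPIES, 10, 60)

    # Print-on-demand, direct only: manufacturing up to 5x the at-scale cost
    # (5 * 10% = 50% of list), no distribution cost (assumption), and only the
    # 20% of volume that does not come through indirect channels.
    direct_copies = TRADE_COPIES * 20 // 100
    pod = contribution_cents(LIST_PRICE_CENTS, direct_copies, 50, 0)

    print(f"trade printing:  ${trade / 100:,.2f}")
    print(f"print-on-demand: ${pod / 100:,.2f}")
```

Under these assumptions the per-copy margin is higher with print-on-demand ($20.00 versus $12.00), but the total is much lower ($40,000 versus $120,000) because of the lost volume. A title that sells just as well from a single source would come out the other way.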

What’s more, print-on-demand removes the financing provided to publishers by retailers. While manufacturing thousands of copies is expensive, if those copies are ordered up front by retailers who may hold them in the channel for years, the cash flow may actually be in the publisher’s favor, with the quick cash returned by filling the channel subsidizing not only the cost of manufacturing and warehousing but also a substantial portion of the development cost. This goes away in the pay-as-you-go world of print-on-demand and just-in-time inventory systems. In short, publishers are solving a complex equation that includes development cost, manufacturing cost, distribution cost, demand volume, price, and the cash flows resulting from all of those factors.

Net net, to the extent that people choose an online-only option in a beta book program, print prices must go up, and in some cases, printing will become entirely uneconomic (except for direct, print-on-demand sales, which fortunately we can support due to the investment we’ve made in SafariU, which includes a print-on-demand engine). In the latter case, a larger amount of the costs must be borne by the online copy; ultimately, perhaps, all of the costs.

In our long history of making books available online, we’ve experimented with a lot of models, from free distribution to paid PDFs and online subscription. We’ve discovered that in some cases, online access builds demand, and in others, it satisfies it, and reduces the demand for print books.

For this reason, pricing is likely to remain unsettled for some time, and will vary from product to product, with real-time testing of various offers to see what works best. (There isn’t one size that fits all.) Any generalizations based on a few early bestsellers may not hold up as we build out an online marketplace based on the sale of electronic access to books, especially as that marketplace moves to the front end of the development process (the long snout) as well as the low end of the demand curve (the long tail). In short, we’re all going to be learning together over the next few years. Publishers, authors, and readers will all need to learn from each other as we discover together the aardvark publishing economy.