Balancing the Benefits and Costs of XML for Book Production

O’Reilly engineer and XML guru Keith Fahlgren kicked off a lively conversation on an internal mailing list this week by asking whether (and how much) we’re “eating our own dogfood” in terms of Tim O’Reilly’s recent post about IT.

Along the way, editor Kurt Cagle weighed in with his thoughts on the importance of an XML workflow (specifically one that plays nicely with his needs running a destination Web site):

Overall, I’d like to see us move to an all XML pipeline, not because I’m the XML editor (I’m actually writing more economics articles of late than anything) but because I think that a cohesive XML workflow provides us with the cleanest implementations that we can have, and ironically it’s the one type of flow that may actually make it easier for us to work with the content without needing to break open the content to do tedious search and replace operations. It provides the best reuse story (it’s a relatively simple proposition to convert a DocBook publication into an embedded Web block, for instance) and it integrates well with feed production.
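
To make the reuse point concrete, here is a minimal sketch of what turning a DocBook fragment into an embeddable Web block might look like. It is an illustration only, not O’Reilly’s actual toolchain: the tag mapping, the function name `docbook_to_html_block`, and the sample fragment are all hypothetical, and it assumes un-namespaced (DocBook 4 style) markup. A production pipeline would more likely rely on XSLT stylesheets.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a handful of DocBook elements to HTML equivalents.
# Real DocBook has hundreds of elements; this covers only the sample below.
TAG_MAP = {
    "section": "div",
    "title": "h2",
    "para": "p",
    "emphasis": "em",
    "literal": "code",
}


def docbook_to_html_block(docbook_xml: str) -> str:
    """Rename DocBook elements to HTML ones and serialize the result.

    Assumes un-namespaced DocBook; unknown elements fall back to <span>.
    """
    root = ET.fromstring(docbook_xml)
    for elem in root.iter():
        elem.tag = TAG_MAP.get(elem.tag, "span")
    return ET.tostring(root, encoding="unicode", method="html")


sample = (
    "<section><title>Reuse</title>"
    "<para>One <emphasis>source</emphasis>, many outputs.</para></section>"
)
print(docbook_to_html_block(sample))
```

Because the source is structured, the same fragment could be fed to a different mapping to produce a feed entry or another output, which is the kind of reuse described above.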

O’Reilly Publishing Services Manager Adam Witwer responded, and included some critical lessons learned about the challenges with moving to XML:

Over the past year or so, we in publishing services have adopted an all DocBook XML pipeline for several of the main book series (Animal, Cookbook, Theory in Practice, In a Nutshell, etc.). Retraining staff has been a huge challenge. From a technical perspective, developing the XSL-FO has taken (and continues to take) lots of time and iterations. But the biggest challenge has been convincing others that the small sacrifices that come with an XML workflow are worth it. We have less control over things like page layout in a book, and certain style elements that are easy in InDesign or Frame are difficult to replicate with XSL-FO stylesheets. For us in publishing services, those things seem like small trade-offs for the gain of having a single set of source files that are much easier to reuse, most notably on Safari, and to update. This ceases to be an issue when the stylesheets get to be nearly indistinguishable from the InDesign/Frame templates on which they are based, so that’s what we’ve tried to do, and we’ve transitioned away from the traditional page layout programs and general approach to book production.

It’s worth noting that this all applies to what happens after we receive manuscripts, many of which are still being written in Word. There’s a lot that can be done without ever opening the can of worms that is authoring in XML.
