<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>O&#039;Reilly Radar &#187; Laura Dawson</title>
	<atom:link href="http://radar.oreilly.com/laurad/feed" rel="self" type="application/rss+xml" />
	<link>http://radar.oreilly.com</link>
	<description>Insight, analysis, and research about emerging technologies</description>
	<lastBuildDate>Thu, 23 May 2013 19:58:33 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Some Tasty Bits from the StartWithXML UK Survey</title>
		<link>http://radar.oreilly.com/2009/08/some-tasty-bits-from-the-swxml.html</link>
		<comments>http://radar.oreilly.com/2009/08/some-tasty-bits-from-the-swxml.html#comments</comments>
		<pubDate>Thu, 06 Aug 2009 01:53:34 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[digital content]]></category>
		<category><![CDATA[digital workflow]]></category>
		<category><![CDATA[publishers]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[startwithxml survey]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2009/08/some-tasty-bits-from-the-swxml.html</guid>
		<description><![CDATA[We&apos;ve got some raw results from the StartWithXML survey in the UK, and they are very different in some respects from the US survey we did. Some salient points: 48.7% of the respondents were in the STM market, followed by trade (24.4%) and college (16%). The bulk of respondents were from large houses - 50.4% - and the rest were evenly divided... ]]></description>
				<content:encoded><![CDATA[<p>We&#8217;ve got some raw results from the StartWithXML survey in the UK, and they are very different in some respects from the US survey we did. Some salient points:</p>
<ul>
<li>48.7% of the respondents were in the STM market, followed by trade (24.4%) and college (16%).</li>
<li>The bulk of respondents were from large houses &#8211; 50.4% &#8211; and the rest were evenly divided between midsized and small presses.</li>
<li>Nearly 55% of the respondents considered themselves &#8220;tech-proficient.&#8221; As most of them were from production or management, this was not surprising. We did have a significant number of editorial respondents, however &#8211; 19.3%.</li>
<li>To 40.6% of our respondents, digital publishing is &#8220;very important &#8211; it informs all we do.&#8221; Meanwhile, 59.4% of respondents are grappling with its impact in their companies. Only 17.8% of respondents say that they focus on the print volume alone rather than on the downstream uses of their book content.</li>
<li>As far as expanded editions are concerned, 53.5% of publishers say they don&#8217;t offer these. And 69.3% do not offer more than the basic ONIX marketing content (cover image, description, first chapter, table of contents) in their digital marketing efforts.</li>
<li>Over 73% of publishers do not have a formalized (formalised, if you&#8217;re in the UK) DAM system. </li>
<li>And over 50% do not maintain files in an XML format.</li>
<li>Nearly 69% of respondents have problems retrieving files from storage, and have to institute workarounds. But over 56% look at XML as a way of complementing CMS and DAM tools they have already invested in.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2009/08/some-tasty-bits-from-the-swxml.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CSS in an XML Workflow</title>
		<link>http://radar.oreilly.com/2009/07/css-in-an-xml-workflow.html</link>
		<comments>http://radar.oreilly.com/2009/07/css-in-an-xml-workflow.html#comments</comments>
		<pubDate>Fri, 17 Jul 2009 20:28:00 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[css]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[stylesheets]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2009/07/css-in-an-xml-workflow.html</guid>
		<description><![CDATA[At the StartWithXML Forum in New York in January, Rebecca Goldthwaite of Cengage gave a great demonstration of how Cengage uses CSS in their XML workflow. Many publishers regard style sheets as an invitation to create cookie-cutter book production, with the fear that all their books will look the same. This is emphatically a myth. Have a look at her... ]]></description>
				<content:encoded><![CDATA[<p>At the StartWithXML Forum in New York in January, Rebecca Goldthwaite of Cengage gave a <a href="http://www.csszengarden.com/?cssfile=202/202.css">great demonstration</a> of how Cengage uses CSS in their XML workflow. Many publishers regard style sheets as an invitation to create cookie-cutter book production, with the fear that all their books will look the same. This is emphatically a myth. Have a <a href="http://www.csszengarden.com/?cssfile=202/202.css">look</a> at her seventh slide for examples of how one stylesheet can actually create many different looks.</p>
<p><a href="http://csszengarden.com/">CSS Zen Garden</a> has been up for a while (Liza Daly used this model to create the <a href="http://epubzengarden.com/#/static/middlemarch/OEBPS/chapter1.html">EPUB Zen Garden</a> a few months ago). It&#8217;s a sort of CSS sandbox where graphic designers can play with style sheets and render the same content in very different forms. Clicking on the four links below will demonstrate what CSS can do:</p>
<ul>
<li><a href="http://www.csszengarden.com/?cssfile=/209/209.css&amp;page=0">Home Page 1</a></li>
<li><a href="http://www.csszengarden.com/?cssfile=/209/209.css&amp;page=0">Home Page 2</a></li>
<li><a href="http://www.csszengarden.com/?cssfile=202/202.css">Home Page 3</a></li>
<li><a href="http://www.csszengarden.com/?cssfile=193/193.css">Home Page 4</a></li>
</ul>
<p>It&#8217;s well worth checking out and maybe having some graphic designers play around with it.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2009/07/css-in-an-xml-workflow.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>StartWithXML is Going to London</title>
		<link>http://radar.oreilly.com/2009/07/swxml-london.html</link>
		<comments>http://radar.oreilly.com/2009/07/swxml-london.html#comments</comments>
		<pubDate>Fri, 17 Jul 2009 18:10:22 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2009/07/swxml-london.html</guid>
		<description><![CDATA[StartWithXML will be continuing in London! On September 2nd, at the British Library, we&apos;ll be conducting a one-day forum similar to the one we held in New York last January, but with a British publishing focus. Our sponsors for this event include Klopotek, MarkLogic, PLS, BIC, Publishers&apos; Association, and of course O&apos;Reilly. We&apos;re still in the process of firming up... ]]></description>
				<content:encoded><![CDATA[<p>StartWithXML will be continuing in London! On September 2nd, at the British Library, we&#8217;ll be conducting a one-day forum similar to the one we held in New York last January, but with a British publishing focus. Our sponsors for this event include Klopotek, MarkLogic, PLS, BIC, Publishers&#8217; Association, and of course O&#8217;Reilly. </p>
<p>
We&#8217;re still in the process of firming up our speakers, but we do have information posted <a href="http://www.pls.org.uk/ngen_public/article.asp?aid=536">here</a>. Additionally, if you are a British publisher or service provider, there&#8217;s a survey for you <a href="http://www.surveymonkey.com/s.aspx?sm=wAUoAZuCDuPqYOxLHf4yZw_3d_3d">here</a>.</p>
<p>As we get more news, we&#8217;ll add it here &#8211; meanwhile, we&#8217;re continuing to research and gather information about where publishers are in the StartWithXML process.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2009/07/swxml-london.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Taxonomies and Starting With XML</title>
		<link>http://radar.oreilly.com/2009/02/taxonomies-and-starting-with-x.html</link>
		<comments>http://radar.oreilly.com/2009/02/taxonomies-and-starting-with-x.html#comments</comments>
		<pubDate>Wed, 25 Feb 2009 18:00:00 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[tags]]></category>
		<category><![CDATA[taxonomy]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2009/02/taxonomies-and-starting-with-x.html</guid>
		<description><![CDATA[This is an excerpt from a blog post I wrote last week on taxonomies and chunking. Last October, the StartWithXML team wrote a post called &#34;To Chunk or Not To Chunk,&#34; where we discussed tagging and infrastructure issues, and a discussion ensued about what happens when you don&apos;t know what you&apos;ll be using chunks for. How do you tag those?... ]]></description>
				<content:encoded><![CDATA[<p><i>This is an excerpt from a <a href="http://www.ljndawson.com/permalink/2009/02/17/Taxonomies_Now.html">blog post I wrote last week</a> on taxonomies and chunking.</i></p>
<p>Last October, the <a href="http://toc.oreilly.com/startwithxml">StartWithXML</a> team wrote a post called &#8220;<a href="http://toc.oreilly.com/2008/10/to-chunk-or-not-to-chunk.html">To Chunk or Not To Chunk,</a>&#8221; where we discussed tagging and infrastructure issues, and a discussion ensued about what happens when you don&#8217;t know what you&#8217;ll be using chunks for. How do you tag those?</p>
<p>Later, in our StartWithXML <a href="http://toc.oreilly.com/2009/01/presentations-from-the-startwi.html">One-Day Forum</a>, we included a <a href="http://www.slideshare.net/toc/the-evolving-role-of-authors-and-editors-presentation?type=powerpoint">presentation</a> on tagging and chunking best practices, where it was pointed out that no taxonomy for chunk-level content currently exists.</p>
<p>We have taxonomies for book-level content. These include formalized code sets such as the <a href="http://www.loc.gov/cds/lcsh.html">Library of Congress subject codes</a>, the <a href="http://www.bisg.org/publications/bisac_subj.html">BISAC codes</a>, and the <a href="http://www.oclc.org/dewey/resources/summaries/default.htm">Dewey Decimal System</a>, among others. There are also informal code sets, like the tag sets on <a href="http://www.shelfari.com/books/subjects?t=d">Shelfari</a> or <a href="http://www.librarything.com/zeitgeist">LibraryThing</a>. There are proprietary taxonomies at <a href="http://www.amazon.com/books-used-books-textbooks/b/ref=sa_menu_bo0?ie=UTF8&amp;node=283155&amp;pf_rd_p=328655101&amp;pf_rd_s=left-nav-1&amp;pf_rd_t=101&amp;pf_rd_i=507846&amp;pf_rd_m=ATVPDKIKX0DER&amp;pf_rd_r=0CA3SGB8CBVS7AGS9C7T">Amazon</a> and <a href="http://www.barnesandnoble.com/subjects/subjects.asp">B&amp;N.com</a> that enable effective browsing.</p>
<p>But nothing like this exists for sub-book-level content. It&#8217;s never been traded before. We&#8217;ve never really needed a taxonomy for it before.</p>
<p>Other industries that traditionally distribute &#8220;chunks&#8221; have their own taxonomies that might prove useful in building a book-chunk schema. These include the <a href="http://www.iptc.org/cms/site/index.html?channel=CH0088">IPTC news codes</a>, which identify the content of a particular news story &#8212; that&#8217;s the closest analogy I can find for small gobbets of content that require organization.</p>
<p>Industries have proprietary taxonomies to identify certain concepts &#8212; culinary arts, music, agriculture, engineering, the sciences, literature and criticism, education, and on and on and on. But these do not necessarily identify concepts within a book.</p>
<p>Some might argue that we don&#8217;t necessarily need taxonomies &#8212; why can&#8217;t we use natural-language search and the semantic Web to &#8220;bubble up&#8221; the &#8220;right&#8221; concepts? I&#8217;d argue that words don&#8217;t always mean what we think they mean. A classic example from my library days is the term &#8220;mercury.&#8221; That could mean the planet, the car or the element. Proponents of semantic search would say that the context in which &#8220;mercury&#8221; is mentioned should take care of defining that term. I&#8217;d say that&#8217;s true in about 50 percent of all cases, but not reliably enough to hold in 75 to 100 percent of them.</p>
<p>My <a href="http://www.ljndawson.com/permalink/2009/02/17/Taxonomies_Now.html">original post gets into more detail</a> about why taxonomies are important search tools, and how the digitization of books requires a good taxonomy &#8230; and <a href="http://www.bisg.org/">who should do it</a>.</p>
<p class="related">Related Stories:</p>
<ul class="related">
<li> <a href="http://toc.oreilly.com/2008/11/beyond-the-tag-cloud.html">Beyond the Tag Cloud</a></li>
<li> <a href="http://toc.oreilly.com/2008/04/simplifying-semantic-tagging.html">Simplifying Semantic Tagging</a></li>
<li> <a href="http://toc.oreilly.com/2008/09/library-uses-tags-to-link-onli.html">Library Uses Tags to Link Online-Offline Recommendations</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2009/02/taxonomies-and-starting-with-x.html/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Coverage of StartWithXML</title>
		<link>http://radar.oreilly.com/2009/01/coverage-of-swxml.html</link>
		<comments>http://radar.oreilly.com/2009/01/coverage-of-swxml.html#comments</comments>
		<pubDate>Thu, 15 Jan 2009 22:20:07 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[books]]></category>
		<category><![CDATA[publishing]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2009/01/coverage-of-swxml.html</guid>
		<description><![CDATA[Turns out I was not the only one on Twitter for the StartWithXML Forum on January 13th. Joe Bachana was tweeting as well. Kind of interesting to see the posts side-by-side. David Rothman of Teleread also has some great things to say, as does Richard Curtis over at e-reads. We also got nice coverage from PW, as well as Publishers... ]]></description>
				<content:encoded><![CDATA[<p>Turns out I was not the only one on <a href="http://search.twitter.com/search?q=%23startwithxml">Twitter for the StartWithXML Forum</a> on January 13th. Joe Bachana was tweeting as well. Kind of interesting to see the posts <a href="http://search.twitter.com/search?q=%23startwithxml">side-by-side</a>. David Rothman of Teleread also has some <a href="http://www.teleread.org/blog/2009/01/05/xml-workflow-conference-learn-how-to-cope-with-both-e-and-p/">great things to say</a>, as does Richard Curtis over at <a href="http://www.ereads.com/2009/01/publishing-people-dip-toe-in-xml.html">e-reads</a>.</p>
<p>We also got nice coverage from <a href="http://www.publishersweekly.com/article/CA6629176.html?nid=2286&amp;source=link&amp;rid=182425213">PW</a>, as well as Publishers Lunch.</p>
<p>Slides will be up soon!</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2009/01/coverage-of-swxml.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Correction!</title>
		<link>http://radar.oreilly.com/2008/11/a-correction.html</link>
		<comments>http://radar.oreilly.com/2008/11/a-correction.html#comments</comments>
		<pubDate>Wed, 26 Nov 2008 19:05:25 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[content]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2008/11/a-correction.html</guid>
		<description><![CDATA[Frank Grazioli, of Wiley, writes in to correct my last post about taxonomies: Wiley has been exploring taxonomies for its travel content business; the cooking/psych/accounting spaces might be our next logical opportunities because the disciplines are well developed, specific, etc., that content is authored or edited in fairly controlled templates that map to our own XML content models and our... ]]></description>
				<content:encoded><![CDATA[<p>Frank Grazioli, of Wiley, writes in to correct <a href="http://toc.oreilly.com/2008/11/beyond-the-tag-cloud.html">my last post</a> about taxonomies:</p>
<blockquote>
<p>Wiley has been exploring taxonomies for its travel content business; the cooking/psych/accounting spaces might be our next logical opportunities because the disciplines are well developed, specific, etc., that content is authored or edited in fairly controlled templates that map to our own XML content models and our belief in content models and XML has evolved that &#8220;lighter&#8221; and &#8220;more agile&#8221; are better than taggy and dense. As you so aptly point to the contextuality and &#8220;rigor&#8221; of taxonomies, these tools would allow our XML to &#8220;slip on the right jacket&#8221; for the occasion. I apologize if we led you to believe that we already have firm taxonomies in place for the three areas you specify&#8211;I wouldn&#8217;t want readers/event guests to get that impression anyway. </p>
</blockquote>
<p class="related">Related:</p>
<ul class="related">
<li> <a href="http://toc.oreilly.com/startwithxml">See more StartWithXML posts</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2008/11/a-correction.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Beyond the Tag Cloud</title>
		<link>http://radar.oreilly.com/2008/11/beyond-the-tag-cloud.html</link>
		<comments>http://radar.oreilly.com/2008/11/beyond-the-tag-cloud.html#comments</comments>
		<pubDate>Tue, 11 Nov 2008 16:00:00 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2008/11/beyond-the-tag-cloud.html</guid>
		<description><![CDATA[This is an excerpt from our research paper, which will publish in concert with the StartWithXML Forum on January 13th at the McGraw-Hill Auditorium in New York. Early bird discounting for BISG members is ending soon! A good taxonomy is the backbone of your business -- it&apos;s how you sort your content. It allows for effective merchandising, effective marketing --... ]]></description>
				<content:encoded><![CDATA[<p><i>This is an excerpt from our <a href="http://toc.oreilly.com/startwithxml/why-and-how.html">research paper</a>, which will publish in concert with the <a href="http://toc.oreilly.com/startwithxml/register.html">StartWithXML Forum</a> on January 13th at the McGraw-Hill Auditorium in New York. Early bird discounting for BISG members is ending soon!</i></p>
<p>A good taxonomy is the backbone of your business &#8212; it&#8217;s how you sort your content. It allows for effective merchandising, effective marketing &#8212; you can aim your content with the precision of a pool cue. It allows for inventorying your content &#8212; so you know what you have &#8230; and what you need. With your content tagged and organized, you know where everything is and how to deploy it.</p>
<p>Taxonomies are contextually sensitive and rigorous &#8212; and in establishing your own, it helps to look at what other industries are doing. <a href="http://www.wiley.com">Wiley</a> has adopted accounting and cooking and psychology taxonomies from those industries to organize information in its professional development titles. Educational publishers are increasingly arranging their textbooks around &#8220;learning objects&#8221; &#8212; taxonomized pedagogical goals developed by educators themselves. Even the <a href="http://www.bisg.org/bisac/subjectcodes/index.html">BISAC</a> codes &#8212; which are part of the <a href="http://www.bisg.org/onix/onix_downloads.html">ONIX</a> system of organizing book information and therefore an XML-based taxonomy &#8212; are developed very carefully and consensually among book industry professionals in monthly meetings.</p>
<p>An important aspect of taxonomy development is scope notes. Terms need definition and clarity around how they&#8217;re going to be used. Documenting your taxonomy &#8212; what you mean when you say &#8220;porcelain&#8221; (collectible china, dental work, household fixtures?), parent-child relationships between categories, and why you choose certain terms over others &#8212; is important for the long term. Future editors and authors will need to know why your taxonomy has developed as it has.</p>
<p>Consistency in application is also crucial. Drop-down menus (as opposed to free-text fields) enforce structure and ensure that users don&#8217;t come up with their own terms that pollute your taxonomy with duplicates or irrelevancies (or misspellings).</p>
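<p>As a rough illustration of scope notes, parent-child relationships, and a closed list of terms working together, here is a minimal Python sketch. The vocabulary entries and the function names <code>validate_tag</code> and <code>ancestors</code> are hypothetical, made up for this example; a real taxonomy would live in whatever system your house already uses.</p>

```python
# Hypothetical controlled vocabulary: every term has a scope note and a parent.
VOCABULARY = {
    "ceramics": {
        "scope": "Top-level category for fired-clay objects.",
        "parent": None,
    },
    "porcelain (collectibles)": {
        "scope": "Collectible china such as Limoges.",
        "parent": "ceramics",
    },
    "porcelain (dentistry)": {
        "scope": "Dental work such as crowns and veneers.",
        "parent": None,
    },
}


def validate_tag(term):
    """Return the canonical form of a term, or raise if it is not in the vocabulary."""
    canonical = term.strip().lower()
    if canonical not in VOCABULARY:
        # Plays the role of the drop-down menu: free-text terms are rejected.
        raise ValueError(f"'{term}' is not in the controlled vocabulary")
    return canonical


def ancestors(term):
    """Walk the parent links from a term up to its top-level category."""
    chain = []
    parent = VOCABULARY[validate_tag(term)]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = VOCABULARY[parent]["parent"]
    return chain
```

<p>In this sketch a misspelling such as <code>porcelan</code> is refused rather than quietly added as a new term, and the scope note travels with each entry so a future editor can see what the term was meant to cover.</p>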
<p>An advantage to using XML is that you don&#8217;t have to accomplish everything at once, perfectly, from the outset. You will not be able to tag your documents thoroughly right off the bat &#8212; who can know everything in advance? The act of tagging is recursive, and depends on market and company needs. XML allows for this flexibility. Depending on how you envision chunking and re-use, you&#8217;ll tag your documents differently with each iteration. Unlike the &#8220;fire and forget&#8221; model, iterative tagging means that your books are living documents.</p>
<p class="related">Related:</p>
<ul class="related">
<li> <a href="http://toc.oreilly.com/startwithxml/why-and-how.html">StartWithXML Research Report</a></li>
<li> <a href="http://toc.oreilly.com/startwithxml">See more StartWithXML posts</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2008/11/beyond-the-tag-cloud.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>To Chunk or Not To Chunk?</title>
		<link>http://radar.oreilly.com/2008/10/to-chunk-or-not-to-chunk.html</link>
		<comments>http://radar.oreilly.com/2008/10/to-chunk-or-not-to-chunk.html#comments</comments>
		<pubDate>Thu, 16 Oct 2008 20:11:59 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[academic]]></category>
		<category><![CDATA[chunking]]></category>
		<category><![CDATA[ebooks]]></category>
		<category><![CDATA[recipes]]></category>
		<category><![CDATA[repurposed content]]></category>
		<category><![CDATA[startwithxml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2008/10/to-chunk-or-not-to-chunk.html</guid>
		<description><![CDATA[This is excerpted from a column I wrote for the most recent issue of The Big Picture, my free newsletter about technology and the book industry. As we&apos;re proceeding with Start With XML, I&apos;m thinking a lot about chunking. Chunking, at least as we&apos;re talking about it, means carving up your content into chunks and distributing those discrete pieces of... ]]></description>
				<content:encoded><![CDATA[<p>This is excerpted from a column I wrote for the most recent issue of The Big Picture, <a href="http://www.ljndawson.com/The_Big_Picture/Newsletter_Subscription/">my free newsletter</a> about technology and the book industry.</p>
<blockquote>
<p>As we&#8217;re proceeding with <a href="http://toc.oreilly.com/startwithxml/">Start With XML</a>, I&#8217;m thinking a lot about chunking.</p>
<p>Chunking, at least as we&#8217;re talking about it, means carving up your content into chunks and distributing those discrete pieces of it. <a href="http://www.frommers.com">Travel content</a> (distributed over GPS, the web, and in book form) and recipes (distributed via <a href="http://www.epicurious.com">Epicurious</a> and <a href="http://www.allrecipes.com">AllRecipes.com</a> as well as <a href="http://www.sharedbook.com">in book form</a>) are the most obvious examples of this. <a href="http://shopmcgraw-hill.com">Textbook publishing</a> does this as well &#8211; certain assets can be used <a href="http://www.mhprofessional.com/mhhe_product.php?cat=108&amp;isbn=0073534420">in the main text,</a> in <a href="http://www.mhprofessional.com/mhhe_product.php?cat=108&amp;isbn=0077288211">supplementary workbooks and lab manuals</a>, as individual activities to be downloaded to an iPod, or <a href="http://www.vitalsource.com/">embedded in e-books</a>.</p>
<p>And as we talk about chunking, it&#8217;s clear that there are certain types of content that don&#8217;t immediately lend themselves to that kind of carved-up distribution. Novels, for example. Narrative nonfiction such as memoirs. Philosophical or political works, where tracing the author&#8217;s thought from beginning to end is important.</p>
<p>The truth is, we may not <i>quite</i> know what will chunk readily and what will not. There are some blue-sky ideas right now &#8211; tagging content within narratives, to be pulled out later and stand on its own &#8211; but we just don&#8217;t know yet if readers are interested in that kind of thing.</p>
<p>But publishers can&#8217;t afford NOT to prepare for the unknown. There has never been uncertainty like this in publishing &#8211; uncertainty in stock prices and supply chain issues (paper prices, transportation/shipping costs, the costs of composition and conversion), uncertainty in revenue-generation, uncertainty as to who&#8217;s going to buy what in which format &#8211; and it&#8217;s not going to get any clearer for quite some time. </p>
<p>And you can&#8217;t chunk at all if you haven&#8217;t tagged &#8211; you can&#8217;t even begin to think about chunking if you haven&#8217;t tagged. Tagging is never a bad strategy &#8211; you will never regret doing it. But the risk of NOT doing it &#8211; the risk of not being ready for the next wave of consumer demand <i>whatever that demand may be</i> &#8211; means that you can&#8217;t afford not to do it.</p>
</blockquote>
<p class="related">Related:</p>
<ul class="related">
<li> <a href="http://toc.oreilly.com/startwithxml">See more StartWithXML posts</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2008/10/to-chunk-or-not-to-chunk.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Standardizing Tags in the Metadata Minefield</title>
		<link>http://radar.oreilly.com/2008/10/metadata-minefield.html</link>
		<comments>http://radar.oreilly.com/2008/10/metadata-minefield.html#comments</comments>
		<pubDate>Tue, 14 Oct 2008 22:39:05 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[tags]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2008/10/metadata-minefield.html</guid>
		<description><![CDATA[One issue we haven&apos;t discussed much is that of metadata. XML documents are by definition rife with metadata. At what point does metadata cross the line from useful to pollution? When it&apos;s not standardized. The kind of XML tagging we&apos;re primarily talking about can be sectioned into three buckets: rights data (&#34;this picture is good for print products but not... ]]></description>
				<content:encoded><![CDATA[<p>One issue we haven&#8217;t discussed much is that of metadata. XML documents are by definition rife with metadata. At what point does metadata cross the line from useful to pollution?</p>
<p>When it&#8217;s not standardized.</p>
<p>The kind of XML tagging we&#8217;re primarily talking about can be sectioned into three buckets: rights data (&#8220;this picture is good for print products but not electronic ones,&#8221; &#8220;we can use this graphic anywhere,&#8221; &#8220;these animations are exclusively for the workbook&#8221;), formatting data (&#8220;this is a chapter,&#8221; &#8220;this is a footnote&#8221;), and context data (&#8220;Paris,&#8221; &#8220;1955,&#8221; &#8220;General Robert E. Lee,&#8221; &#8220;noodles&#8221;).</p>
<p>This is a perfect recipe for complete chaos. Obviously standards are crucial to the success of using XML in publishing. Even standards within a department &#8212; using tags the same way from one project to the next, from one PERSON to the next &#8212; are crucial. </p>
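<p>To make the three buckets concrete, here is a small Python sketch of a chunk that carries rights, formatting, and context tags. It is only an illustration under my own assumptions: the <code>Chunk</code> class, its field names, and the two helper functions are hypothetical and not part of any standard.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of content carrying the three kinds of tag described above."""
    text: str
    role: str                                  # formatting data, e.g. "chapter" or "footnote"
    rights: set = field(default_factory=set)   # rights data: channels where use is allowed
    context: set = field(default_factory=set)  # context data, e.g. {"Paris", "1955"}


def usable_in(chunks, channel):
    """Keep only the chunks whose rights tags allow the given channel."""
    return [c for c in chunks if channel in c.rights]


def with_context(chunks, term):
    """Keep only the chunks tagged with exactly this context term."""
    return [c for c in chunks if term in c.context]


chunks = [
    Chunk("Map of the Left Bank", role="figure",
          rights={"print"}, context={"Paris", "1955"}),
    Chunk("A note on noodles", role="footnote",
          rights={"print", "electronic"}, context={"noodles"}),
]

# Only the footnote is cleared for electronic products.
print([c.text for c in usable_in(chunks, "electronic")])
```

<p>The rights and formatting fields behave predictably because their values come from a short, agreed list. The context field is where things go wrong: a search for &#8220;pasta&#8221; finds nothing here, because the chunk was tagged &#8220;noodles.&#8221;</p>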
<p>There&#8217;s been some talk about the role of the <a href="http://www.bisg.org">Book Industry Study Group</a> in developing tagging standards, in the same way they&#8217;ve developed <a href="http://www.bisg.org/publications/bisac_subj.html">BISAC code standards</a>. And this makes a great deal of sense. The rights and formatting tag standards should be relatively easy to establish &#8212; publishing houses, no matter whether big or small, tend to use this data fairly consistently. It&#8217;s the context tags that pose the more serious challenges.</p>
<p>The <a href="http://authorities.loc.gov/">Library of Congress</a> has done this sort of thing with its subject headings. But, like the BISAC codes, these refer to the subject <i>of an entire book</i>. Many books, however, cover more than one topic &#8211; many <i>chapters</i> cover more than one topic. That level of granularity has never been taxonomized before.</p>
<p>Still, it&#8217;s important to do so in a standardized way, to avoid a cacophony that drowns out meaning. (Is it &#8220;pasta&#8221; or &#8220;noodles&#8221;? When you say &#8220;diamond,&#8221; are you talking about baseball or gemstones or Neil? Why is a chapter published by Mosby about dentistry coming up in search results with the chapters on collecting Limoges china published by Antique Trader? Hint: &#8220;porcelain.&#8221;)</p>
<p>If you&#8217;ve ever seen a <a href="http://www.librarything.com/tagcloud.php">tag cloud</a> on a website, you&#8217;ll know what I mean. You never know what you&#8217;re going to get when you click on it. Standardizing context tags is probably the most thankless, boring job publishers will ever engage in. But it&#8217;s also the one that&#8217;s going to ensure that books are actually discoverable the way they&#8217;re meant to be discovered.</p>
<p class="related">Related:</p>
<ul class="related">
<li> <a href="http://toc.oreilly.com/startwithxml">See more StartWithXML posts</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2008/10/metadata-minefield.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>What We Talk About When We Talk About XML (Apologies to Raymond Carver)</title>
		<link>http://radar.oreilly.com/2008/09/what-we-talk-about-when-we-tal.html</link>
		<comments>http://radar.oreilly.com/2008/09/what-we-talk-about-when-we-tal.html#comments</comments>
		<pubDate>Tue, 30 Sep 2008 16:49:00 +0000</pubDate>
		<dc:creator>Laura Dawson</dc:creator>
				<category><![CDATA[Publishing]]></category>
		<category><![CDATA[authoring]]></category>
		<category><![CDATA[editing]]></category>
		<category><![CDATA[Marketing]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[startwithxml]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2008/09/what-we-talk-about-when-we-tal.html</guid>
		<description><![CDATA[Acronyms and initialisms are mysterious and potent, and frequently hide meaning and become shorthand for larger concepts. Just as ONIX became shorthand for &#34;metadata,&#34; XML (at least in book publishing land) is becoming shorthand for ... well, a lot of things. Repurposing content, creating templates for book design, tagging -- all of these are encompassed in the term &#34;XML workflow.&#34;... ]]></description>
				<content:encoded><![CDATA[<p>Acronyms and initialisms are mysterious and potent, and frequently hide meaning and become shorthand for larger concepts. Just as <a href="http://www.editeur.org/onix.html">ONIX</a> became shorthand for &#8220;metadata,&#8221; XML (at least in book publishing land) is becoming shorthand for &#8230; well, a lot of things. Repurposing content, creating templates for book design, tagging &#8212; all of these are encompassed in the term &#8220;XML workflow.&#8221;</p>
<p>So no wonder people get confused. Particularly people who are in the business of creating content, not formatting, categorizing, packaging and marketing it.</p>
<p>So what are we talking about when we&#8217;re throwing around this term? It depends on what you do for a living.</p>
<p>If you&#8217;re a writer, it might mean <a href="http://news.oreilly.com/2008/08/importing-xml-documents-into-w.html">using Word a little differently</a>, quite possibly according to specific author guidelines given to you by the publisher. It might also mean including lists of keywords along with your manuscript, perhaps even a separate list for each chapter.</p>
<p>If you&#8217;re an acquisitions editor, an XML workflow may mean deciding whether you want a book to merely exist as a print product (as a single source of revenue), or whether it&#8217;s also appropriate as an <a href="http://www.amazon.com/b/ref=topnav_storetab_kinh?ie=UTF8&amp;node=133141011">ebook</a>, to sell by the chapter (as numerous textbook publishers are doing), to publish iteratively (as O&#8217;Reilly does with its <a href="http://oreilly.com/roughcuts/faq.csp">Rough Cuts</a>), to make <a href="http://software.newsstand.com/bookrdr/hbg-live/BookBrowse.html?a=z7c8YUAy3v2X7XcCVaM6lwItJhoV9%2Feb%2F8FccjFJPUyuTzXUHvYXc7rV2fh0Hq7RnjIa%2FM6yHR0tIvCgPkrdSc7wwOe4LsmB2asdMzJtAYs7TVOtxvsdUMQX0YrFB0VZ&amp;z=hbg">excerpts available for free download</a>, etc.</p>
<p>If you&#8217;re a book production editor, <a href="http://authornet.cambridge.org/information/productionguide/stm/XML_workflow.asp">an XML workflow will be very concrete</a> &#8212; you tag a manuscript according to its format (&#8220;chapter heading,&#8221; &#8220;illustration,&#8221; &#8220;copyright page&#8221;), and a pre-defined style sheet is then applied to those tags.</p>
<p>If you&#8217;re in marketing, an XML workflow allows you to work with the author&#8217;s keywords, <a href="http://www.epicurious.com/recipes/food/views/SIMPLE-VEAL-PASTA-SAUCE-15070">target specific audiences for the content</a>, and <a href="http://cengagesites.com/academic/?site=1392">package the content in appealing ways</a>.</p>
<p>Could you do all of this without XML? Sure. You could use a relational database and shove your manuscript, chapter by chapter, into tables in SQL. You could assign keywords in a relational database. But you couldn&#8217;t do formatting. You could use InDesign or Quark to do your formatting. But you couldn&#8217;t break up your manuscript into &#8220;chunks&#8221; and repackage those &#8220;chunks&#8221; into new products with those programs. XML has the capacity to handle both, and handle them well.</p>
<p>Like most acronyms, XML is a tool. It&#8217;s not a goal in itself, but a way to get to your goal.</p>
<p class="related">Related:</p>
<ul class="related">
<li> <a href="http://toc.oreilly.com/2008/09/why-you-should-care-about-xml.html">Why You Should Care About XML</a></li>
<li> <a href="http://toc.oreilly.com/2008/09/visualizing-the-advantages-of.html">Visualizing the Advantages of StartWithXML</a></li>
<li> <a href="http://toc.oreilly.com/2008/09/chunks-and-verticals-and-niche.html">Chunks and Verticals and Niches &#8212; Oh, My!</a></li>
<li> <a href="http://toc.oreilly.com/startwithxml">See more StartWithXML posts</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2008/09/what-we-talk-about-when-we-tal.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
