Publishing

Technology is transforming publishing. From the way ideas are generated to the packaging of information to the delivery of products, the industry is in the midst of a sea change. We've always considered O'Reilly as much of a technology company as a publisher, a belief that's led us to develop information products such as GNN (the first commercial website), Safari Books Online, and the Tools of Change for Publishing conference. As publishers seek a new equilibrium in our networked world, we aim to be both a catalyst and chronicler of what has inevitably been called Publishing 2.0.


 

Recent Posts from TOC

Tue

Oct 20
2009

Nat Torkington

Four short links: 20 October 2009

Politics in The Age of Social Software, Ethernet Patents, Free Book Fear, Programming Exercises

by Nat Torkington (@gnat), comments: 7

  1. Poles, Politeness, and Politics in the Age of Twitter (Stephen Fry) -- begins with a discussion of a UK storm but rapidly turns into a discussion of fame in the age of Twitter, modern political discourse, the "deadwood press", and The Commons in Twitter Assembled. There is an energy abroad in the kingdom, one that yearns for a new openness in our rule making, our justice system and our administration. Do not imagine for a minute that I am saying Twitter is it. Its very name is the clue to its foundation and meaning. It is not, as I have pointed out before, called Ponder or Debate. It is called Twitter. But there again some of the most influential publications of the eighteenth century had titles like Tatler, Rambler, Idler and Spectator. Hardly suggestive of earnest political intent either. History has a habit of choosing the least prepossessing vessels to be agents of change.
  2. Apple and Others Hit With Lawsuit Over 90s Ethernet Patents -- unclear whether the plaintiff is 3Com (who filed the patents) or a troll who bought them. "We strongly believe that 3Com’s Ethernet technologies are being regularly infringed by foreign and some US companies," said David A. Kennedy, Chief Executive Officer of U.S. Ethernet Innovations. "We believe that the continued aggressive enforcement of the fundamental Ethernet technologies developed by 3Com against the waves of cheap, knock-off, foreign manufactured equipment is a necessary step in protecting the competitiveness of this American technology and American companies in general." (via Slashdot)
  3. The Point -- someone's publishing Mark Pilgrim's "Dive into Python", which was published by Apress under an open content license. Naturally this freaked out Apress (it's easy to imagine many eyelids would tic nervously should such a thing happen with one of O'Reilly's open-licensed books). Mark's response is fantastic. Part of choosing a Free license for your own work is accepting that people may use it in ways you disapprove of. There are no “field of use” restrictions, and there are no “commercial use” restrictions either. In fact, those are two of the fundamental tenets of the “Free” in Free Software. If “others profiting from my work” is something you seek to avoid, then Free Software is not for you. Opt for a Creative Commons “Non-Commercial” license, or a “personal use only” freeware license, or a traditional End User License Agreement. Free Software doesn’t have “end users.” That’s kind of the point.
  4. Programming Praxis -- programming exercises to keep your skills razor-sharp, with solutions.

tags: free, patent, politics, programming, publishing, social software, twitter

 

Thu

Sep 24
2009

Tim O'Reilly

Microsoft Press Enters Strategic Alliance with O'Reilly

by Tim O'Reilly (@timoreilly), comments: 32

Today, Microsoft and O'Reilly Media announced an agreement to support and expand Microsoft Press. Under the terms of the strategic alliance, O'Reilly will be the exclusive distributor of Microsoft Press titles and co-publisher of all Microsoft Press titles as of Nov. 30, 2009. We'll be working with Microsoft to develop new books, as well as distributing both existing and new co-published books to bookstores and, perhaps most importantly, to the emerging digital book channels that represent the future of book publishing. Microsoft could have chosen to partner with any of the major computer book publishers. That they chose to work with us is a testament to three advantages we bring to the business:

  1. O'Reilly is more than a book publisher. We are an advocate, a connector, and a community builder. We help developers and users make the most of technology, with a focus on what they need to know. Microsoft has a history of building great developer communities, but in today's world, those communities need to be connected with other communities outside Microsoft. Especially in technology, "the world is flat."
  2. O'Reilly plays a unique role in the technology ecosystem: from our earliest days, we provided the documentation for important technologies for which there was no "vendor." The internet, the World Wide Web, Linux and other open source software, and Web 2.0 all were documented and given mainstream awareness by O'Reilly books and events. We identify and evangelize the disruptive technologies that reinvigorate the industry.
  3. O'Reilly has been a pioneer in the new world of ebooks. In the early 1990s, we co-developed DocBook, one of the first standardized formats for ebooks, and the progenitor of future XML-based ebook formats. In 2001, in partnership with the Pearson Technology Group, we launched Safari Books Online, the largest and most comprehensive electronic subscription library of computer books and videos. We've built a successful direct business with DRM-free downloads of ebook bundles that work on any device. We're an early leader in publishing books for the iPhone and other portable reading devices, and in understanding how to use ebook channels to reach new customers. And of course, our Tools of Change for Publishing Conference (TOC) has become the place to share knowledge about the changes sweeping through publishing.
On this last point, I'm particularly excited that as part of this agreement, Microsoft has committed to make its ebooks DRM-free and device-independent. One of our goals at O'Reilly has been to make sure that ebook customers can read their ebooks on any device, and can keep using them even if they change their preferred device. Having Microsoft Press join us in this commitment is a big step forward towards an open ebook market.

tags: drm, microsoft, oreilly media, publishing

 

Wed

Sep 23
2009

Andy Oram

Worldwide Lexicon: matching up technologies and culture to end the language barrier

by Andy Oram (@praxagora), comments: 5

I've reported before on the Worldwide Lexicon, the brainchild of my friend Brian McConnell. His most recent breakthrough, which I blogged about in August, was an impressive Firefox plugin that exploits both human and machine translations on the Web to provide pages you can read in your primary language.

As attractive as the Firefox plug-in can be, it's only the first of four stages that Brian plans on the way toward a computing environment that encourages and leverages human translation. On the browser side, the next logical project is to reproduce the Firefox experience for IE users. Ultimately, he hopes the functionality becomes a standard part of every browser. Even better, he's working on a way to include the functionality on the server side so that it's browser-independent (although that technology would require support in the server software, of course).

And there's even more to come. He lays out his vision in an essay boldly titled The End Of The Language Barrier. The bottom of the article points to an equally important statement written for the World Economic Forum by Ethan Zuckerman, founder of the Global Voices site that extends the reach of weblogs to people in many countries who previously lacked access to such forums.

(continue reading)

tags: Brian McConnell, community, crowdsourcing, documentation, Ethan Zuckerman, Firefox add-on, Global Voices, language, peer production, polyglot, publishing, translation, wealth of networks, wisdom of crowds, World Wide Lexicon, WWL

 

Tue

Aug 25
2009

Andy Oram

World Wide Lexicon Toolbar changes the reading experience for the other 99% of web pages

by Andy Oram (@praxagora), comments: 8

Brian McConnell's latest coding effort, World Wide Lexicon Toolbar, meets my criterion for a piece of critical infrastructure: after two days with it I can't get along without it, and I plan to avoid any browser that doesn't have it installed.

Brian is a highly adaptive programmer. With roots in the telecom industry and several start-ups on his resume, he also wrote Beyond Contact: A Guide to SETI and Communicating with Alien Civilizations for O'Reilly. The World Wide Lexicon project he's been working on for the past several years is again something totally different.

Install the add-on (currently experimental) in Firefox 3.5 or higher and visit a page in some language other than your default. Before your eyes, headings and text change into your native language. You can get similar effects by submitting the page to a popular translator such as Google (which is one of the tools used behind the scenes by the WWL toolbar), but the instantaneous effect of the toolbar makes you feel closer to the people whose sites you visit around the world.

There are several languages that I know well enough to get the gist of a page, but where I miss some of the details and get frustrated by gaps in my vocabulary. Therefore, I set the WWL toolbar to "Bilingual view," so each block element of the original text is shown together with its translation. The bilingual view is considerably less attractive, because it swells the size of each block element, but I can tell already that it will improve my language skills quickly.

WWL is designed for volunteer translations. If it becomes more popular, people will submit translations that are much more accurate than the machine-generated ones the WWL must fall back on currently.

What's the process behind this new dimension to web browsing? McConnell let me in on some of the magic.

Volunteer translations

McConnell invented WWL several years ago with the core notion of encouraging people to translate web pages they thought should get a wider audience. When he first told me about the idea, I was skeptical that he would get many volunteers. But then I heard of other volunteer translation efforts. For instance, there's a whole subculture of people who write subtitles for popular Hollywood films. These subtitles run afoul of copyright law, of course (as do, probably, the copies of the movies they're attached to), but they show the lengths to which crowdsourcing has progressed in the translation area.

FLOSS Manuals, a project I do volunteer work for, also finds dozens of people willing to translate its open source documentation.

McConnell's first set of tools was designed to facilitate on-the-fly translations. Web designers could enhance their web sites by downloading from the WWL site some JavaScript that made each text element on the page editable. (I blogged about this in December 2007.) The paste-in displayed a little pencil icon, signaling to viewers that they could do instant translations. All they would have to do was click on an element, and a text box would pop up where they could enter their translation. The web site would then register the translation with the central WWL site.

World Wide Lexicon API

The WWL API covers the entire life cycle of a translation: registering a translation, rating translations for quality, searching for a translation of a particular page into a particular language, and retrieving a translation. Queries can specify a minimum rating.
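To make that life cycle concrete, here is a minimal in-memory sketch in Python. It is not the WWL API: the class and method names (`TranslationStore`, `register`, `rate`, `search`, `retrieve`) and the use of an average score as the "rating" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Translation:
    """One submitted translation of a page (hypothetical record shape)."""
    url: str
    target_lang: str
    text: str
    ratings: list = field(default_factory=list)

    @property
    def average_rating(self):
        # Assumption: an unrated translation counts as 0.0.
        return mean(self.ratings) if self.ratings else 0.0


class TranslationStore:
    """Hypothetical stand-in for a central translation registry."""

    def __init__(self):
        self._items = []

    def register(self, url, target_lang, text):
        """Record a new translation and return it."""
        item = Translation(url, target_lang, text)
        self._items.append(item)
        return item

    def rate(self, item, score):
        """Attach one quality score to a translation."""
        item.ratings.append(score)

    def search(self, url, target_lang, min_rating=0.0):
        """List translations of a page into a language, filtered by rating."""
        return [
            t for t in self._items
            if t.url == url
            and t.target_lang == target_lang
            and t.average_rating >= min_rating
        ]

    def retrieve(self, url, target_lang, min_rating=0.0):
        """Return the best-rated match, or None if nothing qualifies."""
        matches = self.search(url, target_lang, min_rating)
        return max(matches, key=lambda t: t.average_rating, default=None)
```

A caller would register a translation, let readers rate it, and then ask for the best match above some minimum rating; a real service would do the same steps over HTTP rather than in memory.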

Toolbar

The latest achievement of the WWL project is the toolbar officially released yesterday. It determines the user's native language through settings in the browser. When each page is visited, the toolbar uses the domain name and various tests on the text to make a guess about its language.

The toolbar then issues an API query to see whether any human translations exist. If so, it displays the translations with a light yellow or green background.

If no one has made a human translation (which is usually the case so far) the toolbar resorts to well-known machine translation services. It can make use of Google Translate, Apertium, and Moses, each of which offers an API, and will also query Babelfish when its API is ready. Machine translations are displayed with a light blue or grey background.
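The fallback described above, human translation first and machine translation otherwise, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `translate_element`, `find_human_translation`, and `machine_services` are hypothetical names, and the callables stand in for whatever queries the toolbar really makes.

```python
def translate_element(text, find_human_translation, machine_services):
    """Return (translated_text, source) for one text element.

    find_human_translation: callable returning a translation or None.
    machine_services: ordered list of callables, each returning a
    translation or None. Both are hypothetical stand-ins.
    """
    human = find_human_translation(text)
    if human is not None:
        return human, "human"

    # No human translation exists, so try each machine service in turn.
    for translate in machine_services:
        result = translate(text)
        if result is not None:
            return result, "machine"

    # Nothing worked: leave the original text in place.
    return text, "untranslated"
```

The returned source label is what a display layer could use to pick a background color, so readers can tell a human translation from a machine one at a glance.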

The progressive translation used by the toolbar is also interesting. It starts with the first 10 or 20 elements, then translates heading tags (<H1>, etc.), then the larger texts, and ultimately every element on a page. (I displayed one page that embedded a Google ad, and the translator recognized and translated that text too.) McConnell is working on making the various translations run in parallel. Because translation changes the sizes of elements, the toolbar makes various accommodations to display the page as attractively as it can.
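As a rough sketch of that ordering (and not the toolbar's actual code), the following Python function decides which elements to translate first. The function name, the `(tag, text)` element shape, and the 200-character threshold for a "larger" text are assumptions for illustration.

```python
import re

# Matches heading tags such as h1 or H3.
HEADING = re.compile(r"^h[1-6]$", re.IGNORECASE)


def translation_order(elements, first_n=10, large_text_chars=200):
    """Return element indexes in the order they would be translated.

    elements: list of (tag, text) pairs in document order.
    """
    # Pass 1: the first few elements, so the top of the page changes fast.
    order = list(range(min(first_n, len(elements))))
    seen = set(order)

    def take(predicate):
        for i, (tag, text) in enumerate(elements):
            if i not in seen and predicate(tag, text):
                order.append(i)
                seen.add(i)

    take(lambda tag, text: HEADING.match(tag) is not None)  # Pass 2: headings
    take(lambda tag, text: len(text) >= large_text_chars)   # Pass 3: larger texts
    take(lambda tag, text: True)                            # Pass 4: everything else
    return order
```

Each pass only picks up elements that earlier passes skipped, so every element is translated exactly once.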

In short, WWL is a cool combination of mash-ups, existing services, crowdsourcing, and Ajax. I'm sure that in a year's time I'll think back to its appearance today and be shocked at how primitive it was. But it will remain a transformative tool for me.

tags: Brian McConnell, community, crowdsourcing, documentation, Firefox add-on, peer production, publishing, wealth of networks, wisdom of crowds, World Wide Lexicon, WWL

 

Tue

Aug 25
2009

Nat Torkington

Four Short Links: 25 August 2009

Reverse Search, PDF Stripping, Flash Visualization, Failure

by Nat Torkington (@gnat), comments: 1

  1. Tineye -- reverse search engine; you upload an image and they find you similar images so you know where else it's used. Check out their cool searches.
  2. PDF Pirate -- upload a PDF and this web site will give it back to you minus the restrictions on copying/printing/etc.
  3. Flare -- an ActionScript library for creating visualizations that run in the Adobe Flash Player. BSD-licensed, modelled on Prefuse. When there's a visualisation library for every platform, will we start to get people who know how to make them?
  4. The Importance of Failure (Marco Tabini) -- This is a point that I don't often hear made when people talk about failure; the moral behind a failure-related story is usually about preventing it, or dealing with the aftermath, but not about the fact that sometimes things go bad despite your best efforts, and all the careful risk management and contingency planning won't keep you from going down in flames. This is important, because it forces every person to establish a risk threshold that they are willing to accept in every one of their life efforts.

tags: drm, failure, failure happens, flash, publishing, search, visualization

 

Fri

Aug 14
2009

Nat Torkington

Four short links: 14 August 2009

EPub FTW, SQL Horror, Computer Vision Explained, and A Massive Dump of Twitter Stats

by Nat Torkington (@gnat), comments: 1

  1. Page2Pub -- harvest wiki content and turn it into EPub and PDF. See also Sony dropping its proprietary format and moving to EPub. Open standards rock. (via oreillylabs on Twitter)
  2. SQL Pie Chart -- an ASCII pie chart, drawn by SQL code. Horrifying and yet inspiring. Compare to PostgreSQL code to produce ASCII Mandelbrot set. (via jdub on Twitter and Simon Willison)
  3. How SudokuGrab Works -- the computer vision techniques behind an iPhone app that solves Sudoku puzzles that you take a photo of. Well explained! These CV techniques are an essential part of the sensor web. (via blackbeltjones on Delicious)
  4. Twitter by the Numbers -- massive dump of charts and stats on Twitter. I love that there's a section devoted to social media marketers, the Internet's head lice. (via Kevin Marks on Twitter)

tags: book related, computer vision, ebooks, fun, iphone app, publishing, sql, statistics, twitter

 

Mon

Aug 10
2009

Nat Torkington

Four short links: 10 August 2009

Propaganda, Computer Science, Web Science, CS History

by Nat Torkington (@gnat), comments: 0

  1. The Propaganda Newspapers -- London councils are increasingly providing their own newspapers, masquerading as mass-market popular appeal newspapers but without anything critical of the councils that produce them. This is an evolutionary dead-end for reinventing newspapers, and is why the non-profit/trust structure works so well.
  2. Time for Computer Science to Grow Up -- publish in journals so conferences can be community events. I've seen academics at Sci Foo look around at the unconference structure, or lightning talks, and say "why can't my normal conferences be like this?!", and not just in computer science. Science conferences need a heart transplant. (via David Pennock)
  3. Science Online 2010 -- conference on science and the Web. Our goal is to bring together scientists, physicians, patients, educators, students, publishers, editors, bloggers, journalists, writers, web developers, programmers and others to discuss, demonstrate and debate online strategies and tools for doing science, publishing science, teaching science, and promoting the public understanding of science. (via kubke on Twitter)
  4. E.W. Dijkstra Archive -- a collection of over 1,000 manuscripts that EWD sent around during his career. EWD 1036, "On the cruelty of really teaching computing science". "From a bit to a few hundred megabytes, from a microsecond to a half an hour of computing confronts us with completely baffling ratio of 10^9" (via S. Lott)

tags: education, events, history, newspapers, people, publishing, science, web

 

Tue

Aug 4
2009

Nat Torkington

Four short links: 4 August 2009

NASA Cloudware, btrfs, eBook Editing, Exponential Death

by Nat Torkington (@gnat), comments: 1

  1. NASA Nebula Services/Platform Stack -- The NEBULA platform offers a turnkey Software-as-a-Service experience that can rapidly address the requirements of a large number of projects. However, each component of the NEBULA platform is also available individually; thus, NEBULA can also serve in Platform-as-a-Service or Infrastructure-as-a-Service capacities. Bundles RabbitMQ, Eucalyptus, LUSTRE storage, Fabric deployment, Varnish front-end, MySQL and more. (via Jim Stogdill)
  2. A Short History of btrfs -- Now for some personal predictions (based purely on public information - I don't have any insider knowledge). Btrfs will be the default file system on Linux within two years. Btrfs as a project won't (and can't, at this point) be canceled by Oracle. If all the intellectual property issues are worked out (a big if), ZFS will be ported to Linux, but it will have less than a few percent of the installed base of btrfs. Check back in two years and see if I got any of these predictions right!
  3. Sigil -- open source WYSIWYG eBook editor. (via liza on Twitter)
  4. Exponential Decay of Life -- This startling fact was first noticed by the British actuary Benjamin Gompertz in 1825 and is now called the “Gompertz Law of human mortality.” Your probability of dying during a given year doubles every 8 years. For me, a 25-year-old American, the probability of dying during the next year is a fairly miniscule 0.03% — about 1 in 3,000. When I’m 33 it will be about 1 in 1,500, when I’m 42 it will be about 1 in 750, and so on. (via Hacker News)
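The doubling rule quoted in the last item is easy to check with a few lines of Python. This is a back-of-the-envelope sketch that assumes clean exponential growth from the quoted starting point (1 in 3,000 at age 25, doubling every 8 years); the function name and parameters are made up for illustration.

```python
def annual_death_probability(age, base_age=25, base_probability=1 / 3000,
                             doubling_years=8):
    """Probability of dying within a year if the risk doubles every
    `doubling_years` years, starting from `base_probability` at `base_age`."""
    return base_probability * 2 ** ((age - base_age) / doubling_years)


for age in (25, 33, 41):
    odds = round(1 / annual_death_probability(age))
    print(f"age {age}: about 1 in {odds}")
```

Under these assumptions the odds halve at 33 and again at 41; at 42 the same formula gives roughly 1 in 690, so the quoted "1 in 750" at 42 is a rounded figure.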

tags: bio, cloud computing, data, ebooks, math, publishing, storage

 

Mon

Aug 3
2009

Nat Torkington

Four short links: 3 August 2009

Mathematics Collaboration, Risk, Visualisation, and SemWeb

by Nat Torkington (@gnat), comments: 0

  1. Enabling Massively Parallel Mathematics Collaboration -- Jon Udell writes about Mike Adams, whose WordPress plugin to grok LaTeX formatting of math has enabled a new scale of mathematics collaboration.
  2. 2845 Ways to Spin The Risk -- introduction to the ways in which our perception of risk (and numbers in general) can be distorted by how it is presented. (via titine on Twitter)
  3. Logstalgia -- OpenGL app to visualize Apache log files.
  4. 4Store -- "scalable RDF storage". 4store was designed by Steve Harris and developed at Garlik to underpin their Semantic Web applications. It has been providing the base platform for around 3 years. At times holding and running queries over databases of 15GT, supporting a Web application used by thousands of people. (via joshua on Delicious)

tags: brain, collaboration, crowdsourcing, database, math, publishing, semantic web, visualization

 

Wed

Jul 15
2009

Mark Drapeau

Bantamweight Publishing in an Easily Plagiarised World

by Mark Drapeau (@cheeky_geeky), comments: 10

Even professional writers are prone to infrequent accidental plagiarism. But in the world of novels, newspapers, and college exams, there are rules about bootlegging others’ work that are well-established - most everyone agrees on what behaviors are unacceptable and what the consequences are. In bantamweight publishing, however, the rules are not so clear.

In order for the British Army to raise more units during the First World War, it created battalions of otherwise healthy men with lowered minimum height requirements. In this way, short, powerful miners and similarly swarthy individuals were able to contribute to the war effort. These soldiers were called bantams (a term now heard most commonly in boxing, bantamweight). Similarly, in a Web 2.0 environment, the short powerful bursts of searchable, findable, and sharable data emitted from personal electronic devices are a form of bantamweight publishing in which persons outside the regulated publishing industry can contribute to the information sharing effort.

Bantamweight publishing comes in many forms. Twitter is certainly in this category, but there are a steadily increasing number of ways to share small bits of information with the world. From updating your Facebook Wall to Yammering inside your enterprise to updating your LinkedIn status to commenting on people’s BrightKite locations, everyone is doing it. But in an easily plagiarized world, who owns your sentences once you publish them? It’s not really clear. And in a murky environment where someone might get a macropublishing book deal by popularizing someone else’s creative hashtag, bantamweight publishing runs the risk of serious future problems.

Oh, bantamweight publishing has its customs. Self-policing crowds ensure that most people who lift someone else’s excellent quote or funny picture or news link give credit to the originator using the “retweet” (RT) convention followed by a username. But there is little downside to cheating relative to being expelled from college or fired from your newspaper. As is well known in animal behavior circles, it can be temporarily advantageous for cheaters to infiltrate a system like this.

To be sure, quoting someone’s original haiku verbatim and making it appear as if it were your own is an infraction of bantamweight publishing customs. But what if someone tweets an Abraham Lincoln quotation - must the re-tweeter cite the originator? The custom seems less pressing in this case, mainly because of a lack of intent to deceive and arguable "fair use" of a well-known statement by a famous person. One can imagine altruistic plagiarism as well, where people repeat memes to raise money for charity, or virally make people aware of an immediate Amber alert. Further, who could fault someone for copying information about a charity onto their Facebook Wall without citing the originator? In the bantamweight publishing world, information sharing can easily supersede attribution. There are gradations of citations.

Bantamweight publishing is popular among those who feel brevity is a virtue. But when an entire work of art is bounded in 140 characters, even brevity has its limits. Sometimes, squeezing in a proper attribution through editing content can change the original meaning, when the edits unwittingly shift from cosmetic to substantive. And what happens when you run out of space when attempting to retweet someone who retweeted someone who tweeted an important quotation from the Washington Post? To a large degree, a work of bantamweight publishing is like a painting with an upper weight limit, where the novelty is the canvas and the attribution is the frame; most viewers would choose to appreciate the canvas without the frame if given the hard choice.

Another major difference between regular publishing and bantamweight publishing is the lack of research and editing standards. Sometimes people attribute flawed information properly. It is obvious that excellent curators of information like NYU professor Jay Rosen and publisher Tim O’Reilly are exceptions to the rule, based simply on the phenomena of Rick Rolling, #moonfruit, and celebrity death hoaxes. To many, bantamweight publishing is not a micro-investigatory piece to be researched, sourced, edited, and spread, but rather a form of enhanced social chatter and gossip spreading. And according to the rules of gossip, it doesn’t really matter where it comes from; gossip is fun.

Few would argue that the British bantam units were a bad idea, and likewise bantamweight publishing has many virtues. But there are also pitfalls to this in an easily plagiarized world, particularly when money comes into play. Who’s looking out for the intellectual property of a winning hashtag that becomes a book, or a stream of haikus that becomes a blog that companies advertise on? At some point, bantamweight publishing will no longer be a lawless frontier territory; what will it look like next?

tags: emerging tech, publishing, twitter, web 2.0

 

Wed

Jul 1
2009

Nat Torkington

Four short links: 1 July 2009

Web Awards, Speed Thrills, Magazines in the Cloud, Augmented Reality

by Nat Torkington (@gnat), comments: 0

  1. The Onyas -- New Zealand web design awards launch, from the people behind Webstock and Full Code Press. The name comes from "good on ya", the highest praise that traditionally taciturn New Zealanders are allowed by law to give.
  2. The Year of Business Metrics: Don't make your users run away! -- wrap-up of the Velocity conference. AOL: Users who had a slower experience view far fewer pages. Some interesting notes on performance from a Google-Bing study: Notice that as the delays get longer the Time To Click increases at a more extreme rate (1000ms increases by 1900ms). The theory is that the user gets distracted and unengaged in the page. In other words, they've lost the user's full attention and have to get it back. [...] As much as five weeks later, some users, especially those who saw delays greater than 400ms, were still searching less than before. (via timoreilly on Twitter)
  3. Printcasting -- very simple content management system for print magazines that lets anyone start a magazine, add content, sign up contributors, sell ads, and go. Clever!
  4. Pachube Augmented Reality Hack -- sexy hack that pushes all my buttons: computer vision, Arduino, sensor network, ubiquitous computing, pervasive alternate reality cyborg villains with chalk designs hellbent on world domination and the enslavement of the human race to use as meatsack AA batteries for their sex toys. Okay, four out of five ain't bad. (via bruces on Twitter)

Pachube Augmented Reality Demo

tags: award, computer vision, hacks, performance, print on demand, publishing, sensor networks, velocity09, web

 

Wed

Jun 24
2009

Tim O'Reilly

My 140conf Talk: Twitter as Publishing

by Tim O'Reilly (@timoreilly), comments: 6

I spoke at Jeff Pulver's 140conf a few weeks ago. My subject was the continuity of what I do, from publishing through conferences through my presence on Twitter. I tried to draw the connections, and to explain how "social media" means drawing from, curating, and amplifying the voices of a community. I suggest that the role of an editor and publisher is analogous to the role of a point guard in basketball, handing out "assists" and improving the performance of his or her teammates. After all, I point out, I couldn't possibly tweet enough to cover all the topics I am interested in. But by using my retweets to build the visibility of others, I can create and foster a community that cares about the ideas, trends, and people that I care about.

My talk starts about 1:40 into the video, after a few comments from Jeff Pulver, the conference organizer. I've provided a lightly edited and linkified transcript below, for those of you who don't have time to watch the entire 15 minute video. If you do have the time, you can watch the video from the entire two-day conference at http://www.140conf.com/watchit.

What I learned from Twitter

Hi. I want to talk to you a little bit about Twitter and media. I'm a publisher. I'm a publisher in print. And it turns out I'm also a publisher on Twitter. I want to explain the roots of media and how that connects with what we're doing in this newest form of media.

When you think about the original use case of Twitter, which @Leisa described so wonderfully as “ambient intimacy,” it's really news from your close friends. But it's news nonetheless. And sometimes the news from individuals becomes news that matters to a whole lot more people. When someone in Tehran today is reporting their personal news, it's news that matters to all of us. And so you can see the continuum between the personal and the international in those moments.

But that continuum exists all the time, and it's existed always in media.

(continue reading)

tags: 140conf, publishing, twitter

 
