Wed

Sep 28
2005

Tim O'Reilly

NY Times Op Ed on Author's Guild Suit Against Google

My NY Times Op Ed piece about Google Library, cleverly entitled Search and Rescue by the headline editors at the Times, ran this morning. Interestingly, I was contacted by the Times to write this Op Ed as a result of my previous blog posting about Google Library.
 

I was delighted to see that the Times op-ed agreement allows for republication of the piece, as long as acknowledgment is given. So the full text as it appeared this morning is given below.

Search and Rescue
 

AUTHORS struggle, mostly in vain, against their fated obscurity. According to Nielsen Bookscan, which tracks sales from major booksellers, only 2 percent of the 1.2 million unique titles sold in 2004 had sales of more than 5,000 copies. Against this backdrop, the recent Authors Guild suit against the Google Library Project is poignantly wrongheaded.

The Authors Guild claims that Google's plan to make the collections of five major libraries searchable online violates copyright law and thus harms authors' interests. As both an author and publisher, I find the Guild's position to be exactly backward. Google Library promises to be a boon to authors, publishers and readers if Google sticks to its stated goal of creating a tool that helps people discover (and potentially pay for) copyrighted works. (Disclosure: I am a member of the publisher advisory board for Google Print. As the name implies, it is simply an advisory group, and Google can take or leave its suggestions.)

What's causing all the fuss? Google has partnered with the University of Michigan, Harvard, Stanford, the New York Public Library and Oxford University. Google will scan and index their library collections, so that when a reader searches Google Print for, say, "author's rights," the results point to books that contain that term. In a format that resembles its current Web search results, Google will show snippets (typically, fewer than three sentences of text from each page of each book) that include the search term, plus information about the book and where to find it. Google asserts that displaying this limited amount of content is protected by the "fair use" doctrine under United States copyright law; the Authors Guild claims that it is infringement, because the underlying search technology requires a digitized copy of the entire work.

I'm with Google on this one. It would certainly be considered fair use, if, for example, I circulated a catalog of my favorite books, including a handful of quotations from each book that helps people to decide whether to buy a copy. In my mind, providing such snippets algorithmically on demand, as Google does, doesn't change that dynamic. Google allows click-through to the entire book only if the book is in the public domain or if publishers have opted in to the program. If it's unclear who owns the rights to a book, only the snippets are displayed.

A search engine for books will be revolutionary in its benefits. Obscurity is a far greater threat to authors than copyright infringement, or even outright piracy. While publishers invest in each of their books, they depend on bestsellers to keep afloat. They typically throw their products into the market to see what sticks and cease supporting what doesn't, so an author has had just one chance to reach readers. Until now.

Google promises an alternative to the obscurity imposed on most books. It makes that great corpus of less-than-bestsellers accessible to all. By pointing to a huge body of print works online, Google will offer a way to promote books that publishers have thrown away, creating an opportunity for readers to track them down and buy them. Even online sellers like Amazon offer only a small fraction of the university libraries' titles. While there are many unanswered questions about how businesses will help consumers buy the books they've found through a search engine for printed materials that is as powerful as Google's current Web search, there's great likelihood that Google Print's Library Project will create new markets for forgotten content. In one bold stroke, Google will give new value to millions of orphaned works.

I'm sorry to see authors buy into the old-school protectionism of the Authors Guild, not realizing they're acting against their own self-interest. Their resistance can come only from a failure to understand the nature of the program. Google Library is intended to help readers discover copyrighted works, not to give copies away. It's a tremendous service to authors that will help them beat the dismal odds of publishing as usual.

Unfortunately, space constraints forced one key paragraph to be cut. Since that paragraph seems to me to be the one that captures the heart of my support for Google's position, I'll give it to you here:
Google is also solving a huge problem for the publishing industry. Because no one knows who owns many of the works in question, Google's innovative deal with libraries is the only practical approach. It sweeps up all the loose ends of forgotten rights and ignored works. As the public discovers the value of these works, publishers and authors are incentivized to track down and assert their ownership in order to opt in to the revenue sharing offered by the Google Print service.
P.S. It takes a lot of wordsmithing to get a complex issue like this down to 750 words, and I appreciated the help of the NY Times op ed editor, Eric Etheridge, plus Sara Winge, Roberta Cairney and Kevin Kelly, who read and commented on my drafts.


Comments: 14

  Deirdre' Straughan [09.29.05 12:54 AM]

A concrete example:

I was working on a very obscure topic - a history of Woodstock School in India - when Amazon's "Search Inside the Book" feature became available. Searching for "Mussoorie," the name of the town where the school is located, I found several mentions in Dane Kennedy's "The Magic Mountains," a scholarly book about British hill stations in India. The Amazon feature allowed me to look at the relevant pages, determine that the book would be a useful addition to the school's library of historical material, and buy it for the school. Professor Kennedy got a sale he otherwise wouldn't have. Everyone benefited.

The Google Library would be enormously useful in this project, and I look forward to using it!

  J.T. Wenting [09.29.05 09:07 AM]

Maybe, if it is (and remains) limited to that.
But I fear that that's not to be and that in short order you'll find yourself able to retrieve the entire content of tens of thousands of in-print books online without a dime being paid to the copyright holders.
Google, as always, states that it can change the terms of its TOS at any time without prior notice and, even worse, forces you to accept those TOS, because it doesn't even try to contact you to gain permission for using your work.
Instead authors and publishers will have a new fulltime task which is to monitor Google Library to see if their work is now being plagiarised by the company, only to find that the TOS read their failure to indicate they didn't want it to happen a year in advance allows Google to do so...
The TOS may not read so now, but does allow Google the leeway to change it to read like that at any time.

  Watts [09.29.05 01:41 PM]

J.T., they could rewrite the TOS to say that by doing any Google search that returns a result which contains the word "kumquat," they get to break your kneecaps. But they couldn't enforce that. You don't get to break the law by explicitly giving yourself written permission to do so.

The issue here is simply fair use. Whether it's fair use for Google to return a few sentences of a book in a search is a different legal issue than whether they can return the whole book: the last case is clearly a copyright violation. They don't get to legally violate copyright by putting it in the TOS any more than they'd get to legally break kneecaps.

  Arzach [09.29.05 08:59 PM]

Anyone who read O'Reilly's bit and can trust that what he says about the manner in which Google plans on making the books scanned available as a reasonable facsimile of what's going on here should be either SHOCKED that such an incredible body of knowledge should be indexed thusly, and SHOCKED that it might actually benefit both authors and readers in a manner NEVER BEFORE SEEN by humankind and therefore AFRAID that people, literate humans might actually benefit from this sharing of knowledge *-OR-* you'll just be amazed that this kind of thing is really getting off the ground and completely happy that, on some level, we're making progress towards a better understanding and knowledge of ourselves.

How you react will ultimately mark you as a neo-luddite or part of the next generation.

  Tim O'Reilly [09.30.05 10:30 PM]

Someone asked me in email whether or not I own any Google stock.

The answer is that I don't. If I did, I would have disclosed that fact in my op ed, just like I disclosed that I was on the Google Publisher Advisory Board (which, by the way, is an uncompensated position. It's also a position that I share with a bunch of publishers, many of whom don't share my views on Google Library (at least they didn't, though I think I'm starting to persuade a few of them.))

I did own Google stock in the IPO, due to our investment in Pyra Labs (blogger.com), which was sold to Google, but I sold all my shares earlier this year. And I should add that I didn't sell them because of any opinion about Google's future direction or potential. It's just that over the years I've found that owning shares of individual stocks is too stressful for me. I don't really pay attention, until the stock makes a sudden move. If it's up, you get unnecessarily excited, and if it's down, unnecessarily depressed at having missed the right time to sell. I've tried a variety of strategies over the years, and ended up finding that I was most comfortable having a money manager worry about what to invest in, so I can focus on other things that are more important to me.

  analogAI [10.21.05 10:52 AM]

The business model has been changing over time, as it has in other content industries. Content became cheaper with newer technologies: the prevalence of free works and lower publishing barriers increased the supply in a supply-and-demand model. The question now is what we can do to help the reader find what to read, and to help the publisher win the reader's attention. I'm a researcher, and I have a limited amount of time. If I can't find something quickly in a dead-tree book, I'll turn to the internet to find a book that is indexed and buy that other book instead. Help me save time and I will help you get the book sale revenue.

  Sunwolf [10.23.05 04:32 PM]

I wonder, though; would it be possible to make a program that would run a search on Google Print, grab the three sentences, run searches on those words, grab the next three sentences, and so on until the entire text is compiled?

  Nick P [10.24.05 03:10 AM]

Sunwolf - I thought that very thing and went out and tried it (print.google.com IS up after all), and certain pages are not available so it would be of very limited use. I'm not sure exactly how the system should work - limit based on successive searches you've done (e.g. through a logged in account) vs absolute limit on certain parts of books - because each has its benefits and drawbacks. Nevertheless, the absolute limits on content of certain pages means nobody can get a full book off of G!P.

  Andrew S [10.27.05 12:42 AM]

I believe that the authors and publishers realize that in the long run, having their work be easily accessible and searchable is to their benefit. They aren't dumb! Your column is largely about this, but misses the dollars and cents:

What else do the authors and publishers see? They see that Google has become a hundred BILLION dollar company based largely on their selling ads on top of snippets of web sites. They see that Google would like to do the same with books, and perhaps gain another few BILLION dollars.

Let me put forth this scenario:

Let's say some company offered you, an author, an additional one thousand dollars if you gave them rights to scan in your book. Then, they make a hundred thousand dollars off of having your book online. Wouldn't you feel unjustly treated? Ripped off? Or feel like you could have gotten a better deal elsewhere?

Now, add in that they didn't even *ask* for permission to scan in and OCR your book. Now how would you feel?

Even though you have the extra thousand dollars, you still feel like a stooge.

If Google's goal is to make as much money as possible off of authors, then they are heading down the right path. If their goal is to truly spread knowledge to as many people as possible, they should add their efforts to the Open Content Alliance.

  Adam [11.03.05 08:47 AM]

It seems clear that Google does not seek to make money on the talents or property of the authors. Google Print provides a service, and the profits will be due to the service--not the work of the authors. The authors didn't create the service. Why do they insist on making money from the hard work of Google engineers? By the way, aren't websites copyrighted as well? Why are search engines able to provide text snippets from the website if it is copyrighted?

  Angelle [11.11.05 12:04 AM]

Google Print provides a service, and the profits will be due to the service--not the work of the authors.

Adam,

The only way for Google to provide this "service" as you call it is by scanning the authors' copyrighted intellectual properties. No content = no service.

So I think that the authors should be entitled to something.

  Mary [12.12.05 11:51 AM]

i just read wsj.com news "Harper collins plans to control its digital books" (12/12/05) by charging $0.10 per page. i've read some other online commentary -- all web publisher product and blogs. i kept searching for discussion of (a) impact of AdSense sales/ print.google 'fair use' of author's IP (b) harvesting existing dbases, e.g. ISBN, US copyright office, Lib. Congress freaking dewey decimal (c)channel cannibalism. some one pls show me the way -- if i can get 3 sentences for free that precisely fit my, er, limited research time -OR- jump to some advertiser's digest, why the hell would i go to a library or buy a book?

  Mary [12.12.05 12:17 PM]

P.S. Adam, i can't imagine what you do for a living. i suspect you are a 'consumer', not a 'producer'. these are old-fashion labels, to be sure. nonetheless, think about how technology transforms the value of your work and the work of someone who invests TIME in an original work and why they do so. Andrew and Angelle make this point in different ways. it isn't always money or 'long-tail' obscurity that motivate an author. i fear, your respect for google engineers is misplaced. pity the algorithm or the dud(ette) at the ocr scanner earning $7/hr -- no options (?). google profits derive from diminishing marginal cost of reproducing all or part of someone else's work. also, pity the frailty of us copyright law. yes, even a website constitutes copyright of the author. but copyright is no guarantee; only price or litigation guarantees copyright. see US copyright law sect 107 in particular, if you have the time.
