OCA vs. Google Print Library Project?

In the recent news about Microsoft joining the Open Content Alliance (OCA), I was struck by the curious framing of the announcement:

The move comes as Google faces growing legal pressure from publishers over its own global digital library plans.

Microsoft said it would initially focus on works already in the public domain.

This way, it hopes to avoid similar legal issues over copyright.

This PR positioning makes me think that the OCA, a worthwhile effort (to which O’Reilly has contributed content), is being hijacked by Microsoft as a way of undermining Google. In fact, the OCA addresses only a subset of the “lost content” problem in print book publishing that is addressed by Google Print for Libraries.

According to a recent study by the Online Computer Library Center, which analyzed the collections of the five libraries participating in the Google Print for Libraries project, only about 20% of the 10.5 million unique titles in those collections are out of copyright, using 1923 as the dividing line before which you can assume books are out of copyright. That 20% is the realm of efforts like OCA. Meanwhile, another 10-20% are under copyright, in print, and being commercially exploited. This is the realm of titles opted in by publishers to programs like Google Print or Amazon Search Inside the Book. That leaves 60-70% of all titles ever published in the twilight zone: out of print, but still under copyright. For many of these books, no one even knows any longer who owns the rights, and there is no commercial incentive to figure it out, which makes the publishers’ request for “opt in” a fig leaf that will ultimately lead only to continued neglect.

[Figure: the three groups of books (3groupsofbooks.tiff)]

As I’ve written previously, Google Print is the only effort that attempts to cut the Gordian knot entangling titles that are under copyright but no longer being commercially exploited. Working with libraries to build a searchable index of their collections is a brilliant application of the principles of copyright fair use that will unlock the vast number of books in this middle category. What’s so beautiful about this approach is that as search helps users rediscover value in these “lost works”, publishers and authors will have an economic incentive, missing under the current situation, to discover and assert their ownership. OCA is a complementary effort, but it does not at all address the same problem.