Scholarship and Mass Digitization

I was delighted to hear today that the Council on Library and Information Resources (CLIR) has just received a small grant from the Mellon Foundation to study the utility of major mass digitization projects such as Google Book Search, Microsoft Live Book Search, and the Open Content Alliance for scholarship.

To paraphrase some of their supporting grant documentation (not presently online):

We are now either at or very close to the point where the body of originally analogue material now in digital form is of such quantity and quality that we must facilitate the design and operation of broad-scale distributed digital libraries. Looking closely at the quality and functionality of these projects for scholars is vital to making sure that operationalization supports these important social uses, ones that Google and others are not likely to have as a first priority.

The proposal continues:

[O]bservers variously contend that large-scale projects such as Google’s and Microsoft’s will enable new discovery of literary and other works not currently accessible to the public, will democratize knowledge, and will contribute to the public good in unprecedented ways. Others fear that the information held in these projects will be eventually sold as a commodity, decreasing access to it for the less affluent. Given the extraordinary costs associated with these projects, there may not emerge any competition to Google or Microsoft, and the market will be thus tightly held as a near monopoly.

Commercial mass digitization projects have been designed to maximize the ability to discover and retrieve as much information as intellectual property rights and licensing permit, but with the ulterior and necessary motive of delivering revenue generation through advertising and transactional support. Google and Microsoft have not included scholars in the design of these services (except to the extent that they might assert that their software engineers are adequate proxies for university faculty in departments beyond those in CS/EE), and therefore the more specific needs of scholars, researchers, and teachers are not likely to be addressed.

CLIR’s project will have several aims:

  1. Assess selected large-scale digitization programs by exploring their efficacy and utility for conducting scholarship in multiple fields or disciplines (humanities, sciences, etc.).
  2. Write and issue a report with findings and recommendations for improving the design of mass digitization projects.
  3. Create a Collegium that can serve in the long-term as an advisory group to mass digitization efforts, helping to assure and obtain the highest possible data quality and utility.
  4. Convene a series of meetings amongst scholars, libraries, publishers, and digitizing organizations to discuss ways of achieving these quality and design improvements.

This is an interesting effort, and it will be fascinating to read the report that it ultimately generates.
