Sun, Sep 30, 2007

Peter Brantley

Making a Brouhaha in the Blogosphere

Two weeks ago, Carl Malamud and I wrote to the U.S. Copyright Office seeking the release of their copyright registration database to the public without restriction. The letter, co-signed by prominent librarians and legal experts, asserts that the copyright catalog of monographs, documents, and serials should be freely available; it is a public resource, the fuel driving the copyright system itself.

Presently, the Copyright Office charges $55,125 for the retrospective online database and $31,500 for a current-year subscription that must be renewed annually, for an entry cost of $86,625. Copyright records are available for free only through what the Copyright Office calls a "record-oriented" interface, which has the functionality one would expect of an IBM 3270 terminal emulator dressed up in a style sheet.

In a voicemail that Marybeth Peters, the U.S. Register of Copyrights, left for Carl Malamud, Ms. Peters clarified that there is no copyright on any of the Copyright Office records; that they are "public records" and they should be "openly available." Ms. Peters identified the Library of Congress' Cataloging Distribution Service (CDS) as the unit responsible for providing access to the database; the CDS asserts it was mandated by the U.S. Congress to provide this service "at a charge of production and distribution cost plus 10%." Carl and I have learned they have only two customers for this particular "product" and we don't quite get the business model behind this constitutionally-mandated database.

The Library of Congress has responded to our request to fully release the database solely by describing it as "a bit of a blogospheric brouhaha over what the Library of Congress charges."

We're sympathetic with the desire of the Library to raise revenues, but this product isn't theirs to sell. This is a public resource and all 21 million records of the database are now available in bulk, without restrictions [http | ftp].


Comments: 9

  Brian Vargas [09.30.07 05:04 PM]

You say, "this product isn't theirs to sell." Perhaps it's not theirs - and by implication belongs to the tax-paying public - but nevertheless Congress has directed them to sell it. Period. End of story.

If you don't like it, take it up with your Representatives and Senators.

*Fine print: Though I happen to be a contractor for the Library of Congress (in an area quite unrelated to copyright), these opinions are my own, and in no way connected to the Library of Congress.*

  Tony Stubblebine [09.30.07 08:37 PM]

Great that it's now available in bulk, but your post doesn't say where it came from. You call out the LoC for their poor response so I assume they didn't provide it to you. Did you steal it? Did you pay for it? Seems like there's interesting back story here.

  Peter Brantley [09.30.07 09:02 PM]

Hello -

Records can be easily obtained through the U.S. Copyright Office, which is what Public Resource did. Nothing was "stolen" from the Copyright Office.

  Matt Raymond [10.02.07 06:37 AM]

The term "blogospheric brouhaha" was mine as an individual blogger and it was in no way an attempt to downplay this issue. It was an alliterative way to point out that there has been a lot of debate about this in the blogosphere.

And I think it is fair to point out that that term was not how the Library of Congress "solely" responded as an institution. Although you linked to it above, in case people do not follow your link, the entire response is below:

Regarding Pricing on Bulk Access to Copyright Cataloging Information

Recent questions and concerns have arisen regarding the cost of providing the Copyright Cataloging database subscription service to the public.

The U.S. Copyright Office neither sets the price nor receives any direct revenue from the sale of the Copyright Cataloging database. Rather, access to these records is a service offered through the Cataloging Distribution Service (CDS) of the Library of Congress, which is mandated by Congress to provide this and other services to the public at a charge of production and distribution cost plus 10%. In fact, the mission of CDS is to share the Library’s vast bibliographic resources with American libraries, the American people and the international information community on a cost-recovery basis.

These databases and their weekly updates require considerable personnel and other resources to maintain and deliver. Each year, CDS evaluates its implementation and maintenance costs and determines pricing of its many products based on these costs. At the close of the fiscal year on Sept. 30, CDS will make recommendations to Library management for cost adjustment on all its products and services, based upon its Congressional mandate.

Fortunately, recent cost savings realized within CDS are anticipated to result in a drop in the price of many services available from CDS, including the Copyright Cataloging database subscription service. Any new pricing structure will appear first on the CDS Web site www.loc.gov/cds/ in late October or early November 2007, then in the 2008 CDS Catalog of Products in January 2008.

Finally, the Copyright database is accessible to all free of charge on a record-by-record basis through the U.S. Copyright Office Web site at www.copyright.gov/records/.

  Karen Coyle [10.02.07 08:20 AM]

Peter, I looked at some of the records and I'm not exactly sure how we can make use of them. The data is very minimal. The ones for books that I found have only the title, no author, no nothing else. And sometimes even the title is wrong:


=LDR 00450npc 22001452i 4500
=001 5536409
=005 20070612040307.3
=008 070612n20030411xx\||||||||||||||||||||\\
=017 \\$eV3497D681$f2003-04-11$rV3497 D681-682 P1-36$2usco
=027 \\$aV3497D681
=035 \\$a(DLC-CO)V 00349768100050
=040 \\$aDLC-CO$cDLC-CO
=245 04$aThe buck pases Flynn;$hbook.
=787 0\$t2025 & 131 other titles. (Part 001 of 002)$wV 00349768100000
=917 \\$aV 00349768100050


(That should be "passes".) So, is there more information somewhere? I can't believe that anyone would pay good money for this!
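
For readers who want to inspect records like the one quoted above, here is a minimal Python sketch that splits the line-per-field (mnemonic) MARC text into tags, indicators, and subfields. It assumes the layout seen in the sample: `=TAG`, a space, two indicator characters (a backslash standing for a blank), then `$`-delimited subfields. The function name `parse_mnemonic_marc` is hypothetical, and the bulk files themselves may well be in a different, binary MARC form that would need a dedicated library instead.

```python
# Sample lines copied from the record quoted above (mnemonic MARC text).
RECORD = r"""
=LDR 00450npc 22001452i 4500
=001 5536409
=017 \\$eV3497D681$f2003-04-11$rV3497 D681-682 P1-36$2usco
=245 04$aThe buck pases Flynn;$hbook.
=787 0\$t2025 & 131 other titles. (Part 001 of 002)$wV 00349768100000
"""


def parse_mnemonic_marc(text):
    """Return a list of (tag, indicators, subfields) tuples.

    Hypothetical helper: assumes one field per line in the layout shown above.
    """
    fields = []
    for line in text.splitlines():
        if not line.startswith("="):
            continue  # skip blank lines and anything that is not a field
        tag, _, rest = line[1:].partition(" ")
        rest = rest.lstrip(" ")
        if tag == "LDR" or tag < "010":
            # The leader and control fields (001-009) hold one raw value,
            # with no indicators and no subfield codes.
            fields.append((tag, None, [("", rest)]))
            continue
        # A backslash in the indicator positions stands for a blank.
        indicators = rest[:2].replace("\\", " ")
        # Each chunk after a "$" is a one-character code followed by its value.
        subfields = [(chunk[0], chunk[1:]) for chunk in rest[2:].split("$") if chunk]
        fields.append((tag, indicators, subfields))
    return fields


if __name__ == "__main__":
    for tag, indicators, subfields in parse_mnemonic_marc(RECORD):
        print(tag, repr(indicators), subfields)
```

Run on the sample, this shows how little there is to match on: the 245 field carries only a title and a medium, and the 017 field carries the document number and the recordation date.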

  Peter Brantley [10.02.07 09:09 AM]

Karen,

per Carl:

"Usually, you combine this data with the Marc records from the card catalog or some other source."

Random searching indicates that many records are quite small; I think we do not pull the Notes field, but if you do a record search for this item in the Copyright Office's query interface, this is pretty much what you get. Try it on others, and let us know.

  Karen Coyle [10.02.07 11:17 AM]

Peter, you have to have enough data to match it to the MARC records -- that's what I was specifically looking for. This is far, far from enough. I looked at the records on the copyright database that these link to -- that has a date and the names of the copyright holders, although just their names. This one pulls up a long list of companies, starting with:

Artisan Entertainment, Artisan Entertainment Inc., Artisan Entertainment, Inc., Artisan Television Inc., Artisan Television, Inc., Heatwave Productions Inc., Heatwave Productions, Inc., ...

It also gives you:

Date of Recordation: 2003-04-11
Entire Copyright Document: V3497 D681-682 P1-36
Date of Execution: 3Dec02

"Date of recordation?" ;-) Now THAT's federal speak if I ever heard it. But the main thing is that we can see that there is a full document someplace, albeit probably not in machine-readable form.

If we ever do get MARC records connected to these, we need to upgrade the copyright database with decent bibliographic data.

  Karen Coyle [10.02.07 11:23 AM]

Peter, we don't have enough data to match to the MARC records -- that's what I was specifically looking for. This is far, far from enough. I did find some records that have a bit more -- publisher and date of publication. That's still iffy if there is more than one edition, but it might be a start.

If we ever do get MARC records connected to these, we need to upgrade the copyright database with decent bibliographic data.


I looked at the record on the copyright database that this links to (and that presumably is retrievable by the record number) -- that has a date and the names of the copyright holders, although just their names. This one pulls up a long list of companies, starting with:


Artisan Entertainment, Artisan Entertainment Inc., Artisan Entertainment, Inc., Artisan Television Inc., Artisan Television, Inc., Heatwave Productions Inc., Heatwave Productions, Inc., ...


It also gives you:

Date of Recordation: 2003-04-11
Entire Copyright Document: V3497 D681-682 P1-36
Date of Execution: 3Dec02

"Date of recordation?" ;-) Now THAT's federal speak if I ever heard it. But the main thing is that we can see that there is a full document someplace, albeit probably not in machine-readable form.

  Ed Summers [06.24.08 10:27 PM]

Wow, and it looks like the daily updates are available via RSS and Atom! http://rss.resource.org/ nice work public.resource.org!
