Four short links: 17 May 2011

Sorting Out 9/11, Tagging Text, Unlocking Scientific Publishing, and Internet Archive's Meatspace Branch

  1. Sorting Out 9/11 (New Yorker) — the thorniest problem for the 9/11 memorial was the ordering of the names. Computer science to the rescue!
  2. Tagger — Python library for extracting tags (statistically significant words or phrases) from a piece of text.
  3. Free Science, One Paper at a Time (Wired) — Jonathan Eisen’s attempt to collect and distribute his father’s scientific papers (which were written while his father was a federal employee, and so are in the public domain) was thwarted by old-fashioned scientific publishing. “But now,” says Jonathan Eisen, “there’s this thing called the Internet. It changes not just how things can be done but how they should be done.”
  4. Internet Archive Launches Physical Archive — I’m keen to see how this develops, because physical storage has problems that digital does not. I’d love to see the donor agreement require the donor to give the archive full rights to digitize and distribute under open licenses. That’d put the Internet Archive a step ahead of traditional archives, museums, libraries, and galleries, whose donor agreements typically let donors place arbitrary restrictions on use and reuse (“must be inaccessible for 50 years”, “no commercial use”, “no use that compromises the work”, etc.), all of which are barriers to wholesale digitization and reuse.
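
For a rough feel of what "extracting tags" from text (item 2) can mean, here is a minimal sketch. It is not Tagger's API or its algorithm: the function name `extract_tags`, the tiny stopword list, and the plain frequency count are all assumptions made for illustration. A real tagger would likely weigh words against a background corpus and handle multi-word phrases.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real taggers use far larger ones).
STOPWORDS = {"the", "and", "them", "for", "was", "with", "that", "this", "from"}


def extract_tags(text, n=3):
    """Return up to n of the most frequent non-stopword words in text.

    Hypothetical helper: plain frequency stands in for a real
    statistical-significance score.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # Skip stopwords and very short words, then count what is left.
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]


sample = ("The archive stores books. The archive digitizes books "
          "and the archive shares them.")
print(extract_tags(sample, 2))
```

Raw counts favor whatever is repeated most in a single document, so this is only a starting point for thinking about what a tagging library has to do better.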
