- Computer Software Archive (Jason Scott) — The Internet Archive is the largest collection of historical software online in the world. Find me someone bigger. Through these terabytes (!) of software, the whole of the software landscape of the last 50 years is settling in. (And documentation and magazines and …). Wow.
- 7 in 10 Doctors Have a Self-Tracking Patient — the most common ways of sharing data with a doctor, according to the physicians, were writing it out by hand or giving the doctor a paper printout. (via Richard MacManus)
- opsmezzo — open-sourced provisioning tools from the Nodejitsu team. (via Nuno Job)
- Hacking Secret Ciphers with Python — teaches complete beginners how to program in the Python programming language. The book features the source code to several ciphers and hacking programs for these ciphers. The programs include the Caesar cipher, transposition cipher, simple substitution cipher, multiplicative & affine ciphers, Vigenere cipher, and hacking programs for each of these ciphers. The final chapters cover the modern RSA cipher and public key cryptography.
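For a taste of the kind of program the cipher book above walks through, here is a minimal sketch of a Caesar cipher and a brute-force "hack" for it. It is not taken from the book: the function names `caesar` and `brute_force` are made up for illustration, and it assumes only plain ASCII letters get shifted.

```python
import string


def caesar(text, key):
    """Shift each ASCII letter by `key` places; leave everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        else:
            out.append(ch)  # punctuation, digits, and spaces pass through
    return "".join(out)


def brute_force(ciphertext):
    """Try all 26 shifts and return {key: candidate plaintext}."""
    # Hypothetical helper: a human (or a dictionary check) still has to
    # pick the candidate that reads as real text.
    return {key: caesar(ciphertext, -key) for key in range(26)}


if __name__ == "__main__":
    secret = caesar("Hello, World!", 3)
    print(secret)                  # Khoor, Zruog!
    print(brute_force(secret)[3])  # Hello, World!
```

With only 26 possible keys, trying every shift is trivial, which is why the Caesar cipher makes a good first target for a beginner.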
ENTRIES TAGGED "Internet Archive"
Software Archive, Self-Tracking, Provisioning, and Python Ciphers
- /r/Scholar — Reddit board for tracking down research articles of interest.
- The Rapture of the Nerds (Charlie Stross, Cory Doctorow) — this is the HTML version of the book, which is also available for purchase, and is released under a CC BY-NC-ND license.
- Conversations Network Closes Down — The remaining assets of the Conversations Network (cash and intellectual property) will be acquired by the Internet Archive, another U.S. 501(c)(3) non-profit organization. All existing programs will be moved to the Internet Archive where the world will be able to continue to listen to them for free. (via Jon Udell)
Sorting Out 9/11, Tagging Text, Unlocking Scientific Publishing, and Internet Archive's Meatspace Branch
- Sorting Out 9/11 (New Yorker) — the thorniest problem for the 9/11 memorial was the ordering of the names. Computer science to the rescue!
- Tagger — Python library for extracting tags (statistically significant words or phrases) from a piece of text.
- Free Science, One Paper at a Time (Wired) — Jonathan Eisen’s attempt to collect and distribute his father’s scientific papers (which were written while a federal employee, so in the public domain), thwarted by old-fashioned scientific publishing. “But now,” says Jonathan Eisen, “there’s this thing called the Internet. It changes not just how things can be done but how they should be done.”
- Internet Archive Launches Physical Archive — I’m keen to see how this develops, because physical storage has problems that digital does not. I’d love to see the donor agreement require the donor to give the archive full rights to digitize and distribute under open licenses. That’d put the Internet Archive a step in front of traditional archives, museums, libraries, and galleries, whose donor agreements typically let donors place arbitrary specifications on use and reuse (“must be inaccessible for 50 years”, “no commercial use”, “no use that compromises the work”, etc.), all of which are barriers to wholesale digitization and reuse.
Web Memory, Phones Read Cards, Military and Public Data, and NoSQL Merger
- Erase and Rewind — the BBC are planning to close (delete) 172 websites as some kind of cost-cutting measure. I’m very saddened to see the BBC join the ranks of online services that don’t give a damn for posterity. As Simon Willison points out, the British Library will have archived some of the sites (and the Internet Archive others, possibly).
- Announcing FareBot for Android — dumps the information stored on transit cards using Android’s NFC (near field communication, aka RFID) support. When demonstrating FareBot, many people are surprised to learn that much of the data on their ORCA card is not encrypted or protected. This fact is published by ORCA, but is not commonly known and may be of concern to some people who would rather not broadcast where they’ve been to anyone who can brush against the outside of their wallet. Transit agencies across the board should do a better job explaining to riders how the cards work and what the privacy implications are.
- Using Public Data to Fight a War (ReadWriteWeb) — uncomfortable use of the data you put in public?
- CouchOne and Membase Merge — consolidation in the commercial NoSQL arena. The merger not only results in the joining of two companies, but also combines CouchDB, memcached and Membase technologies. Together, the new company, Couchbase, will offer an end-to-end database solution that can be stored on a single server or spread across hundreds of servers.
Image Remapping, Internet Futures, Ebook Reader, and Open Cloud Computing
- Historical Images Remapped — Sydney’s Powerhouse Museum released historical images from their collections, and the historical photo site Sepiatown geolocated and oriented them so they can be viewed side-by-side with current Google Street View images of the same place, then contributed the refined metadata back to the museum. A great example of your users helping to improve your data.
- Future Internet Scenarios — results of scenario planning by the Internet Society, with possible futures ranging from open and competitive to anticompetitive, centralised walled gardens.
- OpenLibrary Bookreader — the Internet Archive’s book reader is (naturally) open source for you to reuse and improve. (via Kevin Marks on Twitter)
- OpenStack Austin Release — code for the compute controller and object storage released. Competition and interoperability require exactly this kind of open cloud environment.
The Internet Archive has successfully pushed back against a federal national security letter (NSL) request for Archive member records. Brewster Kahle, Internet Archive co-founder, director and digital librarian, discussed the NSL process and outcome with the San Francisco Chronicle: Kahle … was appalled when his volunteer lawyers told him in November that the FBI was demanding records of all communications…