- High Volume Web Sites — Tim Berners-Lee answers my question on provisioning a popular web server in 1993. “The info.cern.ch server which has the Subject Catalogue gets probably a relatively high usage, about 10k requests a day, or (thinks…) one every 9 seconds. The CPU load is negligible. In fact of course the peak rate is higher, but still it’s not really a factor.” That was when the server forked a subprocess for each request, too. See also one of my early contributions to the nascent field of web operations (language alert).
- Tim Berners-Lee Calls For Web Magna Carta (Guardian) — “Unless we have an open, neutral internet we can rely on without worrying about what’s happening at the back door, we can’t have open government, good democracy, good healthcare, connected communities and diversity of culture. It’s not naive to think we can have that, but it is naive to think we can just sit back and get it.”
- BroApp — Automatically message your girlfriend sweet things so you can spend more time with the Bros. Reminds me of the Electric Monk in Dirk Gently’s Holistic Detective Agency. The monk notices that humans have machines to watch TV for them. Now we have machines to be shitty boyfriends for us. (via Beta Knowledge)
- World Science U — quick answers, short courses, long MOOCs. I wonder how you’d know whether this was effective at increasing scientific literacy, and therefore whether it’d be worth doing for computational thought or programming.
"Tim Berners-Lee" entries
HTML DRM, Visualizing Medical Sciences, Lifelong Learning, and Hardware Hackery
- What Tim Berners-Lee Doesn’t Know About HTML DRM (Guardian) — Cory Doctorow lays it out straight. HTML DRM is a bad idea, no two ways. “The future of the Web is the future of the world, because everything we do today involves the net and everything we’ll do tomorrow will require it. Now [the W3C] proposes to sell out that trust, on the grounds that Big Content will lock up its ‘content’ in Flash if it doesn’t get a veto over Web-innovation. [...] The W3C has a duty to send the DRM-peddlers packing, just as the US courts did in the case of digital TV.”
- Visualizing the Topical Structure of the Medical Sciences: A Self-Organizing Map Approach (PLOS ONE) — a high-resolution visualization of the medical knowledge domain using the self-organizing map (SOM) method, based on a corpus of over two million publications.
- What Teens Get About The Internet That Parents Don’t (The Atlantic) — the Internet has been a lifeline for self-directed learning and connection to peers. In our research, we found that parents more often than not have a negative view of the role of the Internet in learning, but young people almost always have a positive one. (via Clive Thompson)
- Portable C64 — beautiful piece of C64 hardware hacking to embed a screen and battery in it. (via Hackaday)
Data Jurisdiction, TimBL Frowns, Google Transparency, and Secure Tools
- FISA Amendment Hits Non-Citizens — “FISAAA essentially makes it lawful for the US to conduct purely political surveillance on foreigners’ data accessible in US Cloud providers. [...] [A] US judiciary subcommittee on FISAAA in 2008 stated that the Fourth Amendment has no relevance to non-US persons.” Americans, think about how you’d feel keeping your email, CRM, accounts, and presentations on Russian or Chinese servers given the trust you have in those regimes. That’s how the rest of the world feels about American-provided services. Which jurisdiction isn’t constantly into invasive snooping, yet still has great bandwidth?
- Tim Berners-Lee Opposes Government Snooping — “The whole thing seems to me fraught with massive dangers and I don’t think it’s a good idea,” he said in reply to a question about the Australian government’s data retention plan.
- Google’s Approach to Government Requests for Information (Google Blog) — they’ve raised the dialogue about civil liberties by being so open about the requests for information they receive. Telcos and banks still regard these requests as a dirty secret that can’t be talked about, whereas Google gets headlines in NPR and CBS for it.
- Open Internet Tools Project — supports and incubates a collection of free and open source projects that enable anonymous, secure, reliable, and unrestricted communication on the Internet. Its goal is to enable people to talk directly to each other without being censored, surveilled or restricted.
The ODI's official launch, MIT's Kinect Kinetics project, and legal ways authorities are tracking us.
Here are a few stories from the data space that caught my attention this week.
Open government data gets a startup incubator
The Open Data Institute (ODI), founded by Tim Berners-Lee and artificial intelligence pioneer Nigel Shadbolt, officially launched this week in the U.K. As Berners-Lee and Shadbolt noted in “There’s gold to be mined from all our data (PDF),” the U.K. government initially funded and commissioned the institute to “help the public sector to use its own data more effectively,” and by “[w]orking with private companies and universities, it will also develop the capability of U.K. businesses to exploit open data, fostering a generation of open data entrepreneurs.” The institute’s mission is outlined on its website:
“The Open Data Institute will catalyse the evolution of an open data culture to create economic, environmental, and social value. It will unlock supply, generate demand, create and disseminate knowledge to address local and global issues. We will convene world-class experts to collaborate, incubate, nurture and mentor new ideas, and promote innovation. We will enable anyone to learn and engage with open data, and empower our teams to help others through professional coaching and mentoring.”
Jamillah Knowles reports at The Next Web that the institute is already hosting its first startups, including agile big data specialists Mastodon C; corporate information aggregator OpenCorporates; location-based data startup Placr; and Locatable, a startup aiming to help people find their perfect place to live.
Coinciding with the launch, the institute received an investment boost. As Ingrid Lunden reports at TechCrunch, the U.K. government has committed £10 million over the next five years (about $16 million); this week, investment firm Omidyar Network, co-founded by eBay founder Pierre Omidyar and his wife Pam, invested an additional $750,000 in the ODI. Lunden notes that though the ODI is focused on the U.K., having an international investment company on board “gives the effort a potential profile beyond these borders.”
In related news, O’Reilly Radar’s Alex Howard talked with open government developer Eric Mill, who together with GovTrack.us founder Josh Tauberer and New York Times developer Derek Willis released data and scrapers for congressional legislation from THOMAS.gov into the public domain at github.com/unitedstates. Mill told Howard he hopes this work will serve as an example that prompts government bodies to publish the information themselves in the future:
“It would be fantastic if the relevant bodies published this data themselves and made these datasets and scrapers unnecessary. It would increase the information’s accuracy and timeliness, and probably its breadth. It would certainly save us a lot of work! Until that time, I hope that our approach to this data, based on the joint experience of developers who have each worked with it for years, can model to government what developers who aim to serve the public are actually looking for online.”
You can read Howard’s full interview with Mill, which covers building the scraper and the accompanying dataset, using GitHub as a platform, and how the data is being used.