- Optimizing MongoDB — shorter field names, barely hundreds of ops/s when not in RAM, updates hold a lock while they fetch the original from disk … it’s a pretty grim story. (via Artur Bergman)
- Is There a New Geek Anti-Intellectualism? — focus is absolutely necessary if we are to gain knowledge. We will be ignoramuses indeed, if we merely flow along with the digital current and do not take the time to read extended, difficult texts. (via Sacha Judd)
- Trend Data for Teens (Pew Internet and American Life Project) — one in six American teens have used the Internet to look for information online about a health topic that’s hard to talk about, like drug use, sexual health, or depression.
- The Guts of Android (Linux Weekly News) — technical but high-level explanation of the components of an Android system and how they compare to those of a typical Linux system.
OSCON's co-chairs dig into the OSCON Data program.
OSCON's co-chairs discuss sessions in the OSCON Data conference and the people who might be interested in the associated topics.
The utility of CouchApps and how CouchDB could shape mobile.
OSCON speaker Bradley Holt talks about what CouchDB offers web developers, how the database works with HTML5, and why CouchApps could catch on.
MongoDB Subpessimalization, Anti-Intellectualism, Teen Internet Use, Android Internals
MongoDB for Guardian, Visualization Book, Mobile CouchDB, and Fast Approximate String Retrieval
- Why We Chose MongoDB for Guardian.co.uk (SlideShare) — they’re using MongoDB’s flexible schema, as schema upgrades were a pain in their previous system (Oracle). I think of flexible schemas as the database equivalent of dynamic typing in languages like Perl and Ruby. (via Paul Rowe)
- Solving Problems with Visual Analytics — This book is the result of a community effort of the partners of the VisMaster Coordinated Action funded by the European Union. The overarching aim of VisMaster was to create a research roadmap that outlines the current state of visual analytics across many disciplines, and to describe the next steps that have to be taken to foster a strong visual analytics community, thus enabling the development of advanced visual analytic applications. (via Mark Madsen)
- iOS-Couchbase (GitHub) — a build of the distributed document database CouchDB, which will keep your mobile data in sync with a remote store. No mean feat given CouchDB itself has Erlang as a dependency. (via Mike Olson)
- SimString — A fast and simple algorithm for approximate string retrieval in C++ with Python and Ruby bindings, open sourced under the modified BSD license. (via Matt Biddulph)
Key themes from MySQL 2011. Plus, what you sacrifice when you use a NoSQL solution.
Two dominant themes emerged at MySQL 2011: Mix your relational database with less formal solutions and move to the cloud. This may actually be the best environment MySQL has ever enjoyed.
CouchDB proves a good fit for a project with technical limits.
A new project in Zambia is trying to integrate supervisors, clinics, and community healthcare workers into a system that can improve patient service and provide more data. In this interview, Cory Zue explains how CouchDB is playing a role.
Digital Subscriptions, Graph Database, Data Science, and High Speed Compression
- Digital Subscription Prices — the NY Times in context. Aie.
- Trinity — Microsoft Research graph database. (via Hacker News)
- Data Science Toolkit — prepackaged EC2 image of most useful data tools. (via Pete Warden)
- Snappy — Google’s open sourced compression library, as used in BigTable and MapReduce. Emphasis is on speed, with resulting lack of quality in filesize (20-100% bigger than zlib).
Web Memory, Phones Read Cards, Military and Public Data, and NoSQL Merger
- Erase and Rewind — the BBC are planning to close (delete) 172 websites as some kind of cost-cutting measure. I’m very saddened to see the BBC join the ranks of online services that don’t give a damn for posterity. As Simon Willison points out, the British Library will have archived some of the sites (and the Internet Archive others, possibly).
- Announcing Farebot for Android — dumps the information stored on transit cards using Android’s NFC (near field communication, aka RFID) support. When demonstrating FareBot, many people are surprised to learn that much of the data on their ORCA card is not encrypted or protected. This fact is published by ORCA, but is not commonly known and may be of concern to some people who would rather not broadcast where they’ve been to anyone who can brush against the outside of their wallet. Transit agencies across the board should do a better job explaining to riders how the cards work and what the privacy implications are.
- Using Public Data to Fight a War (ReadWriteWeb) — uncomfortable use of the data you put in public?
- CouchOne and Membase Merge — consolidation in the commercial NoSQL arena. The merger not only results in the joining of two companies, but also combines CouchDB, memcached and Membase technologies. Together, the new company, Couchbase, will offer an end-to-end database solution that can be stored on a single server or spread across hundreds of servers.
Microsoft and the Web, URL Library, Optimism, and NoSQL Instruction
- Dive Into 2010 (Mark Pilgrim) — Mark wrote a hugely popular guide to HTML5 which was available online and published by O’Reilly. 6% of visitors used some version of Internet Explorer. That is not a typo. The site works fine in Internet Explorer — the site practices what it preaches, and the live examples use a variety of fallbacks for legacy browsers — so this is entirely due to the subject matter. Microsoft has completely lost the web development community.
- google-url — the Google URL-parsing library, designed to be embeddable.
- Reasons to be Cheerful (Charlie Stross) — if all we ever do is gripe about ways in which the world is not perfect, we will make ourselves miserable and fail to appreciate ways in which things are getting better. Important.
- NoSQL Tapes — videos of lectures on NoSQL topics. (via Hacker News)