- Why We Chose MongoDB for Guardian.co.uk (SlideShare) — they’re using MongoDB’s flexible schema, as schema upgrades were a pain in their previous system (Oracle). I think of flexible schemas as the database equivalent of dynamic typing in languages like Perl and Ruby. (via Paul Rowe)
- Solving Problems with Visual Analytics — This book is the result of a community effort of the partners of the VisMaster Coordinated Action funded by the European Union. The overarching aim of VisMaster was to create a research roadmap that outlines the current state of visual analytics across many disciplines, and to describe the next steps that have to be taken to foster a strong visual analytics community, thus enabling the development of advanced visual analytic applications. (via Mark Madsen)
- iOS-Couchbase (GitHub) — a build of distributed key-value store CouchDB, which will keep your mobile data in sync with a remote store. No mean feat given CouchDB itself has Erlang as a dependency. (via Mike Olson)
- SimString — A fast and simple algorithm for approximate string retrieval, implemented in C++ with Python and Ruby bindings and open sourced under a modified BSD license. (via Matt Biddulph)
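For a feel of what approximate string retrieval means, here is a minimal brute-force sketch in Python. It is not SimString's algorithm or its API: the function names `ngrams`, `cosine`, and `retrieve`, the trigram features, and the default threshold are all assumptions made for illustration. It simply scores every candidate by cosine similarity over character trigrams and keeps those above a threshold.

```python
from collections import Counter
from math import sqrt


def ngrams(text, n=3):
    """Count character n-grams of the lowercased text, padded with spaces."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, candidates, threshold=0.6):
    """Return candidates whose similarity to the query meets the threshold,
    best match first. Brute force: every candidate is scored."""
    q = ngrams(query)
    scored = [(cosine(q, ngrams(c)), c) for c in candidates]
    return [c for score, c in sorted(scored, reverse=True) if score >= threshold]


print(retrieve("couchdb", ["CouchDB", "Couchbase", "MongoDB"], threshold=0.4))
```

Scoring every candidate is fine for a handful of strings; the point of a dedicated library is to avoid that full scan on large collections.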
ENTRIES TAGGED "nosql"
MongoDB for Guardian, Visualization Book, Mobile CouchDB, and Fast Approximate String Retrieval
Key themes from MySQL 2011. Plus, what you sacrifice when you use a NoSQL solution.
Two dominant themes emerged at MySQL 2011: Mix your relational database with less formal solutions and move to the cloud. This may actually be the best environment MySQL has ever enjoyed.
CouchDB proves a good fit for a project with technical limits.
A new project in Zambia is trying to integrate supervisors, clinics, and community healthcare workers into a system that can improve patient service and provide more data. In this interview, Cory Zue explains how CouchDB is playing a role.
Digital Subscriptions, Graph Database, Data Science, and High Speed Compression
- Digital Subscription Prices — the NY Times in context. Aie.
- Trinity — Microsoft Research graph database. (via Hacker News)
- Data Science Toolkit — prepackaged EC2 image of most useful data tools. (via Pete Warden)
- Snappy — Google’s open sourced compression library, as used in BigTable and MapReduce. Emphasis is on speed, with resulting lack of quality in filesize (20-100% bigger than zlib).
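The speed-versus-size tradeoff in the Snappy item can be illustrated without Snappy itself. The sketch below is a stand-in, not a Snappy benchmark: it uses Python's standard `zlib` module at its fastest and strongest levels, and the `measure` helper and the sample text are assumptions made for illustration. Timings will vary by machine.

```python
import time
import zlib


def measure(data, level):
    """Compress data at the given zlib level; return (compressed size, seconds)."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Confirm the round trip is lossless before reporting numbers.
    assert zlib.decompress(packed) == data
    return len(packed), elapsed


# Repetitive sample text, chosen only so there is something to compress.
sample = b"Two dominant themes emerged at MySQL 2011. " * 20000

for level in (1, 9):  # 1 favors speed, 9 favors smaller output
    size, seconds = measure(sample, level)
    print(f"level {level}: {size} bytes in {seconds:.4f}s")
```

Snappy sits further toward the speed end of this spectrum than any zlib level, which is the tradeoff the description above is pointing at.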
Web Memory, Phones Read Cards, Military and Public Data, and NoSQL Merger
- Erase and Rewind — the BBC are planning to close (delete) 172 websites as some kind of cost-cutting measure. I’m very saddened to see the BBC join the ranks of online services that don’t give a damn for posterity. As Simon Willison points out, the British Library will have archived some of the sites (and Internet Archive others, possibly).
- Announcing FareBot for Android — dumps the information stored on transit cards using Android’s NFC (near field communication, aka RFID) support. When demonstrating FareBot, many people are surprised to learn that much of the data on their ORCA card is not encrypted or protected. This fact is published by ORCA, but is not commonly known and may be of concern to some people who would rather not broadcast where they’ve been to anyone who can brush against the outside of their wallet. Transit agencies across the board should do a better job explaining to riders how the cards work and what the privacy implications are.
- Using Public Data to Fight a War (ReadWriteWeb) — uncomfortable use of the data you put in public?
- CouchOne and Membase Merge — consolidation in the commercial NoSQL arena. The merger not only results in the joining of two companies, but also combines CouchDB, memcached and Membase technologies. Together, the new company, Couchbase, will offer an end-to-end database solution that can be stored on a single server or spread across hundreds of servers.
Microsoft and the Web, URL Library, Optimism, and NoSQL Instruction
- Dive Into 2010 (Mark Pilgrim) — Mark wrote a hugely popular guide to HTML5 which was available online and published by O’Reilly. 6% of visitors used some version of Internet Explorer. That is not a typo. The site works fine in Internet Explorer — the site practices what it preaches, and the live examples use a variety of fallbacks for legacy browsers — so this is entirely due to the subject matter. Microsoft has completely lost the web development community.
- google-url — the Google URL-parsing library, designed to be embeddable.
- Reasons to be Cheerful (Charlie Stross) — if all we ever do is gripe about ways in which the world is not perfect, we will make ourselves miserable and fail to appreciate ways in which things are getting better. Important.
- NoSQL Tapes — videos of lectures on NoSQL topics. (via Hacker News)
MySQL as NoSQL, Handmade SLR, Mac App Store, and Datamining Privacy Workshop
- Using MySQL as NoSQL — 750,000+ qps on a commodity MySQL/InnoDB 5.1 server from remote web clients.
- Making an SLR Camera from Scratch — amazing piece of hardware devotion. (via hackaday.com)
- Mac App Store Guidelines — Apple announce an app store for the Macintosh, similar to its app store for iPhones and iPads. “Mac App” no longer means generic “program”, it has a new and specific meaning, a program that must be installed through the App store and which has limited functionality (only one can run at a time, it’s full-screen, etc.). The list of guidelines for what kinds of programs you can’t sell through the App Store is interesting. Many have good reasons to be there, but “It creates a store inside itself for selling or distributing other software (i.e., an audio plug-in store in an audio app)” is pure greed. Some are afeared that the next step is to make the App store the only way to install apps on a Mac, a move that would drive me away. It would be a sad day for Mac-lovers if Microsoft were to be the more open solution than Apple. cf the Owner’s Manifesto.
- Privacy Aspects of Data Mining — CFP for an IEEE workshop in December. (via jschneider on Twitter)
Bad Game Mechanics, Under NoSQL Covers, the LAN of Things, and the Smithsonian Commons
- Pwned: Gamification and its Discontents (SlideShare) — hear, hear! Video games are not fun because they’re video games, but if and only if they are well-designed. Just adding something from games isn’t a guarantee for fun. (via jameshome on Twitter)
- Redis Under the Hood — explanation of the insides and mechanisms of this popular distributed key-value store. (via tlockney on delicious)
- The LAN of Things (Mike Kuniavsky) — Before we can have an Internet of Things, we will need to have a LAN of things.[...] Most of the utility of a LAN came from its local functionality. Thus, before we can build a useful (from a user perspective) Internet of Things, we need to learn to build useful LANs of Things. [...] I think it’s important to start thinking about what the highly localized uses of sparsely distributed technology can be. What can we do when there are only a couple of things with RFIDs in our house? What totally great service can be built on having two light switches that report their telemetry in the house? What totally valuable information can you tell me if I only wear my motion sensor every once in a while? Love it. (via Matt Jones on Delicious)
- Mike Edson’s Talk at Powerhouse Museum — the Director of Web and New Media Technology at the Smithsonian is smart, articulate, and trying to do something cool with the Smithsonian Commons prototype. (via sebchan on Twitter)
Storage, MapReduce and Query are ushering in data-driven products and services.
We're at the beginning of a revolution in data-driven products and services, driven by a software stack that enables big data processing on commodity hardware. Learn about the SMAQ stack, and where today's big data tools fit in.