ENTRIES TAGGED "nosql"
Data Pointed, CouchDB in the Cloud, Launching Strata
Data Week is a new series that brings together notable stories and developments from the data world. Links in this edition include: the connection between visualizations and art, advice on becoming a data scientist, BigCouch goes open source, and more.
Faces in R, Open Source Web Analytics, Small File Store, Building Mapper
- R Library for Chernoff Faces — faces represent the rows of a data matrix by faces. plot.faces plots faces into a scatterplot. Interesting emotional way to visualize data, which was used to good effect (though not with this library) by BERG in Schooloscope. (via the tutorial at Flowing Data)
- Piwik — GPLed web analytics package.
- Pomegranate — a data store for billions of tiny files. (via the High Scalability blog interview with the creator of Pomegranate)
- New Backpack Makes 3D Maps of Buildings — the backpack indoor equivalent of the Google Maps cars, from Berkeley researchers.
Scientific Literacy, Load Balancing, Indoors Geolocation, and iPhone Security
- The Myth of Scientific Literacy — I’d love it if there was a simple course we could send our elected officials on which would guarantee future science policy would be reliably high quality. Being educated in science (or even “about science”) isn’t going to do it. It’s social connections that will. We need to keep our elected officials honest, constantly check they are applying the evidence we want them to, in the ways we want them to. And if the scientific community want to be listened to, they need to work to build connections. Get political and scientific communities overlapping, embed scientists in policy institutions (and vice versa), get MP’s constituents onside to help foster the sorts of public pressure you want to see: build trust so scientists become people MPs want to be briefed by. (via foe on Twitter)
- Three Papers on Load Balancing (Alex Popescu) — three papers on distributed hash tables.
- Meridian — iPhone app that does in-building location. The sample app is the AMNH Explorer, which shows you maps of where you are. Uses wifi-based positioning. (via raffi on Twitter)
- Fixing What Apple Won’t — the jailbreakers are releasing security patches for systems that Apple have abandoned. (via ardgedee on Twitter)
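For readers new to the distributed hash tables behind the load balancing papers above, here is a minimal sketch of consistent hashing, one common building block for spreading keys across nodes. It is not taken from those papers: the `HashRing` class, the `replicas` parameter, and the node names are all hypothetical, chosen only for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual positions per physical node
        self._positions = []       # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place `replicas` virtual positions for the node on the ring."""
        for i in range(self.replicas):
            pos = _hash(f"{node}#{i}")
            if pos in self._owner:
                continue  # already placed (same node added twice)
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node: str) -> None:
        """Take every virtual position owned by the node off the ring."""
        for i in range(self.replicas):
            pos = _hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._positions.remove(pos)

    def lookup(self, key: str) -> str:
        """Return the node owning the first position clockwise of the key."""
        if not self._positions:
            raise KeyError("ring is empty")
        idx = bisect.bisect(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.lookup("user:42"))
```

The point of the ring is that adding a node only pulls keys onto the new node. Keys never shuffle between the existing nodes, which is what makes this layout attractive for load balancing.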
Delicious Graphs, Charities and Data, Climate Psychology, Data Structure Portability
- Delicious Links Clustered and Stacked (Matt Biddulph) — six years of his delicious links, k-means clustered by tag and graphed. The clusters are interesting, but I wonder whether Matt can identify significant life/work events by the spikes in the graph.
- Open Data and the Voluntary Sector (OKFN) — Open data will give charities new ways to find and share information on the need of their beneficiaries – who needs their services most and where they are located. The sharing of information will be key to this – it’s not just about using data that the government has opened up, but also opening your own data.
- Cognitive and Behavioral Challenges in Responding to Climate Change — At the deepest level, large scale environmental problems such as global warming threaten people’s sense of the continuity of life – what sociologist Anthony Giddens calls ontological security. Ignoring the obvious can, however, be a lot of work. Both the reasons for and process of denial are socially organized; that is to say, both cognition and denial are socially structured. Denial is socially organized because societies develop and reinforce a whole repertoire of techniques or “tools” for ignoring disturbing problems. Fascinating paper. (via Jez)
- Blueprints — provides a collection of interfaces and implementations to common, complex data structures. Blueprints contains a property graph model and its implementations for TinkerGraph, Neo4j, and SAIL. Also, it contains an object document model and implementations for TinkerDoc, CouchDB, and MongoDB. In short, Blueprints provides a one stop shop for implemented interfaces to help developers create software without being tied to particular underlying data management systems.
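The k-means clustering by tag mentioned in the Delicious item above can be sketched roughly like this. The source does not describe Matt's actual method, so everything here is an assumption: the bookmarks are invented, each one is turned into a 0/1 vector over the tag vocabulary, and the `kmeans` helper uses a deterministic farthest-first start rather than the usual random one so the result is repeatable.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic farthest-first initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


# Hypothetical bookmarks, each represented only by its set of tags.
bookmarks = [
    {"python", "redis"},
    {"python", "nosql"},
    {"redis", "nosql"},
    {"maps", "gps"},
    {"maps", "wifi"},
    {"gps", "wifi"},
]
vocab = sorted(set().union(*bookmarks))
points = [tuple(1.0 if tag in b else 0.0 for tag in vocab) for b in bookmarks]

labels, _ = kmeans(points, k=2)
print(labels)
```

Counting how many bookmarks fall into each cluster per month would then give the kind of stacked graph the post describes.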
More NoSQL, Data Medicine, Startups to Government, and Cake-and-eat-it Open Source
- Membase — an open-source (Apache 2.0 license) distributed, key-value database management system optimized for storing data behind interactive web applications. These applications must service many concurrent users; creating, storing, retrieving, aggregating, manipulating and presenting data in real-time. Supporting these requirements, membase processes data operations with quasi-deterministic low latency and high sustained throughput. (via Hacker News)
- Sergey’s Search (Wired) — Sergey Brin, one of the Google founders, learned he had a gene allele that gave him much higher odds of getting Parkinson’s. His response has been to help medical research, both with money and through 23andme. Langston decided to see whether the 23andMe Research Initiative might be able to shed some insight on the correlation, so he rang up 23andMe’s Eriksson, and asked him to run a search. In a few minutes, Eriksson was able to identify 350 people who had the mutation responsible for Gaucher’s. A few clicks more and he was able to calculate that they were five times more likely to have Parkinson’s disease, a result practically identical to the NEJM study. All told, it took about 20 minutes. “It would’ve taken years to learn that in traditional epidemiology,” Langston says. “Even though we’re in the Wright brothers early days with this stuff, to get a result so strongly and so quickly is remarkable.”
- Startup.gov (YouTube) — Anil Dash talk at Personal Democracy Forum on applying insights from startups to government. I hope the more people say this, the greater the odds it’ll be acted on.
- Open Core Software — Marten Mickos (ex-MySQL) talks up “open core” (open source base, proprietary extensions) as a way to resolve the conflict of “change the world with open source” and “make money”. Brian Aker disagrees: There has been no successful launch of an open core company that has reached any significant size, especially of the size that Marten hints at in the article. My take: there are three reasons for open source (freedoms, price, and development scale) and if you close the source to part of your product then the whole product loses those benefits. If you open source enough that the open source bit has massive momentum, then you probably don’t have enough left proprietary to gain huge financial benefit.
Fair Use Economy, Deconstituted Appliances, 3D Vision, Redis for Fun and Profit
- Fair Use in the US Economy (PDF) — prepared by the IT lobby in the US, it’s the counterpart to Big ©’s fictitious billions of dollars of losses due to file sharing. Take each with a grain of salt, but this is interesting because it talks about the industries and businesses that the fair use laws make possible.
- Disassembled Household Appliances — neat photos of the pieces in common equipment like waffle irons, sandwich makers, can openers, etc. (via evilmadscientist)
- GelSight — gel block on a sheet of glass, lit from below with lights and then scanned with cameras, lets you easily capture 3D qualities of the objects pressed into it. Very cool demo: you can see fingerprints, pulse, and even make out designs on a $100 bill.
- Redis Tutorial (Simon Willison) — Redis is a very fast collection of useful behaviours wrapped around a distributed key-value store. You get locks, IDs, counters, sets, lists, queues, replication, and more.
The growing popularity of Big Data management tools (Hadoop; MPP, real-time SQL, NoSQL databases; and others) means many more companies can handle large amounts of data. But how do companies analyze and mine their vast amounts of data? For companies that already have large amounts of data in Hadoop, there's room for even simpler tools that would allow business users to directly interact with Big Data.
A deep look at Oracle's motivations and MySQL's future
The SimpleGeo CTO and former Digg architect discusses NoSQL and location's future
I recently had a long conversation with Joe Stump, CTO of SimpleGeo, about location, geodata, and the NoSQL movement. Stump, who was formerly lead architect at Digg, had a lot to say. Here are the highlights; you can find the full interview elsewhere on Radar.
MySQL, MySociety, NoSQL DB, and NoSQL Conference Notes
- Common MySQL Queries — a useful reference.
- MySociety’s Next 12 Months — two new projects, FixMyTransport and “Project Fosbury”. The latter is a more general tool to help people organise their own campaigns for change.
- riak — scalable key-value store with JSON interface. (via joshua on Delicious)
- Notes from NoSQL Live Boston — full of juicy nuggets of info from the NoSQL conference.