Hidden Biases in Big Data — with every big data set, we need to ask which people are excluded. Which places are less visible? What happens if you live in the shadow of big data sets? (via Quinn Norton)
CoreObject — a version-controlled object database for Objective-C that supports powerful undo, semantic merging, and real-time collaborative editing.
Machine Learning Done Wrong — [M]ost practitioners pick the modeling algorithm they are most familiar with rather than pick the one which best suits the data. In this post, I would like to share some common mistakes (the don’t-s).
Bandits for Recommendations — A common problem for internet-based companies is: which piece of content should we display? Google has this problem (which ad to show), Facebook has this problem (which friend’s post to show), and RichRelevance has this problem (which product recommendation to show). Many of the promising solutions come from the study of the multi-armed bandit problem.
Droplets — the Droplet is almost spherical, can self-right after being poured out of a bucket, and has the hardware capabilities to organize into complex shapes with its neighbors due to accurate range and bearing. Droplets are available open-source and use cheap vibration motors and a 3D printed shell. (via Robohub)
Apple’s App Store Approval Guidelines — some of the plainest English I’ve seen, especially the Introduction. I can only aspire to that clarity. If your App looks like it was cobbled together in a few days, or you’re trying to get your first practice App into the store to impress your friends, please brace yourself for rejection. We have lots of serious developers who don’t want their quality Apps to be surrounded by amateur hour.
Cockroach — a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services.
Linux Foundation Providing for Core Infrastructure Projects — press release, but interested in how they’re tackling sustainability—they’re taking on identifying worthies (glad I’m not the one who says “you’re not worthy” to a project) and being the non-profit conduit for the dosh. Interesting: implies they think the reason companies weren’t supporting necessary open source projects was some combination of being unsure who to support (projects you use, surely?) and how to get them money (ask?). (Sustainability of open source projects is a pet interest of mine)
Ferry — helps you create big data clusters on your local machine. Define your big data stack using YAML and share your application with Dockerfiles. Ferry supports Hadoop, Cassandra, Spark, GlusterFS, and Open MPI.
What Google Told SEC — For example, a few years from now, we and other companies could be serving ads and other content on refrigerators, car dashboards, thermostats, glasses, and watches, to name just a few possibilities. The only thing they make that people want to buy is the ad space around what you’re actually trying to do.
The Indie Bubble is Popping (Jeff Vogel) — gamers’ budgets and the number of hours in the day to play games are not increasing at the rate at which the number of games on the market is increasing.
Everything is Broken (Quinn Norton) — Computers have gotten incredibly complex, while people have remained the same gray mud with pretensions of Godhood. Today’s required read, because everything is broken and it’s the defining characteristic of this age of software. We have built computers in our image: our cancerous STD-addled diabetic alcoholic lead-sniffing telomere-decaying bacteria- and virus-addled image.
Mozilla’s Winter of Security — Students who have to perform a semester project as part of their university curriculum can apply to one of the MWOS project. Projects are guided by a Mozilla Adviser, and a University Professor. Students are graded by their University, based on success criteria identified at the beginning of the project. Mozilla Advisers allocate up to 2 hours each week to their students, typically on video-conference, to discuss progress and roadblocks.
Seven Deadly Myths of Autonomy (PDF) — it’s easy to fall prey to the fallacy that automated assistance is a simple substitute or multiplier of human capability because, from the point of view of an outsider observing the assisted human, it seems that—in successful cases, at least—the people are able to perform the task faster or better than they could without help. In reality, however, help of whatever kind doesn’t simply enhance our abilities to perform the task: it changes the nature of the task.
Quill — open source in-browser rich text editor. People, while you keep making me type into naked TEXTBOX fields, I’m going to keep posting links to these things.
Hardening Android for Security and Privacy — a brilliant project! prototype of a secure, full-featured, Android telecommunications device with full Tor support, individual application firewalling, true cell network baseband isolation, and optional ZRTP encrypted voice and video support. ZRTP does run over UDP which is not yet possible to send over Tor, but we are able to send SIP account login and call setup over Tor independently.
The Great Smartphone War (Vanity Fair) — “I represented [the Swedish telecommunications company] Ericsson, and they couldn’t lie if their lives depended on it, and I represented Samsung and they couldn’t tell the truth if their lives depended on it.” That’s the most catching quote, but interesting to see Samsung’s patent strategy described as copying others, delaying the lawsuits, settling before judgement, and in the meanwhile ramping up their own innovation. Perhaps the other glory part is the description of Samsung employee shredding and eating incriminating documents while stalling lawyers out front. An excellent read.
socketcluster — highly scalable realtime WebSockets based on Engine.io. They have screenshots of 100k messages/second on an 8-core EC2 m3.2xlarge instance.
Machine Learning on a Board — everything good becomes hardware, whether in GPUs or specialist CPUs. This one has a “Machine Learning Co-Processor”. Interesting idea, to package up inputs and outputs with specialist CPU, but I wonder whether it’s a solution in search of a problem. (via Pete Warden)