Crowdsourcing Isn’t Broken — great rundown of ways to keep crowdsourcing on track. As with open sourcing something, just throwing open the doors and hoping for the best has a low probability of success.
etcd Hits 2.0 — first major stable release of an open source, distributed, consistent key-value store for shared configuration, service discovery, and scheduler coordination.
You Can’t Play 20 Questions With Nature and Win (PDF) — There is, I submit, a view of the scientific endeavor that is implicit (and sometimes explicit) in the picture I have presented above. Science advances by playing 20 questions with nature. The proper tactic is to frame a general question, hopefully binary, that can be attacked experimentally. Having settled that bits-worth, one can proceed to the next. The policy appears optimal – one never risks much, there is feedback from nature at every step, and progress is inevitable. Unfortunately, the questions never seem to be really answered, the strategy does not seem to work. An old paper, but still resonant today. (via Mind Hacks)
The Uncanny Valley of Speech Recognition (Zach Holman) — I’m reminded of driving up US-280 in 2003 or so with @raelity, a Kiwi and a South African trying every permutation of American accent from Kentucky to Yosemite Sam in order to get TellMe to stop giving us the weather for zipcode 10000. It didn’t recognise the swearing either. (Caution: features similarly strong language.)
TuPAQ: An Efficient Planner for Large-scale Predictive Analytic Queries (PDF) — an integrated PAQ [Predictive Analytic Queries] planning architecture that combines advanced model search techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching. The resulting system, TUPAQ, solves the PAQ planning problem with comparable accuracy to exhaustive strategies but an order of magnitude faster, and can scale to models trained on terabytes of data across hundreds of machines.
p2pvc — point-to-point video chat. In an 80×25 terminal window.
Internet of Things: Blackett Review — the British Government’s review of Internet of Things opportunities around government. Government and others can use expert commissioning to encourage participants in demonstrator programmes to develop standards that facilitate interoperable and secure systems. Government as a large purchaser of IoT systems is going to have a big impact if it buys wisely. (via Matt Webb)
rdbms-subsetter — open source tool to generate a random sample of rows from a relational database that preserves referential integrity – so long as constraints are defined, all parent rows will exist for child rows. (via 18F)
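To make the referential-integrity idea concrete, here is a minimal sketch of the general approach, not rdbms-subsetter's actual implementation: sample child rows at random, then copy across every parent row they point to. The `parent`/`child` schema and the `make_db` and `subset` helpers are hypothetical names invented for illustration, and it uses an in-memory SQLite database.

```python
import random
import sqlite3

# Hypothetical two-table schema: every child row references a parent row.
SCHEMA = """
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE child (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL REFERENCES parent(id),
    note TEXT
);
"""


def make_db():
    """Create an in-memory database with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def subset(src, dst, n_children, seed=0):
    """Copy a random sample of child rows plus every parent they reference."""
    rng = random.Random(seed)
    children = src.execute("SELECT id, parent_id, note FROM child").fetchall()
    sample = rng.sample(children, min(n_children, len(children)))

    # Collect the parents the sampled children depend on.
    parent_ids = sorted({row[1] for row in sample})
    placeholders = ",".join("?" for _ in parent_ids)
    parents = src.execute(
        f"SELECT id, name FROM parent WHERE id IN ({placeholders})", parent_ids
    ).fetchall()

    # Insert parents first so the foreign-key constraint is never violated.
    dst.executemany("INSERT INTO parent VALUES (?, ?)", parents)
    dst.executemany("INSERT INTO child VALUES (?, ?, ?)", sample)
    dst.commit()


# Build a toy source database: 10 parents, 40 children.
src = make_db()
src.executemany("INSERT INTO parent VALUES (?, ?)", [(i, f"p{i}") for i in range(10)])
src.executemany(
    "INSERT INTO child VALUES (?, ?, ?)", [(i, i % 10, f"c{i}") for i in range(40)]
)
src.commit()

dst = make_db()
subset(src, dst, n_children=5)

# Count child rows whose parent is missing from the subset.
orphans = dst.execute(
    "SELECT COUNT(*) FROM child WHERE parent_id NOT IN (SELECT id FROM parent)"
).fetchone()[0]
child_count = dst.execute("SELECT COUNT(*) FROM child").fetchone()[0]
print(child_count, "children copied,", orphans, "orphans")
```

A real tool has to walk arbitrary chains of foreign keys across many tables; this sketch handles only a single parent/child pair.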
UXcheck — a browser extension to help you do a quick UX check against Nielsen’s 10 principles.
MDBM — Yahoo’s fast key-value store, in use for over a decade. Super-fast, using mmap and passing around (gasp) raw pointers.
The Revolution in Biology is Here, Now (Mike Loukides) — I’ve been asked plenty of times (and I’ve asked plenty of times), “what’s the killer product for synthetic biology?” BioFabricate convinced me that that’s the wrong question. We may never have some kind of biological iPod. That isn’t the right way to think. What I saw, instead, was real products that you might never notice. Bricks made from sand that are held together by microbes designed to excrete the binder. Bricks and packing material made from fungus (mycelium). Plastic excreted by bacteria that consume waste methane from sewage plants. You wouldn’t know, or care, whether your plastic Lego blocks are made from petroleum or from bacteria, but there’s a huge ecological difference.
Bluesmart — Indiegogo campaign for a “connected carry-on,” aka a smart suitcase. From the mobile app you can track it, learn when it’s close (or too far away), (un)lock, weigh…and you can plug your devices in and recharge from the built-in battery. Sweet!
Dynomite (Netflix) — a sharding and replication layer. Dynomite can make existing non-distributed datastores, such as Redis or Memcached, into a fully distributed & multi-datacenter replicating datastore.
After Docker — smaller, easier to manage, more secure containers via unikernels and immutable infrastructure.
Pixelapse — something between Dropbox and GitHub for the design workflow and artifacts.
Obama: Treat Broadband and Mobile as Utility (Ars Technica) — In short, Obama is siding with consumer advocates who have lobbied for months in favor of reclassification while the telecommunications industry lobbied against it.
MozVR — a website, and the tools that made it, designed to be seen through the Oculus Rift.
All Cameras are Police Cameras (James Bridle) — how the slippery slope is ridden: When the Wall was initially constructed, the public were informed that this [automatic license plate recognition] data would only be held, and regularly purged, by Transport for London, who oversee traffic matters in the city. However, within less than five years, the Home Secretary gave the Metropolitan Police full access to this system, which allowed them to take a complete copy of the data produced by the system. This permission to access the data was granted to the Police on the sole condition that they only used it when National Security was under threat. But since the data was now in their possession, the Police reclassified it as “Crime” data and now use it for general policing matters, despite the wording of the original permission. As this data is not considered to be “personal data” within the definition of the law, the Police are under no obligation to destroy it, and may retain their ongoing record of all vehicle movements within the city for as long as they desire.
Angular JS Style Guide — I love style guides, to the point of having posted (I think) three for Angular. Reading other people’s style guides is like listening to them make up after arguments: you learn what’s important to them, and what they regret.
Consensus Filters — filtering out misreads and other errors to allow all agents, or robots, in the network to arrive at the same value asymptotically by only communicating with their neighbours.
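The "same value, neighbours only" part can be sketched with plain average consensus, the simplest building block of this family. This is an illustrative sketch, not the filter from the linked piece: it averages a misread into the result rather than rejecting it. The ring topology, the readings, the step size `epsilon`, and the function names are all assumptions made up for the example.

```python
def consensus_step(values, neighbours, epsilon):
    """One synchronous round: each agent nudges its value toward its neighbours'."""
    return [
        x + epsilon * sum(values[j] - x for j in neighbours[i])
        for i, x in enumerate(values)
    ]


def run_consensus(values, neighbours, epsilon=0.3, steps=200):
    """Repeat the local update; no agent ever sees the whole network."""
    for _ in range(steps):
        values = consensus_step(values, neighbours, epsilon)
    return values


# Hypothetical ring of four agents; each talks only to its two neighbours.
neighbours = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

# Hypothetical sensor readings; agent 3 holds a wild misread.
readings = [10.0, 12.0, 11.0, 50.0]

result = run_consensus(readings, neighbours)
print(result)  # every agent ends up near the network-wide mean of the readings
```

Because links are symmetric, the sum of all values is unchanged by each round, so the agents settle on the mean of the initial readings. For this update to converge, `epsilon` needs to stay below one over the largest neighbour count (0.5 on this ring).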
Why Banks are BASE not ACID — Consistency it turns out is not the Holy Grail. What trumps consistency is: Auditing, Risk Management, Availability.
ReviewNinja — a lightweight code review tool that works with GitHub, providing a more structured way to use pull requests for code review. ReviewNinja dispenses with elaborate voting systems, and supports hassle-free committing and merging for acceptable changes.
Liquibase — source control for your database. Apache 2.0 licensed.
A Few Useful Things to Know About Machine Learning (PDF) — This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions. My fave: First-timers are often surprised by how little time in a machine learning project is spent actually doing machine learning. But it makes sense if you consider how time-consuming it is to gather data, integrate it, clean it and pre-process it, and how much trial and error can go into feature design.