The Sad State of Sysadmin in the Age of Containers (Erich Schubert) — a Grumpy Old Man rant, but solid. And since nobody is still able to compile things from scratch, everybody just downloads precompiled binaries from random websites. Often without any authentication or signature.
Pinball — Pinterest open-sourced their data workflow manager.
Disambiguating Databases (ACM) — The scope of the term database is vast. Technically speaking, anything that stores data for later retrieval is a database. Even by that broad definition, there is functionality that is common to most databases. This article enumerates those features at a high level. The intent is to provide readers with a toolset with which they might evaluate databases on their relative merits.
Hello Barbie — I just can’t imagine a business not wanting to mine and repurpose the streams of audio data coming into their servers. “You listen to Katy Perry a lot. So do I! You have a birthday coming up. Have you told your parents about the Katy Perry brand official action figurines from Mattel? Kids love ’em, and demo data and representative testing indicates you will, too!” Or just offer a subscription service where parents can listen in on what their kids say when they play in the other room with their friends. Or identify product mentions and cross-market offline. Or …
Adopting Microservices at Netflix: Lessons for Architectural Design — you want to think of servers like cattle, not pets. If you have a machine in production that performs a specialized function, and you know it by name, and everyone gets sad when it goes down, it’s a pet. Instead you should think of your servers like a herd of cows. What you care about is how many gallons of milk you get. If one day you notice you’re getting less milk than usual, you find out which cows aren’t producing well and replace them. People for Ethical Treatment of Iron, your time has come!
Your Job is Not to Write Code (Laura Klein) — I know what you’re thinking. This will all take so long! I’ll be so much less effective! This isn’t true. You’ll be far more effective because you will actually be doing your job. Amen to it all.
stenographer (Google) — open source packet dumper for capturing data during intrusions.
Which GPU for Deep Learning? — a lot of numbers. Overall, I think memory size is overrated. You can nicely gain some speedups if you have very large memory, but these speedups are rather small. I would say that GPU clusters are nice to have, but that they cause more overhead than the accelerate progress; a single 12GB GPU will last you for 3-6 years; a 6GB GPU is plenty for now; a 4GB GPU is good but might be limiting on some problems; and a 3GB GPU will be fine for most research that looks into new architectures.
23andMe Wins FDA Approval for First Genetic Test — as they re-enter the market after FDA power play around approval (yes, I know: one company’s power play is another company’s flouting of safeguards designed to protect a vulnerable public).
Calaos — a free software project (GPLv3) that lets you control and monitor your home.
Founder Wants to be a Horse Not a Unicorn (Business Insider) — this way of thinking — all or nothing moonshots to maximise shareholder value — has become pervasive dogma in tech. It’s become the only respectable path. Either you’re running a lowly lifestyle business, making ends meet so you can surf all afternoon, or you’re working 17-hour days goring competitors with your $US48MM Series C unicorn horn on your way to billionaire mountain.
Subjectivity-Exploitability Tradeoff — Voting-based DAOs, lacking an equivalent of shareholder regulation, are vulnerable to attacks where 51% of participants collude to take all of the DAO’s assets for themselves […] The example supplied here will define a new, third, hypothetical form of blockchain or DAO governance. Every day we’re closer to Stross’s Accelerando.
Sahale — open source cascading workflow visualizer to help you make sense of tasks decomposed into Hadoop jobs. (via Code as Craft)
Apache NiFi — incubated open source project for data flow.
Tug Hospital Robot (Wired) — It may have an adult voice, but Tug has a childlike air, even though in this hospital you’re supposed to treat it like a wheelchair-bound old lady. It’s just so innocent, so earnest, and at times, a bit helpless. If there’s enough stuff blocking its way in a corridor, for instance, it can’t reroute around the obstruction. This happened to the Tug we were trailing in pediatrics. “Oh, something’s in its way!” a woman in scrubs says with an expression like she herself had ruined the robot’s day. She tries moving the wheeled contraption but it won’t budge. “Uh, oh!” She shoves on it some more and finally gets it to move. “Go, Tug, go!” she exclaims as the robot, true to its programming, continues down the hall.
Improving the Robustness of Complex Networks with Preserving Community Structure (PLoSone) — To improve robustness while minimizing the above three costly changes, we first seek to verify that the community structure of networks actually do identify the robustness and vulnerability of networks to some extent. Then, we propose an effective 3-step strategy for robustness improvement, which retains the degree distribution of a network, as well as preserves its community structure.
Crowdsourcing Isn’t Broken — great rundown of ways to keep crowdsourcing on track. As with open sourcing something, just throwing open the doors and hoping for the best has a low probability of success.
etcd Hits 2.0 — first major stable release of an open source, distributed, consistent key-value store for shared configuration, service discovery, and scheduler coordination.
You Can’t Play 20 Questions With Nature and Win (PDF) — There is, I submit, a view of the scientific endeavor that is implicit (and sometimes explicit) in the picture I have presented above. Science advances by playing 20 questions with nature. The proper tactic is to frame a general question, hopefully binary, that can be attacked experimentally. Having settled that bits-worth, one can proceed to the next. The policy appears optimal – one never risks much, there is feedback from nature at every step, and progress is inevitable. Unfortunately, the questions never seem to be really answered, the strategy does not seem to work. An old paper, but still resonant today. (via Mind Hacks)
The Uncanny Valley of Speech Recognition (Zach Holman) — I’m reminded of driving up US-280 in 2003 or so with @raelity, a Kiwi and a South African trying every permutation of American accent from Kentucky to Yosemite Sam in order to get TellMe to stop giving us the weather for zipcode 10000. It didn’t recognise the swearing either. (Caution: features similarly strong language.)
TuPAQ: An Efficient Planner for Large-scale Predictive Analytic Queries (PDF) — an integrated PAQ [Predictive Analytic Queries] planning architecture that combines advanced model search techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching. The resulting system, TUPAQ, solves the PAQ planning problem with comparable accuracy to exhaustive strategies but an order of magnitude faster, and can scale to models trained on terabytes of data across hundreds of machines.
p2pvc — point-to-point video chat. In an 80×25 terminal window.