Machine Learning Done Wrong — When dealing with small amounts of data, it’s reasonable to try as many algorithms as possible and to pick the best one since the cost of experimentation is low. But as we hit “big data,” it pays off to analyze the data upfront and then design the modeling pipeline (pre-processing, modeling, optimization algorithm, evaluation, productionization) accordingly.
Ten Simple Rules for Lifelong Learning According to Richard Hamming (PLoScompBio) — Exponential growth of the amount of knowledge is a central feature of the modern era. As Hamming points out, since the time of Isaac Newton (1642/3-1726/7), the total amount of knowledge (including but not limited to technical fields) has doubled about every 17 years. At the same time, the half-life of technical knowledge has been estimated to be about 15 years. If the total amount of knowledge available today is x, then in 15 years the total amount of knowledge can be expected to be nearly 2x, while the amount of knowledge that has become obsolete will be about 0.5x. This means that the total amount of knowledge thought to be valid has increased from x to nearly 1.5x. Taken together, this means that if your daughter or son was born when you were 34 years old, the amount of knowledge she or he will be faced with on entering university at age 17 will be more than twice the amount you faced when you started college.
Adopting Microservices at Netflix: Lessons for Architectural Design — you want to think of servers like cattle, not pets. If you have a machine in production that performs a specialized function, and you know it by name, and everyone gets sad when it goes down, it’s a pet. Instead you should think of your servers like a herd of cows. What you care about is how many gallons of milk you get. If one day you notice you’re getting less milk than usual, you find out which cows aren’t producing well and replace them. People for Ethical Treatment of Iron, your time has come!
Your Job is Not to Write Code (Laura Klein) — I know what you’re thinking. This will all take so long! I’ll be so much less effective! This isn’t true. You’ll be far more effective because you will actually be doing your job. Amen to it all.
Duplicate SSH Keys Everywhere — It looks like all devices with the fingerprint are Dropbear SSH instances that have been deployed by Telefonica de Espana. It appears that some of their networking equipment comes set up with SSH by default, and the manufacturer decided to reuse the same operating system image across all devices.
Style.ONS — UK govt style guide covers the elements of writing about statistics. It aims to make statistical content more open and understandable, based on editorial research and best practice. (via Hadley Beeman)
Warren Ellis on the Apple Watch — I, personally, want to put a gold chain on my phone, pop it into a waistcoat pocket, and refer to it as my “digital fob watch” whenever I check the time on it. Just to make the point in as snotty and high-handed a way as possible: This is the decadent end of the current innovation cycle, the part where people stop having new ideas and start adding filigree and extra orifices to the stuff we’ve got and call it the future.
The Home and the Mobile Supply Chain (Benedict Evans) — the small hardware start-up, and the cool new gizmos from drones to wearables, are possible because of the low price of components built at the scale required for Apple and other mobile device makers. (via Matt Webb)
Why I Am Not a Maker (Deb Chachra) — The problem is the idea that the alternative to making is usually not doing nothing — it’s almost always doing things for and with other people, from the barista to the Facebook community moderator to the social worker to the surgeon. Describing oneself as a maker — regardless of what one actually or mostly does — is a way of accruing to oneself the gendered, capitalist benefits of being a person who makes products.
Pattern — a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, a HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and <canvas> visualization.
Developer Inequality (Jonathan Edwards) — The bigger injustice is that programming has become an elite: a vocation requiring rare talents, grueling training, and total dedication. The way things are today if you want to be a programmer you had best be someone like me on the autism spectrum who has spent their entire life mastering vast realms of arcane knowledge — and enjoys it. Normal humans are effectively excluded from developing software. (via Slashdot)
Signals From Foo Camp (O’Reilly Radar) — useful for me (aka “the stuff I didn’t get to see”), hopefully useful to you too. Companies outside of Silicon Valley badly want to understand it and want to find ways to truly collaborate with it, but they’re worried that conversations can turn into competition. “Old industry” has incredible expertise and operates in very complex environments, and it has much to teach tech, if tech will listen. Silicon Valley isn’t an IT department for the world, it’s the competition.
Big Data Should Not Be a Faith-Based Initiative (Cory Doctorow) — Re-identification is part of the Big Data revolution: among the new meanings we are learning to extract from huge corpuses of data is the identity of the people in that dataset. And since we’re commodifying and sharing these huge datasets, they will still be around in ten, twenty and fifty years, when those same Big Data advancements open up new ways of re-identifying — and harming — their subjects.
Charlie Stross on 2034 — every object in the real world is going to be providing a constant stream of metadata about its environment — and I mean every object. The frameworks used for channeling this firehose of environment data are going to be insecure and ramshackle, with foundations built on decades-old design errors. (via BoingBoing)
Minimum Viable Bureaucracy (Laura Thomson) — notes from her Velocity talk. A portion of engineer’s time must be spent on what engineer thinks is important. It may be 100%. It may be 60%, 40%, 20%. But it should never be zero.
Reflections on Solid Conference — recap of the conference, great for those of us who couldn’t make it. “Software is eating the world…. Hardware gives it teeth.” – Renee DiResta
Cybernation: The Silent Conquest (1962) — [When] computers acquire the necessary capabilities…speeded-up data processing and interpretation will be necessary if professional services are to be rendered with any adequacy. Once the computers are in operation, the need for additional professional people may be only moderate […] There will be a small, almost separate, society of people in rapport with the advanced computers. These cyberneticians will have established a relationship with their machines that cannot be shared with the average man any more than the average man today can understand the problems of molecular biology, nuclear physics, or neuropsychiatry. Indeed, many scholars will not have the capacity to share their knowledge or feeling about this new man-machine relationship. Those with the talent for the work probably will have to develop it from childhood and will be trained as intensively as the classical ballerina. (via Simon Wardley)
Maximum Happy Imagination (Matt Jones) — questioning the true vision of Marc Andreessen’s recent Twitter discourse on the great future that awaits us. His analogies run out in the 20th century when it comes to the political, social and economic implications of his maximum happy imagination.
The Mirrortocracy — It’s astonishing how many of the people conducting interviews and passing judgement on the careers of candidates have had no training at all on how to do it well. Aside from their own interviews, they may not have ever seen one. I’m all for learning on your own but at least when you write a program wrong it breaks. Without a natural feedback loop, interviewing mostly runs on myth and survivor bias.
Longitude Prize — six prize areas, Grand Challenge style, in clean flight, antibiotic resistance, dementia, food, water, and overcoming paralysis. Mysteriously none for library system that avoids DLL hell.