Making Wrong Code Look Wrong (Joel Spolsky) — This makes mistakes even more visible. Your eyes will learn to “see” smelly code, and this will help you find obscure security bugs just through the normal process of writing code and reading code.
Simple Testing Can Prevent Most Critical Failures — We found the majority of catastrophic failures could easily have been prevented by performing simple testing on error handling code – the last line of defense – even without an understanding of the software design. We extracted three simple rules from the bugs that have lead to some of the catastrophic failures, and developed a static [Java] checker, Aspirator, capable of locating these bugs. One of the tests is a FIXME or TODO in an exception handler.
Quantum Machine Learning Algorithms: Read the Fine Print (Scott Aaronson) — In the years since HHL, quantum algorithms achieving “exponential speedups over classical algorithms” have been proposed for other major application areas […]. With each of them, one faces the problem of how to load a large amount of classical data into a quantum computer (or else compute the data “on-the-fly”), in a way that is efficient enough to preserve the quantum speedup.
Global Forecast System — National Weather Service open sources its weather forecasting software. Hope you have a supercomputer and all the data to make use of it …
High-reproducibility and high-accuracy method for automated topic classification — Latent Dirichlet allocation (LDA) is the state of the art in topic modeling. Here, we perform a systematic theoretical and numerical analysis that demonstrates that current optimization techniques for LDA often yield results that are not accurate in inferring the most suitable model parameters. Adapting approaches from community detection in networks, we propose a new algorithm that displays high reproducibility and high accuracy and also has high computational efficiency. We apply it to a large set of documents in the English Wikipedia and reveal its hierarchical structure.
Comcast (Github) — Comcast is a tool designed to simulate common network problems like latency, bandwidth restrictions, and dropped/reordered/corrupted packets. On BSD-derived systems such as OSX, we use tools like ipfw and pfctl to inject failure. On Linux, we use iptables and tc. Comcast is merely a thin wrapper around these controls.
The UX Reader — This ebook is a collection of the most popular articles from our [MailChimp] UX Newsletter, along with some exclusive content.
Bad Assumptions — Apple lost more money to currency fluctuations than Google makes in a quarter.
Note and Vote (Google Ventures) — nifty meeting hack to surface ideas and identify popular candidates to a decision maker.
Applying Psychology to Improve Online Behaviour — online game runs massive experiments (w/researchers to validate findings) to improve the behaviour of their players. Some of Riot’s experiments are causing the game to evolve. For example, one product is a restricted chat mode that limits the number of messages abusive players can type per match. It’s a temporary punishment that has led to a noticeable improvement in player behavior afterward —on average, individuals who went through a period of restricted chat saw 20 percent fewer abuse reports filed by other players. The restricted chat approach also proved 4 percent more effective at improving player behavior than the usual punishment method of temporarily banning toxic players. Even the smallest improvements in player behavior can make a huge difference in an online game that attracts 67 million players every month.
Decentralised Autonomous Corporations — Charlie Stross’s near-future fiction of Accelerando comes closer to reality: Malice – revenge for waking him up – sharpens Manfred’s voice. “The president of agalmic.holdings.root.184.97.AB5 is agalmic.holdings.root.184.97.201. The secretary is agalmic.holdings.root.184.D5, and the chair is agalmic.holdings.root.184.E8.FF. All the shares are owned by those companies in equal measure, and I can tell you that their regulations are written in Python. Have a nice day, now!” He thumps the bedside phone control and sits up, yawning, then pushes the do-not-disturb button before it can interrupt again. After a moment he stands up and stretches, then heads to the bathroom to brush his teeth, comb his hair, and figure out where the lawsuit originated and how a human being managed to get far enough through his web of robot companies to bug him.
Coding is Not the New Literacy (Chris Grainger) — We build mental models of everything – from how to tie our shoes to the way macro-economic systems work. With these, we make decisions, predictions, and understand our experiences. If we want computers to be able to compute for us, then we have to accurately extract these models from our heads and record them. Writing Python isn’t the fundamental skill we need to teach people. Modeling systems is. Amen!
Pattern — a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, a HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and <canvas> visualization.
Reset (Rowan Simpson) — It was a bit chilling to go back over a whole years worth of tweets and discover how many of them were just junk. Visiting the water cooler is fine, but somebody who spends all day there has no right to talk of being full.
Google’s AI Brain — on the subject of Google’s AI ethics committee … Q: Will you eventually release the names? A: Potentially. That’s something also to be discussed. Q: Transparency is important in this too. A: Sure, sure. Such reassuring.
AVA is now Open Source (Laura Bell) — Assessment, Visualization and Analysis of human organisational information security risk. AVA maps the realities of your organisation, its structures and behaviors. This map of people and interconnected entities can then be tested using a unique suite of customisable, on-demand, and scheduled information security awareness tests.
Deep Learning for Torch (Facebook) — Facebook AI Research open sources faster deep learning modules for Torch, a scientific computing framework with wide support for machine learning algorithms.
The Devops Identity Crisis (Baron Schwartz) — I saw one framework-retailing bozo saying that devops was the art of ensuring there were no flaws in software. I didn’t know whether to cry or keep firing until the gun clicked.
Apache Giraph — an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.
Apache Flink — a data processing system and an alternative to Hadoop’s MapReduce component. It comes with its own runtime, rather than building on top of MapReduce. As such, it can work completely independently of the Hadoop ecosystem. However, Flink can also access Hadoop’s distributed file system (HDFS) to read and write data, and Hadoop’s next-generation resource manager (YARN) to provision cluster resources. Since most Flink users are using Hadoop HDFS to store their data, we ship already the required libraries to access HDFS.
Internet of Things: Blackett Review — the British Government’s review of Internet of Things opportunities around government. Government and others can use expert commissioning to encourage participants in demonstrator programmes to develop standards that facilitate interoperable and secure systems. Government as a large purchaser of IoT systems is going to have a big impact if it buys wisely. (via Matt Webb)
rdbms-subsetter — open source tool to generate a random sample of rows from a relational database that preserves referential integrity – so long as constraints are defined, all parent rows will exist for child rows. (via 18F)
UXcheck — a browser extension to help you do a quick UX check against Nielsen’s 10 principles.
The Toxoplasma of Rage — It’s in activists’ interests to destroy their own causes by focusing on the most controversial cases and principles, the ones that muddy the waters and make people oppose them out of spite. And it’s in the media’s interest to help them and egg them on.
Samza: LinkedIn’s Stream-Processing Engine — Samza’s goal is to provide a lightweight framework for continuous data processing. Unlike batch processing systems such as Hadoop, which typically has high-latency responses (sometimes hours), Samza continuously computes results as data arrives, which makes sub-second response times possible.