Nat has chaired the O'Reilly Open Source Convention and other O'Reilly conferences for over a decade. He ran the first web server in New Zealand, co-wrote the best-selling Perl Cookbook, and was one of the founding Radar bloggers. He lives in New Zealand and consults in the Asia-Pacific region.
Teach Don’t Tell — what I think good documentation is and how I think you should go about writing it. Sample common sense: This is obvious when you’re working face-to-face with someone. When you tell them how to play a C major chord on the guitar and they only produce a strangled squeak, it’s clear that you need to slow down and talk about how to press down on the strings properly. As programmers, we almost never get this kind of feedback about our documentation. We don’t see that the person on the other end of the wire is hopelessly confused and blundering around because they’re missing something we thought was obvious (but wasn’t). Teaching someone in person helps you learn to anticipate this, which will pay off (for your users) when you’re writing documentation.
Molecular Programming Project — aims to develop computer science principles for programming information-bearing molecules like DNA and RNA to create artificial biomolecular programs of similar complexity. Our long-term vision is to establish molecular programming as a subdiscipline of computer science — one that will enable a yet-to-be imagined array of applications from chemical circuitry for interacting with biological molecules to nanoscale computing and molecular robotics.
The Software Analysis Workbench — provides the ability to formally verify properties of code written in C, Java, and Cryptol. It leverages automated SAT and SMT solvers to make this process as automated as possible, and provides a scripting language, called SAW Script, to enable verification to scale up to more complex systems. “Non-commercial” license.
What’s Wrong with Deep Learning? (PDF in Google Drive) — What’s missing from deep learning? 1. Theory; 2. Reasoning, structured prediction; 3. Memory, short-term/working/episodic memory; 4. Unsupervised learning that actually works. … and then ways to get those things. Caution: math ahead.
pinot — a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally.
Naiad: A Timely Dataflow System — in Timely Dataflow, the first two features are needed to execute iterative and incremental computations with low latency. The third feature makes it possible to produce consistent results, at both outputs and intermediate stages of computations, in the presence of streaming or iteration.
What is Code (Paul Ford) — What the coders aren’t seeing, you have come to believe, is that the staid enterprise world that they fear isn’t the consequence of dead-eyed apathy but rather détente. Words and feels.
The Untold Story of Microsoft’s Surface Hub (FastCo) — great press placement from Microsoft, but good to hear what Jeff Han has been working on. And interesting comment on the value of manufacturing in the US: “I don’t have to send my folks over to China, so they’re happier,” Han says. “It’s faster. There’s no language, time, or culture barrier to deal with. To have my engineers go down the hallway to talk to the guys in the manufacturing line and tune the recipe? That’s just incredible.”
Five Years of Google Closure (Derek Slager) — Despite the lack of popularity, a number of companies have successfully used Google Closure for their production applications. Medium, Yelp, CloudKick (acquired by Rackspace), Cue (acquired by Apple), and IMS Health (my company) all use (or have used) Google Closure to power their production applications. And, importantly, the majority of Google’s flagship applications continue to be powered by Google Closure.
Moving Fast with Software Verification (Facebook) — This paper describes our experience in integrating a verification tool based on static analysis into the software development cycle at Facebook. Contains a brief description of dev and release processes at Facebook: no QA …
The Declarative Imperative (Morning Paper) — on Dataflow. …a large class of recursive programs – all of basic Datalog – can be parallelized without any need for coordination. As a side note, this insight appears to have eluded the MapReduce community, where join is necessarily a blocking operator.
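To make the "no coordination" point concrete, here is a minimal sketch (my illustration, not taken from the paper) of the classic Datalog program path(x,y) :- edge(x,y). path(x,z) :- path(x,y), edge(y,z). The two "workers" and the helper name `transitive_closure` are assumptions for the example. Because the rules only ever add facts, each worker can derive paths from its own share of the edges, and merging the partial results reaches the same fixpoint as a single global evaluation.

```python
def transitive_closure(edges):
    """Naive fixpoint for:
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    """
    edges = set(edges)
    path = set(edges)
    while True:
        # Join path with edge on the shared middle node; keep only new facts.
        new = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2} - path
        if not new:
            return path
        path |= new


# Hypothetical split of the edge relation across two workers.
worker_a = [(1, 2), (3, 4)]
worker_b = [(2, 3)]

# Each worker computes what it can locally, with no coordination.
partial = transitive_closure(worker_a) | transitive_closure(worker_b)

# Merging partial results and continuing to the fixpoint gives the same
# answer as evaluating over all edges at once: facts are never retracted.
merged = transitive_closure(partial)
global_result = transitive_closure(worker_a + worker_b)
```

Here `merged` and `global_result` are the same set of six paths. A program that used negation or aggregation over `path` would not have this property, which is where coordination comes back in.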
Consensual Reality (Alistair Croll) — Among other things we discussed what Inbar calls his three rules for augmented reality design: 1. The content you see has to emerge from the real world and relate to it. 2. Should not distract you from the real world; must add to it. 3. Don’t use it when you don’t need it. If a film is better on the TV watch the TV.
X-Rays Behaving Badly — According to the report, medical devices – in particular so-called picture archive and communications systems (PACS) radiologic imaging systems – are all but invisible to security monitoring systems and provide a ready platform for malware infections to lurk on hospital networks, and for malicious actors to launch attacks on other, high value IT assets. Among the revelations contained in the report: A malware infection at a TrapX customer site spread from an unmonitored PACS system to a key nurse’s workstation. The result: confidential hospital data was secreted off the network to a server hosted in Guiyang, China. Communications went out encrypted using port 443 (SSL) and were not detected by existing cyber defense software, so TrapX said it is unsure how many records may have been stolen.
The Online Privacy Lie is Unraveling (TechCrunch) — The report’s authors argue it’s this sense of resignation that is resulting in data tradeoffs taking place — rather than consumers performing careful cost-benefit analysis to weigh up the pros and cons of giving up their data (as marketers try to claim). They also found that where consumers were most informed about marketing practices they were also more likely to be resigned to not being able to do anything to prevent their data being harvested. Something that didn’t make me regret clicking on a TechCrunch link.
Psychology of Software Architecture — a wonderful piece of writing, but this stood out: It comes down to behavioral economics and game theory. The license we choose modifies the economics of those who use our work.
Internet Users Increasingly Blocking Ads, Including on Mobiles (The Economist) — mobile networks are working on ad blockers for their customers. If lots of mobile subscribers did switch it on, it would give European carriers what they have long sought: some way of charging giant American online firms for the strain those firms put on their mobile networks. Google and Facebook, say, might have to pay the likes of Deutsche Telekom and Telefónica to get on to their whitelists.
Connascence (Wikipedia) — a taxonomy of (systems) coupling. Two components are connascent if a change in one would require the other to be modified in order to maintain the overall correctness of the system. (Via Ben Gracewood.)
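A small sketch of two entries in that taxonomy, connascence of position and connascence of name. The function names and the user record are made up for illustration. With positional arguments, every caller has to agree with the definition on argument order; with keyword-only arguments, callers only have to agree on names, which is usually considered the weaker form of coupling.

```python
# Hypothetical example; none of these names come from the linked article.

def make_user_positional(name, email, age):
    # Connascence of position: reordering these parameters silently
    # breaks every caller that passes arguments by position.
    return {"name": name, "email": email, "age": age}


def make_user_named(*, name, email, age):
    # Connascence of name: callers depend only on the parameter names,
    # so the order here can change without touching any call site.
    return {"name": name, "email": email, "age": age}


a = make_user_positional("Ada", "ada@example.com", 36)
b = make_user_named(age=36, email="ada@example.com", name="Ada")
```

Both calls build the same record, but only the second survives a reordering of the parameter list.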
New Hardware and the Internet of Things (Jon Bruner) — The Internet of Things and the new hardware movement are not the same thing. The new hardware movement is driven by new tools for: Prototyping (inexpensive 3D printers, CNC machine tools, cheap and powerful microcontrollers, high-level programming languages on embedded systems); Fundraising and business development (Highway1, Lab IX); Manufacturing (PCH, Seeed); Marketing (Etsy, Quirky). The IoT is driven by: Ubiquitous connectivity; Cheap hardware (i.e., the new hardware movement); Inexpensive data processing and machine learning.
OpenCV 3.0 Released — I hadn’t realised how much hardware acceleration comes out of the box with OpenCV.
FBI: Companies Should Help us Prevent Encryption (WaPo) — as Mike Loukides says, we are in a Post-Modern age where we don’t trust our computers and they don’t trust us. It’s jarring to hear the organisation that (over-zealously!) investigates computer crime arguing that citizens should not be able to secure their communications. It’s like police arguing against locks.
cockroach — a scalable, geo-replicated, transactional datastore. The Wired piece about it drops the factoid that the creators of GIMP worked on Colossus, Google’s massive successor to the Google File System. From Photoshop-alike to massive file systems. Love it.
Pocket Guide to DARPA Robotics Challenge Finals (Robohub) — The robots will start in a vehicle, drive to a simulated disaster building, and then they’ll have to open doors, walk on rubble, and use tools. Finally, they’ll have to climb a flight of stairs. The fastest team with the same amount of points for completing tasks will win. The main issues teams will face are communications with their robot and battery life: “Even the best batteries are still roughly 10 times less energy-dense than the kinds of fuels we all use to get around,” said Pratt.
Monolith First — echoes the idea that platforms should come from successful apps (the way AWS emerged from operating the Amazon store) rather than be designed before use.
Building a More Assured Hardware Security Module (PDF) — proposal for an open source reference design for HSMs that is scalable (first cut in an FPGA and CPU, with higher-speed options later); composable (e.g., “Give me a key store and signer suitable for DNSsec”); and reasonably assured through being open, having a diverse design team, and using an increasingly assured tool-chain. See cryptech.is for more info.
How to Design Applied Filters — The most frequently observed issue during usability testing was filtering values changing placement when the user applied them – either to another position in the list of filtering values (typically the top) or to an “Applied filters” summary overview. During testing, the subjects were often confounded as they noticed that the filtering value they just clicked was suddenly “no longer there.”
Twitter Heron — a real-time analytics platform that is fully API-compatible with Storm […] At Twitter, Heron is used as our primary streaming system, running hundreds of development and production topologies. Since Heron is efficient in terms of resource usage, after migrating all Twitter’s topologies to it we’ve seen an overall 3x reduction in hardware, causing a significant improvement in our infrastructure efficiency.
Bayesian Truth Serum — a scoring system for eliciting and evaluating subjective opinions from a group of respondents, in situations where the user of the method has no independent means of evaluating respondents’ honesty or their ability. It leverages respondents’ predictions about how other respondents will answer the same questions. Through these predictions, respondents reveal their meta-knowledge, which is knowledge of what other people know.
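A rough sketch of how such a score can be computed, based on my reading of Prelec's published scoring rule rather than anything on the linked page. Each respondent gets an information score (is their answer more common than the group predicted?) plus alpha times a prediction score (how well did they forecast the actual answer distribution?). The function name `bts_scores`, the sample data, and the finite-sample shortcuts are my assumptions: predictions must be strictly positive, and answers nobody chose are skipped in the prediction term.

```python
import math


def bts_scores(answers, predictions, alpha=1.0):
    """answers[r] is the index of the option respondent r endorsed.
    predictions[r][k] is r's predicted share of respondents choosing k
    (assumed strictly positive). Returns one score per respondent."""
    n = len(answers)
    m = len(predictions[0])
    # Actual share of respondents endorsing each option.
    xbar = [sum(1 for a in answers if a == k) / n for k in range(m)]
    # Log of the geometric mean of the predicted shares for each option.
    log_ybar = [sum(math.log(p[k]) for p in predictions) / n for k in range(m)]

    scores = []
    for a, p in zip(answers, predictions):
        # Information score: rewards "surprisingly common" answers.
        info = math.log(xbar[a]) - log_ybar[a]
        # Prediction score: penalty for mis-forecasting the distribution.
        pred = sum(
            xbar[k] * math.log(p[k] / xbar[k]) for k in range(m) if xbar[k] > 0
        )
        scores.append(info + alpha * pred)
    return scores


# Three hypothetical respondents, two answer options.
answers = [0, 0, 1]
predictions = [[0.6, 0.4], [0.7, 0.3], [0.5, 0.5]]
scores = bts_scores(answers, predictions)
```

With alpha set to 1 the scores sum to zero across respondents, so the scheme only redistributes credit within the group.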