Nat has chaired the O'Reilly Open Source Convention and other O'Reilly conferences for over a decade. He ran the first web server in New Zealand, co-wrote the best-selling Perl Cookbook, and was one of the founding Radar bloggers. He lives in New Zealand and consults in the Asia-Pacific region.
TweetNLP — CMU open source natural language parsing tools for making sense of Tweets.
Interview with Google X Life Science’s Head (Medium) — I will have been here two years this March. In nineteen months we have been able to hire more than a hundred scientists to work on this. We’ve been able to build customized labs and get the equipment to make nanoparticles and decorate them and functionalize them. We’ve been able to strike up collaborations with MIT and Stanford and Duke. We’ve been able to initiate protocols and partnerships with companies like Novartis. We’ve been able to initiate trials like the baseline trial. This would be a good decade somewhere else. The power of focus and money.
Schooloscope Open Data Post-Mortem — The case of Schooloscope and the wider question of public access to school data challenges the belief that sunlight is the best disinfectant, that government transparency would always lead to better government, better results. It challenges the sentiments that see data as value-neutral and its representation as devoid of politics. In fact, access to school data exposes a sharp contrast between the private interest of the family (best education for my child) and the public interest of the government (best education for all citizens).
M-Lab Observatory — explorable data on the data experience (RTT, upload speed, etc.) across different ISPs in different geographies over time.
Build Quality In — an e-book collection of Continuous Delivery and DevOps experience reports from the wild. Work in progress, and a collection of accumulated experience in the new software engineering practices can’t be a bad thing.
Designing for Large-Screen Cellphones (Luke Wroblewski) — In his analysis of 1,333 observations of smartphones in use, Steven Hoober found about 75% of people rely on their thumb and 49% rely on a one-handed grip to get things done on their phones. On large screens (over four inches) those kinds of behaviors can stretch people’s thumbs well past their comfort zone as they try to reach controls positioned at the top of their device. Design advice to create interactions that don’t strain tendons or gray matter.
fastsocket (GitHub) — a highly scalable socket and its underlying networking implementation of Linux kernel. With the straight linear scalability, Fastsocket can provide extremely good performance in multicore machines.
Content Moderation (Wired) — “content moderators” are the people paid to weed out beheadings, pornography, etc. from photo and video sites. By at least one estimate, the number of content moderators scrubbing the world’s social media sites, mobile apps, and cloud storage services runs to “well over 100,000”—that is, about twice the total head count of Google and nearly 14 times that of Facebook.
PaGMO — Parallel Global Multiobjective Optimizer […] a generalization of the island model paradigm working for global and local optimization algorithms. Its main parallelization approach makes use of multiple threads, but MPI is also implemented and can be mixed in with multithreading. PaGMO can be used to solve in a parallel fashion, global optimization tasks.
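To make the island model idea in that description concrete, here is a minimal sketch in plain Python. It is not PaGMO's API: the function names, the hill-climbing step, the ring migration rule, and all parameters are assumptions chosen for illustration, and the islands run one after another here rather than in threads or over MPI.

```python
import random
from typing import Callable, List

Vector = List[float]


def sphere(x: Vector) -> float:
    """Toy objective to minimize; its optimum is the origin."""
    return sum(v * v for v in x)


def evolve_island(population: List[Vector], objective: Callable[[Vector], float],
                  rng: random.Random, steps: int, step_size: float = 0.1) -> None:
    """Local search on one island: perturb each individual, keep improvements."""
    for _ in range(steps):
        for i, individual in enumerate(population):
            candidate = [v + rng.gauss(0.0, step_size) for v in individual]
            if objective(candidate) < objective(individual):
                population[i] = candidate


def island_model(objective: Callable[[Vector], float], num_islands: int = 4,
                 pop_size: int = 5, dim: int = 3, epochs: int = 10,
                 steps: int = 20, seed: int = 0) -> Vector:
    """Evolve several populations separately, migrating the best between them."""
    rng = random.Random(seed)
    islands = [[[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                for _ in range(pop_size)]
               for _ in range(num_islands)]
    for _ in range(epochs):
        # In a real island model this loop is the part that runs in parallel.
        for population in islands:
            evolve_island(population, objective, rng, steps)
        # Ring migration: each island's best replaces the worst on the next island.
        bests = [min(population, key=objective) for population in islands]
        for i, best in enumerate(bests):
            neighbour = islands[(i + 1) % num_islands]
            worst = max(range(len(neighbour)), key=lambda j: objective(neighbour[j]))
            neighbour[worst] = list(best)
    return min((ind for population in islands for ind in population), key=objective)


if __name__ == "__main__":
    best = island_model(sphere)
    print(best, sphere(best))
```

The islands only exchange information at migration time, which is why the per-island work is easy to hand to separate threads or processes.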
Avoiding the Tragedy of the Anticommons — Many people talk about “open source biology.” Mike Loukides pulls apart open source and biology to see what the relationship might be. I’m still chewing on what devops for bio would be. Modern software systems throw off gigabytes of data, and we have built tools to monitor those systems, archive their data, and automate much of the analysis. There are free and commercial packages for logging and monitoring, and it continues to be a very active area of software development, as anyone who’s attended O’Reilly’s Velocity conference knows.
peppytides (Makezine) — 3D-printed, super-accurate scaled 3D model of a polypeptide chain that can be folded into all the basic protein structures, like α-helices, β-sheets, and β-turns. (via Lenore Edman)
London Data Store — dashboard and open data catalogue for the Greater London Authority’s data release efforts.
Creating Empathy on Facebook (NY Times) — On Facebook, teenagers are presented with more options than just “it’s embarrassing” when they want to remove a post. They are asked what’s happening in the post, how they feel about it and how sad they are. In addition, they are given a text box with a polite pre-written response that can be sent to the friend who hurt their feelings. (In early versions of this feature, only 20 percent of teenagers filled out the form. When Facebook added more descriptive language like “feelings” and “sadness,” the figure grew to 80 percent.)
Gearpump — Intel’s “actor-driven streaming framework”. Initial benchmarks show that we can process 2 million messages/second (100 bytes per message) with latency around 30ms on a cluster of 4 nodes.
Foundations of Data Science (PDF) — These notes are a first draft of a book being written by Hopcroft and Kannan [of Microsoft Research] and in many places are incomplete. However, the notes are in good enough shape to prepare lectures for a modern theoretical course in computer science.
The Delusions of Big Data (IEEE) — When you have large amounts of data, your appetite for hypotheses tends to get even larger. And if it’s growing faster than the statistical strength of the data, then many of your inferences are likely to be false. They are likely to be white noise.
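A rough way to see why a growing appetite for hypotheses produces false inferences: the sketch below computes the chance of at least one false positive when many true-null hypotheses are tested at the same significance level. It assumes the tests are independent, which the article does not claim, and the function names and the 0.05 default are illustrative choices rather than anything from the source.

```python
def family_wise_error_rate(num_hypotheses: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent true-null tests."""
    return 1.0 - (1.0 - alpha) ** num_hypotheses


def bonferroni_threshold(num_hypotheses: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / num_hypotheses


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        print(m, round(family_wise_error_rate(m), 4), bonferroni_threshold(m))
```

With a fixed dataset, every extra hypothesis either raises the odds of a spurious hit or, if the threshold is corrected as above, demands stronger evidence per test.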
ROSCon 2014 — slides and videos of talks from the open source robotics conference in Chicago.
Making Sure Crypto Stays Insecure (PDF) — Daniel J. Bernstein talk: This talk is actually a thought experiment: how could an attacker manipulate the ecosystem for insecurity?
Material Design Icons — Google’s CC-licensed (attribution, sharealike) collection of sweet, straightforward icons.
Fix Mac OS X — each time you start typing in Spotlight (to open an application or search for a file on your computer), your local search terms and location are sent to Apple and third parties (including Microsoft) under default settings on Yosemite (10.10). See also Net Monitor, an open source toolkit for finding phone-home behaviour.
A/B Testing at Netflix (ACM) — Using a combination of static analysis to build a dependency tree, which is then consumed at request time to resolve conditional dependencies, we’re able to build customized payloads for the millions of unique experiences across Netflix.com.
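The idea in that quote can be sketched as a dependency graph whose edges may be conditional on an experiment, walked at request time for one user's set of active tests. This is only an illustration under assumptions: the module names, the flag name, and the graph format are hypothetical, and the article's actual data structures are not shown here.

```python
from typing import Dict, List, Optional, Set, Tuple

# module -> list of (dependency, flag). A flag of None means "always needed";
# otherwise the dependency is included only when that experiment flag is active.
# In the described setup this graph would come from static analysis at build time.
DependencyGraph = Dict[str, List[Tuple[str, Optional[str]]]]

GRAPH: DependencyGraph = {
    "homepage": [("header", None), ("search_v2", "new_search")],
    "search_v2": [("autocomplete", None)],
    "header": [],
    "autocomplete": [],
}


def resolve_payload(graph: DependencyGraph, entry: str,
                    active_flags: Set[str]) -> List[str]:
    """Return the modules needed for `entry`, dependencies before dependents."""
    payload: List[str] = []
    seen: Set[str] = set()

    def visit(module: str) -> None:
        if module in seen:
            return
        seen.add(module)
        for dependency, flag in graph.get(module, []):
            # Follow the edge only if it is unconditional or its test is active.
            if flag is None or flag in active_flags:
                visit(dependency)
        payload.append(module)

    visit(entry)
    return payload


if __name__ == "__main__":
    print(resolve_payload(GRAPH, "homepage", set()))
    print(resolve_payload(GRAPH, "homepage", {"new_search"}))
```

Because the graph is built once ahead of time, the request-time work is just a traversal filtered by the user's test allocations.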
Leslie Lamport Interview Summary — One idea about formal specifications that Lamport tries to dispel is that they require mathematical capabilities that are not available to programmers: “The mathematics that you need in order to write specifications is a lot simpler than any programming language […] Anyone who can write C code, should have no trouble understanding simple math, because C code is a hell of a lot more complicated than” first-order logic, sets, and functions. When I was at uni, profs worked on distributed data, distributed computation, and formal correctness. We have the first two, but so much flawed software that I can only dream of the third arriving.
Fake Identity — generate fake identity data when testing systems.
Project Naptha — automatically applies state-of-the-art computer vision algorithms on every image you see while browsing the web. The result is a seamless and intuitive experience, where you can highlight as well as copy and paste and even edit and translate the text formerly trapped within an image. Chrome extension. (via Anil Dash)
Garbage Trucks and FedEx Vans (IEEE) — Foo alum Ian Wright found traction for his electric car biz by selling powertrains for garbage trucks and FedEx vans. Trucks have 20-30 year lifetimes, but powertrains are replaced several times; the trucks for fleets are custom; and “The average garbage truck in the U.S. spends $55,000 a year on fuel, and up to $30,000 a year on maintenance, mostly brake replacements.”
Microsoft’s Quantum Mechanics (MIT TR) — the race for the “topological qubit”, involving newly-discovered fundamental particles and large technology companies racing to be the first to make something that works.
Eye Catcher (We Make Money Not Art) — the most banal-looking wooden frame takes thus a life of its own as soon as you come near it. It quickly positions itself in front of you, spots your eyes and starts expressing ‘emotions’ based on your own. Eye Catcher uses the arm of an industrial robot, high power magnets, a hidden pinhole camera, ferrofluid and emotion recognition algorithms to explore novel interactive interfaces based on the mimicry and exchange of expressions.
FORTIS Exoskeleton (Lockheed Martin) — transfers loads through the exoskeleton to the ground in standing or kneeling positions and allows operators to use heavy tools as if they were weightless. (via CNN)
Homebrew Cray-1A – fascinating architecture, but also lovely hobby project to build the homebrew. The lack of Cray software archives horrifies the amateur historian in me, though. When I started building this, I thought “Oh, I’ll just swing by the ol’ Internet and find some groovy 70’s-era software to run on it.” It turns out I was wrong. One of the sad things about pre-internet machines (especially ones that were primarily purchased by 3-letter Government agencies) is that practically no software exists for them. After searching the internet exhaustively, I contacted the Computer History Museum and they didn’t have any either. They also informed me that apparently SGI destroyed Cray’s old software archives before spinning them off again in the late 90’s.
How Do Committees Invent? — 1968 paper that gave us Conway’s Law: “organizations which design systems […] produce designs which are copies of the communication structures of these organizations.” That was the 1968 version of the modern “your website’s sitemap is your org chart”.