Nat has chaired the O'Reilly Open Source Convention and other O'Reilly conferences for over a decade. He ran the first web server in New Zealand, co-wrote the best-selling Perl Cookbook, and was one of the founding Radar bloggers. He lives in New Zealand and consults in the Asia-Pacific region.
Antilogs — There are companies before you who have done something like you want to do that you can copy from, and others who have also done something similar, but that you choose not to copy from. These are your analogs and antilogs respectively.
Korean Meal-Transport Robot (RoboHub) — the hyphen is important. It transports all meals, not just Korean ones. Interesting not only grammatically, but for the gradual arrival of the service robot.
wit.ai — Natural language processing for the Internet of Things. Startup, racing to build strategic value beyond “have brought voice recognition to IRC bots and aimed it at Internet of Things investors.”
ReviewNinja — a lightweight code review tool that works with GitHub, providing a more structured way to use pull requests for code review. ReviewNinja dispenses with elaborate voting systems, and supports hassle-free committing and merging for acceptable changes.
PlotDevice — A Python-based graphics language for designers, developers, and tinkerers. More in the easy-to-get-started + visual realm, like Processing. (via Andy Baio)
Scumblr and Sketchy Search — Netflix open sourcing some scraping, screenshot, and workflow tools their security team uses to monitor discussion of themselves.
Should Twitter, Facebook and Google Executives be the Arbiters of What We See and Read? (Glenn Greenwald) — In the digital age, we are nearing the point where an idea banished by Twitter, Facebook and Google all but vanishes from public discourse entirely, and that is only going to become more true as those companies grow even further. Whatever else is true, the implications of having those companies make lists of permitted and prohibited ideas are far more significant than when ordinary private companies do the same thing.
Liquibase — source control for your database. Apache 2.0 licensed.
A Few Useful Things to Know About Machine Learning (PDF) — This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions. My fave: First-timers are often surprised by how little time in a machine learning project is spent actually doing machine learning. But it makes sense if you consider how time-consuming it is to gather data, integrate it, clean it and pre-process it, and how much trial and error can go into feature design.
The Poisoned NUL Byte, 2014 Edition (Project Zero) — from Google’s public security efforts, this detailed public description of how an exploit was constructed from a found vulnerability. They’re helping. Kudos!
Myths About the Coming Robot Economy (Erik Sofge) — the entire discussion of the so-called robot economy, with its predictions of vast, permanent unemployment rates and glacial productivity gains, is nothing more than a wild guess. A strong pushback on the Pew Report (PDF): Frey and Osborne’s analysis is full of logical leaps, and far-reaching conclusions drawn from cursory observations about robots that have yet to replace humans.
Content for Sensitive Situations (Luke Wroblewski) — People have all kinds of feelings when interacting with your content. When someone’s needs are being met they may feel very different than when their needs are not being met. How can you meet people’s needs?
Urban Villages (Senseable City at MIT) — People who live in a larger town make more calls and call a larger number of different people. The scaling of this relation is ‘superlinear,’ meaning that on average, if the size of a town doubles, the sum of phone contacts in the city will more than double – in a mathematically predictable way. Surprisingly, however, group clustering (the odds that your friends mutually know one another) does not change with city size. It seems that even in large cities we tend to build tightly knit communities, or ‘villages,’ around ourselves. There is an important difference, though: if in a real village our connections might simply be defined by proximity, in a large city we can elect a community based on any number of factors, from affinity to interest to sexual preference. (via Flowing Data)
Dat — an open source project that provides a streaming interface between every file format and data storage backend. See the Wired piece on it.
Smithsonian Crowdsourcing Transcription (Smithsonian) — 49 volunteers transcribed 200 pages of correspondence between the Monuments Men in a week. Soon it’ll be mathematics test questions: “if 49 people transcribe 200 pages in 7 days, how many weeks will it take …”
MIT Guide to Family CompSci Sessions — This guide is for educators, community center staff, and volunteers interested in engaging their young people and their families to become designers and inventors in their community.
Machine Learning for Plant Properties — startup building database of plant genomics, properties, research, etc. for mining. The more familiar you are with your data and its meaning, the better your machine learning will be at suggesting fruitful lines of query … and the more valuable your startup will be.
Dissecting Message Queues — throughput, latency, and qualitative comparison of different message queues. MQs are to modern distributed architectures what function calls were to historic unibox architectures.
1915 Data Visualization Rules — a reminder that data visualization is not new, but research into effectiveness of alternative presentation styles is.