Nat Torkington

Nat has chaired the O'Reilly Open Source Convention and other O'Reilly conferences for over a decade. He ran the first web server in New Zealand, co-wrote the best-selling Perl Cookbook, and was one of the founding Radar bloggers. He lives in New Zealand and consults in the Asia-Pacific region.

Four short links: 23 January 2015

Investment Themes, Python Web Mining, Code Review, and Sexist Brilliance

  1. 16 Andreessen-Horowitz Investment Areas — I’m struck by how they’re connected: there’s a cluster around cloud development, there are two maybe three on sensors …
  2. Pattern — a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, a HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and <canvas> visualization.
  3. Code Review — FogCreek’s code review checklist.
  4. Expectations of Brilliance Underlie Gender Distributions Across Academic Disciplines (Science) — Surveys revealed that some fields are believed to require attributes such as brilliance and genius, whereas other fields are believed to require more empathy or hard work. In fields where people thought that raw talent was required, academic departments had lower percentages of women. (via WaPo)
Four short links: 22 January 2015

MSVR, The Facebook, Social Robots, and Testing Microservices

  1. Microsoft HoloLens Goggles (Wired) — a media release about the next thing from the person behind Kinect. I’m still trying to figure out (as are investors, I’m sure) where in the hype curve this goggles/AR/etc. amalgam lives. Is it only a tech proof-of-concept? Is it a games device like Kinect? Is it good and cheap enough for industrial apps? Or is this the long-awaited climb out of irrelevance for Virtual Reality?
  2. The Facebook (YouTube) — brilliant fake 1995 ad for The Facebook. Excuse me, I’m off to cleanse.
  3. Natural Language in Social Robotics (Robohub) — Natural language interfaces are turning into a de-facto interface convention. Just like the GUI overlapped and largely replaced the command line, NLP is now being used by robots, the Internet of things, wearables, and especially conversational systems like Apple’s Siri, Google’s Now, Microsoft’s Cortana, Nuance’s Nina, Amazon’s Echo and others. These interfaces are designed to simplify, speed up, and improve task completion. Natural language interaction with robots, if anything, is an interface. It’s a form of UX that requires design.
  4. Microservices and Testing (Martin Fowler) — testing across component boundaries, in the face of failing data stores and HTTP timeouts. The first discussion of testing in a web-scale world that I’ve seen from The Mainstream.
Four short links: 21 January 2015

Mousey PC, Sad G+, Medium Data, and Upgraded DARPA Contest Robot

  1. PC in a Mouse — 80s = PC in a keyboard. 90s = PC in a box. 2000s = PC in the screen. 2015 we get PC in a mouse. By 2020 will circuitry be inline in the cable or connector?
  2. Estimating G+ Usage (BoingBoing) — of 2.2B profiles, 6.6M have made new public posts in 2015. Yeesh.
  3. Medium Data — too big for one machine, but barely worth the overhead of high-volume data processing.
  4. New Hardware for the DARPA Robotics Challenge Finals (IEEE) — in the future, we’ll all have a 3.7 kWh battery and a wireless router in our heads.
Four short links: 20 January 2015

Govt IoT, Collective Intelligence, Unknown Excellence, and Questioning Scalability

  1. Matt Webb Joining British Govt Data Service — working on IoT for them.
  2. Reading the Mind in the Eyes or Reading between the Lines? Theory of Mind Predicts Collective Intelligence (PLoS) — theory of mind abilities are a significant determinant of group collective intelligence even when, as in many online groups, the group has extremely limited communication channels. Phone/Skype calls, emails, and chats are all intensely mental activities, trying to picture the person behind the signal.
  3. MIT Faculty Search — two open gigs at MIT, one around climate change and one “undefined.” Great job ad.
  4. Scalability at What Cost? — evaluation of these systems, especially in the academic context, is lacking. Folks have gotten all wound-up about scalability, despite the fact that scalability is just a means to an end (performance, capacity). When we actually look at performance, the benefits the scalable systems bring start to look much more sketchy. We’d like that to change.
Four short links: 19 January 2015

Going Offline, AI Ethics, Human Risks, and Deep Learning

  1. Reset (Rowan Simpson) — It was a bit chilling to go back over a whole years worth of tweets and discover how many of them were just junk. Visiting the water cooler is fine, but somebody who spends all day there has no right to talk of being full.
  2. Google’s AI Brain — on the subject of Google’s AI ethics committee … Q: Will you eventually release the names? A: Potentially. That’s something also to be discussed. Q: Transparency is important in this too. A: Sure, sure. Such reassuring.
  3. AVA is now Open Source (Laura Bell) — Assessment, Visualization and Analysis of human organisational information security risk. AVA maps the realities of your organisation, its structures and behaviors. This map of people and interconnected entities can then be tested using a unique suite of customisable, on-demand, and scheduled information security awareness tests.
  4. Deep Learning for Torch (Facebook) — Facebook AI Research open sources faster deep learning modules for Torch, a scientific computing framework with wide support for machine learning algorithms.
Four short links: 16 January 2015

RF Snooping, Class and Tech, Nuclear Option, and Carbon Fibre

  1. It’s Getting Easier for Hackers to Spy on Your Computer When It’s Offline (Vice) — surprisingly readable coverage of determining computer activity from RF signals.
  2. An Old Fogey’s Analysis of a Teenager’s View on Social Media — Teens’ use of social media is significantly shaped by race and class, geography, and cultural background.
  3. Putting the Nuclear Option Front and Centre (Tom Armitage) — offering what feels like the nuclear option front and centre, reminding the user that it isn’t a nuclear option. I love this. “Undo” changes your experience profoundly.
  4. 3D-Printing Carbon Fibre (Makezine) — the machine doesn’t produce angular, stealth fighter-esque pieces with the telltale CF pattern seen on racing bikes and souped up Mustangs. Instead, it creates an FDM 3D print out of nylon filament (rather than ABS or PLA), and during the process it layers in a thin strip of carbon fiber, melted into place from carbon fiber fabric using a second extruder head. (It can also add in kevlar or fiberglass.)
Four short links: 15 January 2015

Secure Docker Deployment, Devops Identity, Graph Processing, and Hadoop Alternative

  1. Docker Secure Deployment Guidelines — deployment checklist for securely deploying Docker.
  2. The Devops Identity Crisis (Baron Schwartz) — I saw one framework-retailing bozo saying that devops was the art of ensuring there were no flaws in software. I didn’t know whether to cry or keep firing until the gun clicked.
  3. Apache Giraph — an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.
  4. Apache Flink — a data processing system and an alternative to Hadoop’s MapReduce component. It comes with its own runtime, rather than building on top of MapReduce. As such, it can work completely independently of the Hadoop ecosystem. However, Flink can also access Hadoop’s distributed file system (HDFS) to read and write data, and Hadoop’s next-generation resource manager (YARN) to provision cluster resources. Since most Flink users are using Hadoop HDFS to store their data, we ship already the required libraries to access HDFS.
Four short links: 14 January 2015

IoT and Govt, Exactly Once, Random Database Subset, and UX Checking

  1. Internet of Things: Blackett Review — the British Government’s review of Internet of Things opportunities around government. Government and others can use expert commissioning to encourage participants in demonstrator programmes to develop standards that facilitate interoperable and secure systems. Government as a large purchaser of IoT systems is going to have a big impact if it buys wisely. (via Matt Webb)
  2. Exactly Once Semantics with Kafka — designing for failure means it’s easier to ensure that things get done than it is to ensure that things get done exactly once.
  3. rdbms-subsetter — open source tool to generate a random sample of rows from a relational database that preserves referential integrity – so long as constraints are defined, all parent rows will exist for child rows. (via 18F)
  4. UXcheck — a browser extension to help you do a quick UX check against Nielsen’s 10 principles.
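
The point in the Kafka item above, that "done at least once" is easier to guarantee than "done exactly once", can be sketched with a consumer that makes redelivery harmless. This is a minimal illustration in plain Python, not the linked article's design and not the Kafka client API: the `IdempotentCounter` class, its `handle` method, and the sample deliveries are all hypothetical. It assumes offsets within a partition only ever increase.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentCounter:
    """Applies each (partition, offset) at most once, even if redelivered."""

    total: int = 0
    # Highest offset already applied, per partition (hypothetical bookkeeping).
    last_applied: dict = field(default_factory=dict)

    def handle(self, partition: int, offset: int, amount: int) -> bool:
        # Offsets within a partition only increase, so anything at or below
        # the last applied offset is a redelivery and can be skipped.
        if offset <= self.last_applied.get(partition, -1):
            return False
        # In a real system the state change and the offset would have to be
        # persisted atomically; here both simply live in the same object.
        self.total += amount
        self.last_applied[partition] = offset
        return True


def replay_with_duplicates():
    counter = IdempotentCounter()
    # (partition, offset, amount); offset 1 on partition 0 arrives twice,
    # as it might after a consumer crash and restart.
    deliveries = [(0, 0, 5), (0, 1, 7), (0, 1, 7), (1, 0, 3), (0, 2, 1)]
    applied = [counter.handle(*d) for d in deliveries]
    return counter.total, applied
```

The delivery layer is still only at-least-once; the exactly-once effect comes from storing the offset alongside the result, so that replaying a message changes nothing.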
Four short links: 13 January 2015

Slack Culture, Visualizations of Text Analysis, Wearables and Big Data, and Snooping on Keyboards

  1. Building the Workplace We Want (Slack) — culture is the manifestation of what your company values. What you reward, who you hire, how work is done, how decisions are made — all of these things are representations of the things you value and the culture you’ve wittingly or unwittingly created. Nice (in the sense of small, elegant) explanation of what they value at Slack.
  2. Interpretation and Trust: Designing Model-Driven Visualizations for Text Analysis (PDF) — Based on our experiences and a literature review, we distill a set of design recommendations and describe how they promote interpretable and trustworthy visual analysis tools.
  3. The Internet of Things Has Four Big Data Problems (Alistair Croll) — What the IoT needs is data. Big data and the IoT are two sides of the same coin. The IoT collects data from myriad sensors; that data is classified, organized, and used to make automated decisions; and the IoT, in turn, acts on it. It’s precisely this ever-accelerating feedback loop that makes the coin as a whole so compelling. Nowhere are the IoT’s data problems more obvious than with that darling of the connected tomorrow known as the wearable. Yet, few people seem to want to discuss these problems.
  4. Keysweeper — a stealthy Arduino-based device, camouflaged as a functioning USB wall charger, that wirelessly and passively sniffs, decrypts, logs, and reports back (over GSM) all keystrokes from any Microsoft wireless keyboard in the vicinity. Designs and demo videos included.
Four short links: 12 January 2015

Designed-In Outrage, Continuous Data Processing, Lisp Processors, and Anomaly Detection

  1. The Toxoplasma of Rage — It’s in activists’ interests to destroy their own causes by focusing on the most controversial cases and principles, the ones that muddy the waters and make people oppose them out of spite. And it’s in the media’s interest to help them and egg them on.
  2. Samza: LinkedIn’s Stream-Processing Engine — Samza’s goal is to provide a lightweight framework for continuous data processing. Unlike batch processing systems such as Hadoop, which typically has high-latency responses (sometimes hours), Samza continuously computes results as data arrives, which makes sub-second response times possible.
  3. Design of LISP-Based Processors (PDF) — 1979 MIT AI Lab memo on design of hardware specifically for Lisp. Legendary subtitle! LAMBDA: The Ultimate Opcode.
  4. rAnomalyDetection — Twitter’s R package for detecting anomalies in time-series data. (via Twitter Engineering blog)
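
For a feel of what "detecting anomalies in time-series data" means in the last item above, here is a deliberately simple stand-in. It is not the algorithm in Twitter's R package and it ignores trend and seasonality: it is a robust z-score built on the median and the median absolute deviation (MAD). The function name `mad_anomalies`, the 3.5 threshold, and the sample series are assumptions chosen for illustration.

```python
import statistics


def mad_anomalies(values, threshold=3.5):
    """Return indices of points whose robust z-score exceeds the threshold."""
    med = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # At least half the points equal the median; no usable scale.
        return []
    # 0.6745 rescales MAD so scores are comparable to standard z-scores
    # when the underlying data is roughly normal.
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical readings with one obvious spike at index 6.
readings = [10, 11, 9, 10, 12, 10, 50, 11, 9, 10]
spikes = mad_anomalies(readings)
```

Using the median rather than the mean keeps the spike itself from inflating the baseline it is measured against, which is why the sketch uses MAD in place of a standard deviation.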