"research" entries

Four short links: 7 January 2015

Four short links: 7 January 2015

Program Synthesis, Data Culture, Metrics, and Information Biology

  1. Program Synthesis ExplainedThe promise of program synthesis is that programmers can stop telling computers how to do things, and focus instead on telling them what they want to do. Inductive program synthesis tackles this problem with fairly vague specifications and, although many of the algorithms seem intractable, in practice they work remarkably well.
  2. Creating a Data-Driven Culture — new (free!) ebook from Hilary Mason and DJ Patil. The editor of that team is the luckiest human being alive.
  3. Ev Williams on Metrics — a master-class in how to think about and measure what matters. If what you care about — or are trying to report on — is impact on the world, it all gets very slippery. You’re not measuring a rectangle, you’re measuring a multi-dimensional space. You have to accept that things are very imperfectly measured and just try to learn as much as you can from multiple metrics and anecdotes.
  4. Nature, the IT Wizard (Nautilus) — a fun walk through the connections between information theory, computation, and biology.
Comment
Four short links: 2 January 2015

Four short links: 2 January 2015

Privacy Philosophy, Bitcoin Risks, Modelling Emotion, and Opinion Formation

  1. Google’s Philosopher — interesting take on privacy. Now that the mining and manipulation of personal information has spread to almost all aspects of life, for instance, one of the most common such questions is, “Who owns your data?” According to Floridi, it’s a misguided query. Your personal information, he argues, should be considered as much a part of you as, say, your left arm. “Anything done to your information,” he has written, “is done to you, not to your belongings.” Identity theft and invasions of privacy thus become more akin to kidnapping than stealing or trespassing. Informational privacy is “a fundamental and inalienable right,” he argues, one that can’t be overridden by concerns about national security, say, or public safety. “Any society (even a utopian one) in which no informational privacy is possible,” he has written, “is one in which no personal identity can be maintained.”
  2. S-1 for a Bitcoin Trust (SEC) — always interesting to read through the risks list to see what’s there and what’s not.
  3. Computationally Modelling Human Emotion (ACM) — our work seeks to create true synergies between computational and psychological approaches to understanding emotion. We are not satisfied simply to show our models “fit” human data but rather seek to show they are generative in the sense of producing new insights or novel predictions that can inform understanding. From this perspective, computational models are simply theories, albeit more concrete ones that afford a level of hypothesis generation and experimentation difficult to achieve through traditional theories.
  4. Opinion Formation Models on a Gradient (PLoSONE) — Many opinion formation models embedded in two-dimensional space have only one stable solution, namely complete consensus, in particular when they implement deterministic rules. In reality, however, deterministic social behavior and perfect agreement are rare – at least one small village of indomitable Gauls always holds out against the Romans. […] In this article we tackle the open question: can opinion dynamics, with or without a stochastic element, fundamentally alter percolation properties such as the clusters’ fractal dimensions or the cluster size distribution? We show that in many cases we retrieve the scaling laws of independent percolation. Moreover, we also give one example where a slight change of the dynamic rules leads to a radically different scaling behavior.
Comment
Four short links: 24 December 2014

Four short links: 24 December 2014

DRMed Objects, Eventual Consistency, Complex Systems, and Machine Learning Papers

  1. DRMed Cat Litter Box — the future is when you don’t own what you buy, and it’s illegal to make it work better. (via BoingBoing)
  2. Are We Consistent Yet? — the eventuality of consistency on different cloud platforms.
  3. How Complex Systems Fail (YouTube) — Richard Cook’s Velocity 2012 keynote.
  4. Interesting papers from NIPS 2014 — machine learning holiday reading.
Comment
Four short links: 23 December 2014

Four short links: 23 December 2014

Useful Metrics, Trouble at Mill, Drug R&D, and Disruptive Opportunities

  1. Metrics for Operational Performance — you’d be surprised how many places around your business you can meaningfully and productively track time-to-detection and time-to-resolution.
  2. Steel Mill Hacked — damage includes a blast furnace that couldn’t be shut down properly.
  3. Cerebros — drug-smuggling’s equivalent of corporate R&D. (via Regine Debatty)
  4. Ramble About Bitcoin (Matt Webb) — the meta I’m trying to figure out is: when you spot that one of these deep value chains is at the beginning of a big reconfiguration, what do you do? How do you enter it as a small business? How, as a national economy, do you help it along and make sure the transition happens healthily?
Comment
Four short links: 8 December 2014

Four short links: 8 December 2014

Systemic Improvement, Chinese Trends, Deep Learning, and Technical Debt

  1. Reith Lectures — this year’s lectures are by Atul Gawande, talking about preventable failure and systemic improvement — topics of particular relevance to devops cultural devotees. (via BoingBoing)
  2. Chinese Mobile App UI Trends — interesting differences between US and China. Phone number authentication interested me: You key in your number and receive a confirmation code via SMS. Here, all apps offer this type of phone number registration/login (if not prefer it). This also applies to websites, even those without apps. (via Matt Webb)
  3. Large Scale Deep Learning (PDF) — Jeff Dean from Google. Starts easy! Starts.
  4. Machine Learning: The High-Interest Credit Card of Technical Debt (PDF) — Google research paper on the ways in which machine learning can create problems rather than solve them.
Comment: 1
Four short links: 25 November 2014

Four short links: 25 November 2014

NSA Playset, Open Access, XSS Framework, and Security Test Cases

  1. Michael Ossman and the NSA Playset — the guy who read the leaked descriptions of the NSA’s toolchest, built them, and open sourced the designs. One device, dubbed TWILIGHTVEGETABLE, is a knock off of an NSA-built GSM cell phone that’s designed to sniff and monitor Internet traffic. The ANT catalog lists it for $15,000; the NSA Playset researchers built one using a USB flash drive, a cheap SDR, and an antenna, for about $50. The most expensive device, a drone that spies on WiFi traffic called PORCUPINEMASQUERADE, costs about $600 to assemble. At Defcon, a complete NSA Playset toolkit was auctioned by the EFF for $2,250.
  2. Gates Foundation Announces World’s Strongest Policy on Open Access Research (Nature) — Once made open, papers must be published under a license that legally allows unrestricted re-use — including for commercial purposes. This might include ‘mining’ the text with computer software to draw conclusions and mix it with other work, distributing translations of the text, or selling republished versions. CC-BY! We believe that published research resulting from our funding should be promptly and broadly disseminated.
  3. Xenotixan advanced Cross Site Scripting (XSS) vulnerability detection and exploitation framework. It provides Zero False Positive scan results with its unique Triple Browser Engine (Trident, WebKit, and Gecko) embedded scanner. It is claimed to have the world’s 2nd largest XSS Payloads of about 4700+ distinctive XSS Payloads for effective XSS vulnerability detection and WAF Bypass. Xenotix Scripting Engine allows you to create custom test cases and addons over the Xenotix API. It is incorporated with a feature-rich Information Gathering module for target Reconnaissance. The Exploit Framework includes offensive XSS exploitation modules for Penetration Testing and Proof of Concept creation.
  4. Firing Range — Google’s open source set of web security test cases for scanners.
Comment
Four short links: 13 November 2014

Four short links: 13 November 2014

Material Design, GitHub Communication, Priority Queues, and DevOps Learnings

  1. Materialize — another web implementation of Material Design.
  2. Communicating at Github — interesting take on making visible and optimising for the conversations and decisions that form culture but are otherwise invisible.
  3. MultiQueues — an approach for parallel access to priority queues.
  4. Devops LearningsWe view DevOps as the missing components of agile – the enabler for getting it out of the door and closing the loop between software engineer and customer.
Comment
Four short links: 10 November 2014

Four short links: 10 November 2014

Metascience, Bio Fab, Real-time Emoji, and Phone Library

  1. Metascience Could Rescue the Replication Crisis (Nature) — Metascience, the science of science, uses rigorous methods to examine how scientific practices influence the validity of scientific conclusions. (via Ed Yong)
  2. OpenTrons (Kickstarter) — 3d-printer style frame for micropipetting, magnetic micro-bead washes, and photography. Open source and kickstarterated. (via Evil Mad Scientist)
  3. Emoji Tracker — real-time emoji use across Twitter. (via Chris Aniszczyk)
  4. libphonenumber — open source Google’s common Java, C++ and Javascript library for parsing, formatting, storing and validating international phone numbers. The Java version is optimized for running on smartphones, and is used by the Android framework since 4.0 (Ice Cream Sandwich).
Comment
Four short links: 22 October 2014

Four short links: 22 October 2014

Docker Patterns, Better Research, Streaming Framework, and Data Science Textbook

  1. Eight Docker Development Patterns (Vidar Hokstad) — patterns for creating repeatable builds that result in as-static-as-possible server environments.
  2. How to Make More Published Research True (PLOSmedicine) — overview of efforts, and research on those efforts, to raise the proportion of published research which is true.
  3. Gearpump — Intel’s “actor-driven streaming framework”, initial benchmarks shows that we can process 2 million messages/second (100 bytes per message) with latency around 30ms on a cluster of 4 nodes.
  4. Foundations of Data Science (PDF) — These notes are a first draft of a book being written by Hopcroft and Kannan [of Microsoft Research] and in many places are incomplete. However, the notes are in good enough shape to prepare lectures for a modern theoretical course in computer science.
Comment
Four short links: 20 October 2014

Four short links: 20 October 2014

Leaky Search, Conditional Javascript, Software Proofs, and Fake Identity

  1. Fix Mac OS Xeach time you start typing in Spotlight (to open an application or search for a file on your computer), your local search terms and location are sent to Apple and third parties (including Microsoft) under default settings on Yosemite (10.10). See also Net Monitor, an open source toolkit for finding phone-home behaviour.
  2. A/B Testing at Netflix (ACM) — Using a combination of static analysis to build a dependency tree, which is then consumed at request time to resolve conditional dependencies, we’re able to build customized payloads for the millions of unique experiences across Netflix.com.
  3. Leslie Lamport Interview SummaryOne idea about formal specifications that Lamport tries to dispel is that they require mathematical capabilities that are not available to programmers: “The mathematics that you need in order to write specifications is a lot simpler than any programming language […] Anyone who can write C code, should have no trouble understanding simple math, because C code is a hell of a lot more complicated than” first-order logic, sets, and functions. When I was at uni, profs worked on distributed data, distributed computation, and formal correctness. We have the first two, but so much flawed software that I can only dream of the third arriving.
  4. Fake Identity — generate fake identity data when testing systems.
Comment