"math" entries

Four short links: 3 June 2015

Filter Design, Real-Time Analytics, Neural Turing Machines, and Evaluating Subjective Opinions

How to Design Applied Filters — The most frequently observed issue during usability testing was filtering values changing placement when the user applied them – either to another position in the list of filtering values (typically the top) or to an “Applied filters” summary overview. During testing, the subjects were often confounded as they noticed that the filtering value they just clicked was suddenly “no longer there.”
Twitter Heron — a real-time analytics platform that is fully API-compatible with Storm […] At Twitter, Heron is used as our primary streaming system, running hundreds of development and production topologies. Since Heron is efficient in terms of resource usage, after migrating all Twitter’s topologies to it we’ve seen an overall 3x reduction in hardware, causing a significant improvement in our infrastructure efficiency.
ntm — an implementation of neural Turing machines. (via @fastml_extra)
Bayesian Truth Serum — a scoring system for eliciting and evaluating subjective opinions from a group of respondents, in situations where the user of the method has no independent means of evaluating respondents’ honesty or their ability. It leverages respondents’ predictions about how other respondents will answer the same questions. Through these predictions, respondents reveal their meta-knowledge, which is knowledge of what other people know.
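To make the scoring idea above concrete, here is a minimal sketch of a Bayesian Truth Serum score in Python. It assumes the commonly described form of the method: an information score that rewards answers which turn out more common than the group predicted, plus a prediction score that penalises poor forecasts of the answer distribution. The function name `bts_scores`, the `alpha` weight, and the `eps` clipping are illustrative choices, not taken from the linked page.

```python
import math

def bts_scores(answers, predictions, alpha=1.0, eps=1e-9):
    """Score respondents with a simplified Bayesian Truth Serum.

    answers[r]     : index of the option respondent r chose
    predictions[r] : respondent r's predicted fraction of the group
                     choosing each option (one number per option)
    alpha, eps     : illustrative parameters (assumptions of this sketch)
    """
    n = len(answers)
    m = len(predictions[0])

    # Empirical frequency of each answer option.
    x_bar = [sum(1 for a in answers if a == k) / n for k in range(m)]

    # Log of the geometric mean of the predicted frequencies.
    log_y_bar = [
        sum(math.log(max(p[k], eps)) for p in predictions) / n
        for k in range(m)
    ]

    scores = []
    for a, p in zip(answers, predictions):
        # Information score: positive when the chosen answer is more
        # common than the group collectively predicted.
        info = math.log(x_bar[a]) - log_y_bar[a]
        # Prediction score: zero when the prediction matches the actual
        # frequencies exactly, negative otherwise.
        pred = alpha * sum(
            x_bar[k] * math.log(max(p[k], eps) / x_bar[k])
            for k in range(m)
            if x_bar[k] > 0
        )
        scores.append(info + pred)
    return scores

# Three respondents, two options; made-up numbers for illustration.
example = bts_scores(
    answers=[0, 0, 1],
    predictions=[[0.5, 0.5], [0.6, 0.4], [0.2, 0.8]],
)
```

In the made-up example, option 0 is chosen by two thirds of the group but predicted to be less popular than that, so the respondents who picked it score higher.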

Four short links: 1 June 2015

AI Drives, Decent Screencaps, HTTP/2 Antipatterns, Time Series

by Nat Torkington | @gnat | +Nat Torkington | June 1, 2015

The Basic AI Drives (PDF) — Surely, no harm could come from building a chess-playing robot, could it? In this paper, we argue that such a robot will indeed be dangerous unless it is designed very carefully. Without special precautions, it will resist being turned off, will try to break into other machines and make copies of itself, and will try to acquire resources without regard for anyone else’s safety. These potentially harmful behaviors will occur not because they were programmed in at the start, but because of the intrinsic nature of goal-driven systems.
PreTTY — how to take a good-looking screencap of your terminal app in action.
Why Some of Yesterday’s HTTP Best Practices are HTTP/2 Antipatterns — also functions as an overview of HTTP/2 for those of us who didn’t keep up with the standardization efforts.
Tisean — a software project for the analysis of time series with methods based on the theory of nonlinear deterministic dynamical systems. (via @aphyr)

Four short links: 11 May 2015

Age of Infrastructure, Facial Expressions, Proof Assistants, and Programmer Talent

by Nat Torkington | @gnat | +Nat Torkington | May 11, 2015

Welcome to the Age of Infrastructure (Annalee Newitz) — The Internet isn’t that thing in there, inside your little glowing box. It’s in your washing machine, kitchen appliances, pet feeder, your internal organs, your car, your streets, the very walls of your house. You use your wearable to interface with the world out there.
Facial Performance Sensing Head-Mounted Display (YouTube) — glorious use of an Oculus headset, to capture (for reproduction on an avatar) fine-grained facial expressions. From SIGGRAPH 2015.
Mathematical Proof Assistants — human augmentation in mathematics.
The Programmer Talent Myth (LWN) — Jacob Kaplan-Moss on the distribution of programmer talent and the damage that the bimodal myth causes.

Four short links: 30 April 2015

Managing Complex Data Projects, Graphical Linear Algebra, Consistent Hashing, and NoTCP Manifesto

by Nat Torkington | @gnat | +Nat Torkington | April 30, 2015

More Tools for Managing and Reproducing Complex Data Projects (Ben Lorica) — As I survey the landscape, the types of tools remain the same, but interfaces continue to improve, and domain specific languages (DSLs) are starting to appear in the context of data projects. One interesting trend is that popular user interface models are being adapted to different sets of data professionals (e.g. workflow tools for business users).
Graphical Linear Algebra — or “Graphical The-Subject-That-Kicked-Nat’s-Butt” as I read it.
Consistent Hashing: A Guide and Go Implementation — easy-to-follow article (and source).
NoTCP Manifesto — a nice summary of the reasons to build custom protocols over UDP, masquerading as church-nailed heresy. Today’s heresy is just the larval stage of tomorrow’s constricting orthodoxy.
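For the consistent hashing link above, a rough sketch of the idea in Python rather than Go. This is not the article's code: the class name `HashRing`, the `replicas` count, and the choice of SHA-256 are all assumptions made for illustration. Each node is placed at many points on a ring, and a key belongs to the first node point at or after the key's own hash.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node
        self._keys = []           # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owner:
                continue  # extremely unlikely collision: keep first owner
            self._owner[pos] = node
            bisect.insort(self._keys, pos)

    def remove(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def get(self, key):
        if not self._keys:
            raise KeyError("ring is empty")
        # First ring position clockwise from the key, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.get("user:42")
```

The point of the structure is that removing a node only reassigns the keys that node owned; everything else stays where it was.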

Four short links: 14 April 2015

Technical Debt, A/A Testing, NSA's Latest, and John von Neumann

by Nat Torkington | @gnat | +Nat Torkington | April 14, 2015

PyCon 2015: Technical Debt, The Monster in Your Closet (YouTube) — excellent talk from PyCon. See also slides.
A/A Testing — In an A/A test, you run a test using the exact same options for both “variants” in your test. That’s right, there’s no difference between “A” and “B” in an A/A test. It sounds stupid, until you see the “results.” (via Nelson Minar)
NSA Declares War on General-Purpose Computing (BoingBoing) — NSA director Michael S Rogers says his agency wants “front doors” to all cryptography used in the USA, so that no one can have secrets it can’t spy on — but what he really means is that he wants to be in charge of which software can run on any general purpose computer.
John von Neumann Documentary (YouTube) — 1966 documentary from the Mathematical Association of America on the father of digital computing, who is also hailed as the father of game theory and much, much more. (via Paul Walker)
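The A/A testing link above is easy to try for yourself. The sketch below simulates many A/A tests in which both "variants" share the same true conversion rate, then counts how often a standard two-proportion z-test calls the difference significant. The function names, traffic numbers, and conversion rate are made up for illustration; the linked post may measure things differently.

```python
import math
import random

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_aa_tests(trials=1000, visitors=1000, rate=0.10,
                      alpha=0.05, seed=0):
    """Fraction of A/A tests that look 'significant' at level alpha."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # Both arms draw from the same conversion rate.
        a = sum(rng.random() < rate for _ in range(visitors))
        b = sum(rng.random() < rate for _ in range(visitors))
        if two_proportion_p_value(a, visitors, b, visitors) < alpha:
            false_positives += 1
    return false_positives / trials

if __name__ == "__main__":
    print(simulate_aa_tests())
```

The fraction it prints should land near `alpha`: with nothing different between the arms, roughly one test in twenty still declares a "winner" at the 5% level.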

Four short links: 7 April 2015

JavaScript Numeric Methods, Misunderstood Statistics, Web Speed, and Sentiment Analysis

by Nat Torkington | @gnat | +Nat Torkington | April 7, 2015

NumericJS — numerical methods in JavaScript.
P Values are not Error Probabilities (PDF) — In particular, we illustrate how this mixing of statistical testing methodologies has resulted in widespread confusion over the interpretation of p values (evidential measures) and α levels (measures of error). We demonstrate that this confusion was a problem between the Fisherian and Neyman–Pearson camps, is not uncommon among statisticians, is prevalent in statistics textbooks, and is well nigh universal in the pages of leading (marketing) journals. This mass confusion, in turn, has rendered applications of classical statistical testing all but meaningless among applied researchers.
Breaking the 1000ms Time to Glass Mobile Barrier (YouTube) — See also slides. Stay under 250 ms to feel “fast.” Stay under 1000 ms to keep users’ attention.
Modern Methods for Sentiment Analysis — Recently, Google developed a method called Word2Vec that captures the context of words, while at the same time reducing the size of the data. Gentle introduction, with code.

Four short links: 18 February 2015

Sales Automation, Clone Boxes, Stats Style, and Extra Orifices

by Nat Torkington | @gnat | +Nat Torkington | February 18, 2015

Systematising Sales with Software and Processes — sweet use of Slack as UI for sales tools.
Duplicate SSH Keys Everywhere — It looks like all devices with the fingerprint are Dropbear SSH instances that have been deployed by Telefonica de Espana. It appears that some of their networking equipment comes set up with SSH by default, and the manufacturer decided to reuse the same operating system image across all devices.
Style.ONS — UK govt style guide covers the elements of writing about statistics. It aims to make statistical content more open and understandable, based on editorial research and best practice. (via Hadley Beeman)
Warren Ellis on the Apple Watch — I, personally, want to put a gold chain on my phone, pop it into a waistcoat pocket, and refer to it as my “digital fob watch” whenever I check the time on it. Just to make the point in as snotty and high-handed a way as possible: This is the decadent end of the current innovation cycle, the part where people stop having new ideas and start adding filigree and extra orifices to the stuff we’ve got and call it the future.

Four short links: 19 December 2014

Statistical Causality, Clustering Bitcoin, Hardware Security, and A Language for Scripts

by Nat Torkington | @gnat | +Nat Torkington | December 19, 2014

Distinguishing Cause and Effect using Observational Data — research paper evaluating effectiveness of the “additive noise” test, a nifty statistical trick to identify causal relationships from observational data. (via Slashdot)
Clustering Bitcoin Accounts Using Heuristics (O’Reilly Radar) — In theory, a user can go by many different pseudonyms. If that user is careful and keeps the activity of those different pseudonyms separate, completely distinct from one another, then they can really maintain a level of, maybe not anonymity, but again, cryptographically it’s called pseudo-anonymity. […] It turns out in reality, though, the way most users and services are using bitcoin, was really not following any of the guidelines that you would need to follow in order to achieve this notion of pseudo-anonymity. So, basically, what we were able to do is develop certain heuristics for clustering together different public keys, or different pseudonyms.
A Primer on Hardware Security: Models, Methods, and Metrics (PDF) — Camouflaging: This is a layout-level technique to hamper image-processing-based extraction of gate-level netlist. In one embodiment of camouflaging, the layouts of standard cells are designed to look alike, resulting in incorrect extraction of the netlist. The layout of nand cell and the layout of nor cell look different and hence their functionality can be extracted. However, the layout of a camouflaged nand cell and the layout of camouflaged nor cell can be made to look identical and hence an attacker cannot unambiguously extract their functionality.
Prompter: A Domain-Specific Language for Versu (PDF) — literally a scripting language (you write theatrical-style scripts, characters, dialogues, and events) for an inference engine that lets you talk to characters and have a different story play out each time.
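The Bitcoin clustering excerpt above does not say which heuristics were used. As one assumed example, the widely described "common input ownership" heuristic treats all input addresses of a single transaction as belonging to the same user. A minimal union-find sketch of that idea, with hypothetical names and toy data:

```python
from collections import defaultdict

class DisjointSet:
    """Union-find over arbitrary hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited item at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def cluster_addresses(transactions):
    """Group addresses that ever appear together as transaction inputs.

    transactions: list of lists of input addresses (toy representation,
    an assumption of this sketch).
    """
    ds = DisjointSet()
    for inputs in transactions:
        for addr in inputs:
            ds.find(addr)  # register every address, even singletons
        for addr in inputs[1:]:
            ds.union(inputs[0], addr)
    clusters = defaultdict(set)
    for addr in list(ds.parent):
        clusters[ds.find(addr)].add(addr)
    return list(clusters.values())

# Toy data: A and B co-spend, then B and C, so all three are linked.
toy_clusters = cluster_addresses([["A", "B"], ["B", "C"], ["D"], ["E", "F"]])
```

A user who never spends from two pseudonyms in the same transaction would not be linked by this heuristic, which matches the excerpt's point that careful separation preserves pseudo-anonymity.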

Four short links: 11 September 2014

Win98 Retro, Glass as Sensor, Theoretical CS, and Code Search

by Nat Torkington | @gnat | +Nat Torkington | September 11, 2014

windows_98.css — the compelling new look that’s sweeping the world all over again.
BioGlass (MIT) — use Glass’s accelerometer, gyroscope, and camera to extract pulse and respiratory rates. (via MIT Tech Review)
Building Blocks for Theoretical Computer Science — free online textbook covering what I lovingly think of as “the mathy bits of computing that are so damn hard”.
The Platinum Searcher — code search tool similar to ack and ag. It supports multiple platforms and multiple encodings. Written in Go, and fast.

Four short links: 14 August 2014

Ceramic 3D Printing, Robo Proofs, Microservice Fail, and Amazing Graphics Tweaks

by Nat Torkington | @gnat | +Nat Torkington | August 14, 2014

$700 Ceramic-Spitting 3D Printer (Make Magazine) — ceramic printing is super interesting, not least because it doesn’t fill the world with plastic glitchy bobbleheads.
Mathematics in the Age of the Turing Machine (arXiv) — a survey of mathematical proofs that rely on computer calculations and formal proofs. (via Victoria Stodden)
Failing at Microservices — a deconstruction of a failed stab at microservices. Category three engineers also presented a significant problem to our implementation. In many cases, these engineers implemented services incorrectly; in one example, an engineer had literally wrapped and hosted one microservice within another because he didn’t understand how the services were supposed to communicate if they were in separate processes (or on separate machines). These engineers also had a tough time understanding how services should be tested, deployed, and monitored because they were so used to the traditional “throw the service over the fence” to an admin approach to deployment. This basically led to huge amounts of churn and loss of productivity.
Transient Attributes for High-Level Understanding and Editing of Outdoor Scenes — computer vision doing more amazing things: annotate scenes (e.g., sunsets, seasons), train, then be able to adjust images. Tweak how much sunset there is in your pic? Wow.