"scale" entries

Four short links: 26 November 2014

Metastable Failures, Static Python Analysis, Material Desktop, and AWS Scale Numbers

by Nat Torkington | @gnat | +Nat Torkington | November 26, 2014

Metastable Failure State (Facebook) — very nice story about working together to discover the cause of one of those persistently weird problems.
Bandit — static security analysis of Python code.
Quantum OS — Linux desktop based on Google’s Material Design. UI guidelines fascinate me: users love consistency, designers and brands hate that everything works the same.
Inside AWS — Every day, AWS installs enough server infrastructure to host the entire Amazon e-tailing business from back in 2004, when Amazon the retailer was one-tenth its current size at $7 billion in annual revenue. “What has changed in the last year,” Hamilton asked rhetorically, and then quipped: “We have done it 365 more times.” That is another way of saying that in the past year AWS has added enough capacity to support a $2.55 trillion online retailing operation, should one ever be allowed to exist.
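The arithmetic behind that last figure is easy to check. A minimal sketch, using only the two numbers quoted in the item ($7 billion in 2004 revenue, 365 daily installs); the constant and function names are made up for illustration:

```python
# Figures quoted in the Inside AWS item; the names below are illustrative only.
REVENUE_2004_USD = 7_000_000_000  # Amazon retail revenue in 2004, per the quote
INSTALLS_PER_YEAR = 365           # "We have done it 365 more times."


def implied_retail_capacity(revenue_per_install: int, installs: int) -> int:
    """Retail revenue the added capacity could support, one 2004-Amazon per install."""
    return revenue_per_install * installs


total = implied_retail_capacity(REVENUE_2004_USD, INSTALLS_PER_YEAR)
print(f"${total / 1e12:.3f} trillion")
```

The product is $2.555 trillion, which the quoted article gives as $2.55 trillion.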

Four short links: 3 September 2014

Distributed Systems Theory, Chinese Manufacturing, Quantified Infant, and Celebrity Data Theft

by Nat Torkington | @gnat | +Nat Torkington | September 3, 2014

Distributed Systems Theory for the Distributed Systems Engineer — I tried to come up with a list of what I consider the basic concepts that are applicable to my every-day job as a distributed systems engineer; what I consider ‘table stakes’ for distributed systems engineers competent enough to design a new system.
Shenzhen Trip Report (Joi Ito) — full of fascinating observations about how the balance of manufacturing strength has shifted in surprising ways. The retail price of the cheapest full featured phone is about $9. Yes. $9. This could not be designed in the US – this could only be designed by engineers with tooling grease under their fingernails who knew the manufacturing equipment inside and out, as well as the state of the art of high-end mobile phones.
Sproutling — The world’s first sensing, learning, predicting baby monitor. A wearable band for your baby, a smart charger and a mobile app work together to not only monitor more effectively but learn and predict your baby’s sleep habits and optimal sleep conditions. (via Wired)
Notes on the Celebrity Data Theft — wonderfully detailed analysis of how photos were lifted, and the underground industry built around them. This was one of the most unsettling aspects of these networks to me – knowing there are people out there who are turning over data on friends in their social networks in exchange for getting a dump of their private data.

Four short links: 25 August 2014

Digital Signs, Reverse Engineering Censorship, USB Protection, and Queue Software

by Nat Torkington | @gnat | +Nat Torkington | August 25, 2014

Greenscreen — Chromecast-based open source software for digital signs.
Reverse Engineering Censorship in Chinese Cyberspace (PDF) — researchers create accounts and probe to see which things are blocked. Empirical transparency.
USB Condom — A protective barrier between your device and “juice-jacking” hackers.
queues.io — long list of job queues, message queues, and other such implementations.

Four short links: 20 August 2014

Plant Properties, MQ Comparisons, 1915 Vis, and Mobile Web Weaknesses

by Nat Torkington | @gnat | +Nat Torkington | August 20, 2014

Machine Learning for Plant Properties — startup building database of plant genomics, properties, research, etc. for mining. The more familiar you are with your data and its meaning, the better your machine learning will be at suggesting fruitful lines of query … and the more valuable your startup will be.
Dissecting Message Queues — throughput, latency, and qualitative comparison of different message queues. MQs are to modern distributed architectures what function calls were to historic unibox architectures.
1915 Data Visualization Rules — a reminder that data visualization is not new, but research into effectiveness of alternative presentation styles is.
The Broken Promise of the Mobile Web — it’s not just about the UI – it’s also about integration with the mobile device.

Four short links: 18 August 2014

Space Trading, Robot Capitalism, Packet Injection, and CAP Theorem

by Nat Torkington | @gnat | +Nat Torkington | August 18, 2014

Oolite — open-source clone of Elite, the classic space trading game from the 80s.
Who Owns the Robots Rules The World (PDF) — interesting finding: As companies substitute machines and computers for human activity, workers need to own part of the capital stock that substitutes for them to benefit from these new “robot” technologies. Workers could own shares of the firm, hold stock options, or be paid in part from the profits. Without ownership stakes, workers will become serfs working on behalf of the robots’ overlords. Governments could tax the wealthy capital owners and redistribute income to workers, but that is not the direction societies are moving in. Workers need to own capital rather than rely on government income redistribution policies. (via Robotenomics)
Schrodinger’s Cat Video and the Death of Clear-Text (Morgan Marquis-Boire) — report, based on leaked information, about the use of network injection appliances targeting unencrypted pages from major providers. Compromising a target becomes as simple as waiting for the user to view unencrypted content on the Internet.
CAP 12 Years Later: How the Rules Have Changed — a rundown of strategies available to deal with partitions (“outages”) in a distributed system.

Four short links: 13 August 2014

Thinking Machines, Chemical Sensor, Share Containerised Apps, and Visualising the Net Neutrality Comments

by Nat Torkington | @gnat | +Nat Torkington | August 13, 2014

Viv — another step in the cognition race. Wolfram Alpha was first out of the gate, but Watson, Viv, and others are hot on its heels in being able to parse complex requests, then seek and use information to fulfil them.
Universal Mobile Electrochemical Detector Designed for Use in Resource-limited Applications (PNAS) — $35 handheld sensor with mobile phone connection. The electrochemical methods that we demonstrate enable quantitative, broadly applicable, and inexpensive sensing with flexibility based on a wide variety of important electroanalytical techniques (chronoamperometry, cyclic voltammetry, differential pulse voltammetry, square wave voltammetry, and potentiometry), each with different uses. Four applications demonstrate the analytical performance of the device: these involve the detection of (i) glucose in the blood for personal health, (ii) trace heavy metals (lead, cadmium, and zinc) in water for in-field environmental monitoring, (iii) sodium in urine for clinical analysis, and (iv) a malarial antigen (Plasmodium falciparum histidine-rich protein 2) for clinical research. (via BoingBoing)
panamax.io — containerized app creator with an open-source app marketplace hosted in GitHub. Panamax provides a friendly interface for users of Docker, Fleet & CoreOS. With Panamax, you can easily create, share and deploy any containerized app no matter how complex it might be.
Quid Analysis of Comments to FCC on Net Neutrality (NPR) — visualising the themes and volume of the comments. Interesting factoid: only half the comments were derived from templates (cf. 80% in submissions to some financial legislation).

Four short links: 11 August 2014

Startup Anthropology, Ends to Means, Permission to Test, and Distributed Systems Research

by Nat Torkington | @gnat | +Nat Torkington | August 11, 2014

Anthropology of Mid-Sized Startups — old but good post about the structures, norms, and dimensions of startup culture. Like a religion, a startup will care for its collective interest by defining certain things as sacred. A classic example is the company’s logo. This symbol is, quite literally, “set apart and forbidden” by brand guidelines, which often specify exactly how the logo must be presented and how far it should sit from the other elements on a page (thus separating the sacred from the profane).
What Leads To — I love the elegant mechanic of decomposing an end back to a means you can do right now. Lots more sophistication obviously possible, but the fact that it’s not just about “thumbs up this end!” or about actions divorced from intention, makes it a step ahead for social software.
Researching Link Rot (Pinboard) — graceful notification of a test, with a simple way to opt out.
The Space Between Theory and Practice in Distributed Systems (Marc Brooker) — If I went through everything I’ve read on distributed systems and arranged them on a spectrum from theory to practice, the two ends would be really well populated, but the middle would be disturbingly empty. Worse, changing to a graph of citation links would show a low density from theory to practice.

Four short links: 6 August 2014

Mesa Database, Thumbstoppers, Impressive Research, and Microsoft Development

by Nat Torkington | @gnat | +Nat Torkington | August 6, 2014

Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing (PDF) — paper by Googlers on the database holding G’s ad data. Trillions of rows, petabytes of data, point queries with 99th percentile latency in the hundreds of milliseconds and overall query throughput of trillions of rows fetched per day, continuous updates on the order of millions of rows updated per second, strong consistency and repeatable query results even if a query involves multiple datacenters, and no SPOF. (via Greg Linden)
Thumbstopping (Salon) — The prime goal of a Facebook ad campaign is to create an ad “so compelling that it would get people to stop scrolling through their news feeds,” reports the Times. This is known, in Facebook land, as a “thumbstopper.” And thus, the great promise of the digital revolution is realized: The best minds of our generation are obsessed with manipulating the movement of your thumb on a smartphone touch-screen.
om3d — pose a model based on its occurrence in a photo, then update the photo after rotating and re-rendering the model. Research is doing some sweet things these days—this comes hot on the heels of recovering sounds from high-speed video of things like chip bags.
Microsoft’s Development Practices (Ars Technica) — they get the devops religion but call it “combined engineering”. They get the idea of shared code bases, but call it “open source”. At least when they got the agile religion, they called it that. Check out the horror story of where they started: a two-year development process in which only about four months would be spent writing new code. Twice as long would be spent fixing that code. MSFT’s waterfall was the equivalent of American football, where there’s 11 minutes of actual play in the average 3h 12m game.

Four short links: 5 August 2014

Discussion Graph Tool, Superlinear Productivity, Go Concurrency, and R Map/Reduce Tools

by Nat Torkington | @gnat | +Nat Torkington | August 5, 2014

Discussion Graph Tool (Microsoft Research) — simplifies social media analysis by making it easy to extract high-level features and co-occurrence relationships from raw data.
Superlinear Productivity in Collective Group Actions (PLoS ONE) — study of open source projects shows small groups exhibit non-linear productivity increases by size, which drop off at larger sizes. We document a size effect in the strength and variability of the superlinear effect, with smaller groups exhibiting widely distributed superlinear exponents, some of them characterizing highly productive teams. In contrast, large groups tend to have a smaller superlinearity and less variability.
coop — cheat sheet of the most common concurrency program flows in Go.
Tessera — set of open source tools around Hadoop, R, and visualization.

Four short links: 4 August 2014

Web Spreadsheet, Correlated Novelty, A/B Ethics, and Replicated Data Structures

by Nat Torkington | @gnat | +Nat Torkington | August 4, 2014

EtherCalc — open source web-based spreadsheet.
Dynamics of Correlated Novelties (Nature) — paper on “the adjacent possible”. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya’s urn, predicts statistical laws for the rate at which novelties happen (Heaps’ law) and for the probability distribution on the space explored (Zipf’s law), as well as signatures of the process by which one novelty sets the stage for another. (via Steven Strogatz)
On The Media Interview with OKCupid CEO — relevant to the debate on ethics of A/B tests. I preferred this to Tim Carmody’s rant.
CRDTs as Alternative to APIs — when using CRDTs to tie your system together, you don’t need to resort to using impoverished representations that simply never come anywhere near the representational power of the data structures you use in your programs at runtime. See also this paper on Convergent and Commutative Replicated Data Types.
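
For a feel of what a convergent replicated data type looks like, here is a minimal sketch of a grow-only counter (G-Counter), one of the standard textbook CRDTs. It is not taken from the linked post; the class and method names are illustrative. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges can arrive in any order, or more than once, and replicas still converge.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica may only ever grow its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum over all replicas' slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # which is what lets replicas converge without coordination.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas update independently, then exchange state in both directions.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # merging the same state again changes nothing
print(a.value(), b.value())
```

Both replicas end at 5. The state that travels between them is the same dictionary the program uses at runtime, which is the point the post makes about not flattening data into an impoverished wire format.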