"distributed systems" entries

Four short links: 29 July 2015

Four short links: 29 July 2015

Mobile Medical Scanner, Amazon Hardware Showcase, Consistency Challenges, and Govt Alpha Geeks

  1. Cellphone-Based Hand-Held Microplate Reader for Point-of-Care Testing of Enzyme-Linked Immunosorbent Assayswe created a hand-held and cost-effective cellphone-based colorimetric microplate reader that implements a routine hospital test used to identify HIV and other conditions. (via RtoZ)
  2. Amazon Launchpad — a showcase for new hardware startups, who might well be worried about Amazon’s “watch what sells and sell a generic version of it” business model.
  3. Challenges to Adopting Stronger Consistency at Scale (PDF) — It is not obvious that a system that trades stronger consistency for increased latency or reduced availability would be a net benefit to people using Facebook, especially when compared against a weakly consistent system that resolves many inconsistencies with ad hoc mechanisms.
  4. The White House’s Alpha Geeks — Megan Smith for President. I realize now there’s two things we techies should do — one is go where there are lots of us, like MIT or Silicon Valley or whatever, because you can move really fast and do extraordinary things. The other is, go where you’re rare.It’s almost like you’re a frog in boiling water; you don’t really realize how un-diverse it is until you’re in a normal diverse American innovative community like the President’s team. And then you go back and you’re like, wow. You feel, “Man, this industry is so awesome and yet we’re missing all of this talent.”
Comment
Four short links: 21 July 2015

Four short links: 21 July 2015

Web Future, GCE vs Amazon, Scammy eBooks, and Container Clusters

  1. Web Design: The First 100 Years (Maciej Ceglowski) — There’s a William Gibson quote that Tim O’Reilly likes to repeat: “the future is here; it’s just not evenly distributed yet.” O’Reilly takes this to mean that if we surround ourselves with the right people, it can give us a sneak peek at coming attractions. I like to interpret this quote differently, as a call to action. Rather than waiting passively for technology to change the world, let’s see how much we can do with what we already have. Let’s reclaim the Web from technologists who tell us that the future they’ve imagined is inevitable, and that our role in it is as consumers.
  2. Comparing Cassandra Write Performance on Google Compute Engine and AWStl;dr – We achieved better Cassandra performance on GCE vs. Amazon, at close to half the cost. Also interesting for how they built the benchmark.
  3. The Scammy Underground World of Kindle eBooksThe biggest issue here isn’t that scammers are raking in cash from low-quality content; it’s that Amazon is allowing this to happen. Publisher brand value is the reliable expectation that buyers have of the book quality. Amazon’s publishing arm is spending the good brand value built by its distribution arm.
  4. Empire a 12-factor-compatible, Docker-based container cluster built on top of Amazon’s robust EC2 Container Service (ECS), complete with a full-featured command line interface. Open source.
Comment: 1
Four short links: 17 July 2015

Four short links: 17 July 2015

Smalltalky Web, Arduino Speech, Testing Distributed Systems, and Dataflow for FP

  1. Project Journal: Objects (Ian Bicking) — a view askew at the Web, inspired by Alan Kay’s History of Smalltalk.
  2. Speech Recognition for Arduino (Kickstarter) — for all your creepy toy hacking needs!
  3. Conductor (github) — a framework for testing distributed systems.
  4. Dataflow Syntax for Functional Programming? — two great tastes that will make your head hurt together!
Comment
Four short links: 6 July 2015

Four short links: 6 July 2015

DeepDream, In-Flight WiFi, Computer Vision in Preservation, and Testing Distributed Systems

  1. DeepDream — the software that’s been giving the Internet acid-free trips.
  2. In-Flight WiFi Business — numbers and context for why some airlines (JetBlue) have fast free in-flight wifi while others (Delta) have pricey slow in-flight wifi. Four years ago ViaSat-1 went into geostationary orbit, putting all other broadband satellites to shame with 140 Gbps of total capacity. This is the Ka-band satellite that JetBlue’s fleet connects to, and while the airline has to share that bandwidth with homes across of North America that subscribe to ViaSat’s Excede residential broadband service, it faces no shortage of capacity. That’s why JetBlue is able to deliver 10-15 Mbps speeds to its passengers.
  3. British Library Digitising Newspapers (The Guardian) — as well as photogrammetry methods used in the Great Parchment Book project, Terras and colleagues are exploring the potential of a host of techniques, including multispectral imaging (MSI). Inks, pencil marks, and paper all reflect, absorb, or emit particular wavelengths of light, ranging from the infrared end of the electromagnetic spectrum, through the visible region and into the UV. By taking photographs using different light sources and filters, it is possible to generate a suite of images. “We get back this stack of about 40 images of the [document] and then we can use image-processing to try to see what is in [some of them] and not others,” Terras explains.
  4. Testing a Distributed System (ACM) — This article discusses general strategies for testing distributed systems as well as specific strategies for testing distributed data storage systems.
Comment
Four short links: 30 June 2015

Four short links: 30 June 2015

Ductile Systems, Accessibility Testing, Load Testing, and CRAP Data

  1. Brittle SystemsMore than two decades ago at Sun, I was convinced that making systems ductile (the opposite of brittle) was the hardest and most important problem in system engineering.
  2. tota11y — accessibility testing toolkit from Khan.
  3. Locustan open source load testing tool.
  4. Impala: a Modern, Open-source SQL Engine for Hadoop (PDF) — CRAP, aka Create, Read, and APpend, as coined by an ex-colleague at VMware, Charles Fan (note the absence of update and delete capabilities). (via A Paper a Day)
Comment
Four short links: 9 June 2015

Four short links: 9 June 2015

Parallelising Without Coordination, AR/VR IxD, Medical Insecurity, and Online Privacy Lies

  1. The Declarative Imperative (Morning Paper) — on Dataflow. …a large class of recursive programs – all of basic Datalog – can be parallelized without any need for coordination. As a side note, this insight appears to have eluded the MapReduce community, where join is necessarily a blocking operator.
  2. Consensual Reality (Alistair Croll) — Among other things we discussed what Inbar calls his three rules for augmented reality design: 1. The content you see has to emerge from the real world and relate to it. 2. Should not distract you from the real world; must add to it. 3. Don’t use it when you don’t need it. If a film is better on the TV watch the TV.
  3. X-Rays Behaving BadlyAccording to the report, medical devices – in particular so-called picture archive and communications systems (PACS) radiologic imaging systems – are all but invisible to security monitoring systems and provide a ready platform for malware infections to lurk on hospital networks, and for malicious actors to launch attacks on other, high value IT assets. Among the revelations contained in the report: A malware infection at a TrapX customer site spread from a unmonitored PACS system to a key nurse’s workstation. The result: confidential hospital data was secreted off the network to a server hosted in Guiyang, China. Communications went out encrypted using port 443 (SSL) and were not detected by existing cyber defense software, so TrapX said it is unsure how many records may have been stolen.
  4. The Online Privacy Lie is Unraveling (TechCrunch) — The report authors’ argue it’s this sense of resignation that is resulting in data tradeoffs taking place — rather than consumers performing careful cost-benefit analysis to weigh up the pros and cons of giving up their data (as marketers try to claim). They also found that where consumers were most informed about marketing practices they were also more likely to be resigned to not being able to do anything to prevent their data being harvested. Something that didn’t make me regret clicking on a TechCrunch link.
Comment
Four short links: 27 May 2015

Four short links: 27 May 2015

Domo Arigato Mr Google, Distributed Graph Processing, Experiencing Ethics, and Deep Learning Robots

  1. Roboto — Google’s signature font is open sourced (Apache 2.0), including the toolchain to build it.
  2. Pregel: A System for Large Scale Graph Processing — a walk through a key 2010 paper from Google, on the distributed graph system that is the inspiration for Apache Giraph and which sits under PageRank.
  3. How to Turn a Liberal Hipster into a Global Capitalist (The Guardian) — In Zoe Svendsen’s play “World Factory at the Young Vic,” the audience becomes the cast. Sixteen teams sit around factory desks playing out a carefully constructed game that requires you to run a clothing factory in China. How to deal with a troublemaker? How to dupe the buyers from ethical retail brands? What to do about the ever-present problem of clients that do not pay? […] And because the theatre captures data on every choice by every team, for every performance, I know we were not alone. The aggregated flowchart reveals that every audience, on every night, veers toward money and away from ethics. I’m a firm believer that games can give you visceral experience, not merely intellectual knowledge, of an activity. Interesting to see it applied so effectively to business.
  4. End to End Training of Deep Visuomotor Policies (PDF) — paper on using deep learning to teach robots how to manipulate objects, by example.
Comment
Four short links: 15 May 2015

Four short links: 15 May 2015

Army Cloud, Google Curriculum, Immutable Infrastructure, and Task Queues

  1. Army Cloud Computing Strategy (PDF) — aka: “what we hope to do without having done, to use what we’re doing to them.”
  2. Guide to Technical Development (Google) — This guide is a suggested path for university students to develop their technical skills academically and non-academically through self-paced, hands-on learning.
  3. Immutable Infrastructure is the Future (Michael DeHaan) — The future of configuration management systems is in deploying cloud infrastructure that will later run immutable systems via an API level.
  4. machineryan asynchronous task queue/job queue based on distributed message passing.
Comment
Four short links: 14 May 2015

Four short links: 14 May 2015

Human-Machine Cooperation, Concurrent Systems Books, AI Future, and Gesture UI

  1. Ghosts in the Machines (Courtney Nash) — People are neither masters of machines, nor subservient to their machine-learning outcomes — we cannot, and should not, separate the two. We are actors, together, in a very complex system. David Woods calls this “joint cognitive systems.”
  2. TLA+ (Leslie Lamport) — two tutorials: “Principles of Concurrent Computing” and “Specification of Concurrent Systems.” Ironically, I see people grizzling that the book on distributed systems hasn’t been linearised. I wonder if you can partition it into the two tutorials and still have full availability…
  3. Deep Learning vs Probabilistic vs LogicAs of 2015, I pity the fool who prefers Modus Ponens over Gradient Descent.
  4. Touché (Disney Research) — measur[es] capacitive response of object and human at multiple frequencies, a technique that we called Swept Frequency Capacitive Sensing. The signal travels through different paths depending on its frequency, capturing the posture of human hand and body as well as other properties of the context. The resulted data is classified using machine learning algorithms to identify gestures that are then used to trigger desired responses of the user interface.
Comment
Four short links: 6 May 2015

Four short links: 6 May 2015

Self-Driving Cars, Cloud BigTable, Define "Uptime," and Continuous Delivery Architectures

  1. Andrew Ng (Wired) — I think self-driving cars are a little further out than most people think. There’s a debate about which one of two universes we’re in. In the first universe it’s an incremental path to self-driving cars, meaning you have cruise control, adaptive cruise control, then self-driving cars only on the highways, and you keep adding stuff until 20 years from now you have a self-driving car. In universe two you have one organization, maybe Carnegie Mellon or Google, that invents a self-driving car and bam! You have self-driving cars. It wasn’t available Tuesday but it’s on sale on Wednesday. I’m in universe one. I think there’s a lot of confusion about how easy it is to do self-driving cars. There’s a big difference between being able to drive a thousand miles, versus being able to drive anywhere. And it turns out that machine-learning technology is good at pushing performance from 90 to 99 percent accuracy. But it’s challenging to get to four nines (99.99 percent). I’ll give you this: we’re firmly on our way to being safer than a drunk driver.
  2. Google Cloud BigTable — Google’s BigTable, with Apache HBase API, single-digit millisecond latency, and “fully managed”. G are hell-bent on catching up with Amazon and Microsoft at this cloud serving thing.
  3. Call Me Maybe: AerospikeWe’re setting a timeout of 500ms here, and operations still time out every time a partition between nodes occurs. In these tests we aren’t interfering with client-server traffic at all. Aerospike may claim “100% uptime”, but this is only meaningful with respect to particular latency bounds. Given Aerospike claims millisecond-scale latencies, you may want to reconsider whether you consider this “uptime”.
  4. 31 Continuous Delivery Architectures (Slideshare) — from a vendor, so one name crops up repeatedly (other than “Jenkins”), but it’s still good devops voyeurism/envy.
Comment