Dragon: A Distributed Graph Query Engine — Facebook describes its internal graph query engine. [T]he layout of these indices on storage is optimized based on a deeper understanding of query patterns (e.g., many queries are about friends), as opposed to accepting random sharding, which is common in these systems. Wisely, the system is tailored to the use cases they have and the patterns they see in access.
Almost Everyone Is Doing the API Economy Wrong (TechCrunch) — Redux: your API should help you make money when the API customer makes money, and you should set clear expectations for what’s acceptable and what’s not. But every developer should be forced to write 100 times: “if you build on a platform you don’t own, you’re building on a potential and probable future competitor.”
Traditional Economics Failed, Here’s a Blueprint — runs through the shifts happening in our thinking about the world and ourselves (simple to complex, independent to interdependent, rational calculator to irrational approximators, etc) and concludes: True self-interest is mutual interest. The best way to improve your likelihood of surviving and thriving is to make sure those around you survive and thrive. See above API note.
Blitzscaling (HBR) — as you move from village to city, functions are beginning to be differentiated; you’re really multithreading. I could write a thesis on the CAP theorem for business. And I have definitely worked for companies that have a “share nothing” approach to solving their threading issues.
HCI Pioneers — Ben Shneiderman’s photo collection, acknowledging pioneers in the field. (via CCC Blog)
A Burglar’s Guide to the City (BLDGBLOG) — For the past several years, I’ve been writing a book about the relationship between burglary and architecture. Burglary, as it happens, requires architecture: it is a spatial crime. Without buildings, burglary, in its current legal form, could not exist. Committing it requires an inside and an outside; it’s impossible without boundaries, thresholds, windows, and walls. In fact, one needn’t steal anything at all to be a burglar. In a sense, as a crime, it is part of the built environment; the design of any structure always implies a way to break into it. Connection to computer security left as exercise to the reader.
Trial by Machine (Roth) — The current landscape of mechanized proof, liability, and punishment suffers from predictable but underscrutinized automation pathologies: hidden subjectivities and errors in “black box” processes; distorted decision-making through oversimplified — and often dramatically inaccurate — proxies for blameworthiness; the compromise of values protected by human safety valves, such as dignity, equity, and mercy; and even too little mechanization where machines might be a powerful debiasing tool but where little political incentive exists for its development or deployment. […] The article ultimately proposes a systems approach – “trial by cyborg” – that safeguards against automation pathologies while interrogating conspicuous absences in mechanization through “equitable surveillance” and other means. (via Marginal Revolution)
Distributed Ledger Technology: Blackett Review (gov.uk) — Distributed ledgers can provide new ways of assuring ownership and provenance for goods and intellectual property. For example, Everledger provides a distributed ledger that assures the identity of diamonds, from being mined and cut to being sold and insured. In a market with a relatively high level of paper forgery, it makes attribution more efficient, and has the potential to reduce fraud and prevent “blood diamonds” from entering the market. Report includes recommendations for policy makers. (via Dan Hill)
The most interactive tasks that people do with data are essentially data wrangling. You’re changing the form of the data, you’re changing the content of the data, and at the same time you’re trying to evaluate the quality of the data and see if you’re making it the way you want it. … It’s really actually the most immersive interaction that people do with data and it’s very interesting.
Shmoocon 2016 Videos (Internet Archive) — videos of the talks from an astonishingly good security conference.
TipTalk — Samsung watchstrap that is the smart device … put your finger in your ear to hear the call. You had me at put my finger in my ear. (via WaPo)
Ecorithms — Leslie Valiant at Harvard broadened the concept of an algorithm into an “ecorithm,” which is a learning algorithm that “runs” on any system capable of interacting with its physical environment. Algorithms apply to computational systems, but ecorithms can apply to biological organisms or entire species. The concept draws a computational equivalence between the way that individuals learn and the way that entire ecosystems evolve. In both cases, ecorithms describe adaptive behavior in a mechanistic way.
Dataflow/Beam vs Spark (Google Cloud) — To highlight the distinguishing features of the Dataflow model, we’ll be comparing code side-by-side with Spark code snippets. Spark has had a huge and positive impact on the industry thanks to doing a number of things much better than other systems had done before. But Dataflow holds distinct advantages in programming model flexibility, power, and expressiveness, particularly in the out-of-order processing and real-time session management arenas.
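For readers who haven't met session windows: the idea is to group a key's events into bursts separated by quiet gaps, even when the events show up out of order. A minimal sketch in plain Python follows. It uses neither the Dataflow nor the Spark API, and the `sessionize` name and the "strictly less than the gap" rule are assumptions made for illustration.

```python
from typing import List


def sessionize(timestamps: List[int], gap: int) -> List[List[int]]:
    """Group event timestamps into sessions separated by at least `gap`.

    Sorting by event time first means arrival order does not matter,
    which is the out-of-order property the quoted post highlights.
    """
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        # Extend the current session if this event is close enough to
        # the previous one; otherwise start a new session.
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


# Events arrive out of order but are grouped by event time.
print(sessionize([10, 1, 3, 30, 12], gap=5))  # [[1, 3], [10, 12], [30]]
```

A real streaming system cannot sort the whole input up front, which is where watermarks and window merging come in; this sketch only shows the grouping rule.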
Experience with Rules-Based Programming for Distributed Concurrent Fault-Tolerant Code (A Paper a Day) — To demonstrate applicability outside of the RAMCloud system, the team also re-wrote the Hadoop Map-Reduce job scheduler (which uses a traditional event-based state machine approach) using rules. The original code has three state machines containing 34 states with 163 different transitions, about 2,250 lines of code in total. The rules-based re-implementation required 19 rules in 3 tasks with a total of 117 lines of code and comments. Rules-based systems are powerful and underused.
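To give a feel for the style, here is a minimal sketch of a rules engine: each rule is a condition plus an action over shared state, and the engine keeps applying rules until none fires. This is not the RAMCloud or Hadoop code. The `Rule` class, `run_rules`, and the toy replication rules are all hypothetical, and acknowledgements that would really arrive over RPC are simulated by a rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Rule:
    name: str
    condition: Callable[[State], bool]  # when may this rule fire?
    action: Callable[[State], None]     # what does it change?


def run_rules(state: State, rules: List[Rule], max_passes: int = 100) -> List[str]:
    """Apply rules repeatedly until a full pass fires nothing."""
    fired: List[str] = []
    for _ in range(max_passes):
        progressed = False
        for rule in rules:
            if rule.condition(state):
                rule.action(state)
                fired.append(rule.name)
                progressed = True
        if not progressed:
            return fired
    raise RuntimeError("rules did not converge")


# Hypothetical task: start `needed` replicas, wait for acks, then finish.
REPLICATION_RULES = [
    Rule("start_replica",
         lambda s: s["started"] < s["needed"],
         lambda s: s.update(started=s["started"] + 1)),
    Rule("record_ack",  # simulated; a real ack would come from the network
         lambda s: s["acked"] < s["started"],
         lambda s: s.update(acked=s["acked"] + 1)),
    Rule("finish",
         lambda s: s["acked"] == s["needed"] and not s["done"],
         lambda s: s.update(done=True)),
]

state = {"needed": 2, "started": 0, "acked": 0, "done": False}
print(run_rules(state, REPLICATION_RULES), state["done"])
```

There is no explicit state machine here: the rules look only at the current state, so a crash that resets `acked` or `started` is handled by the same rules firing again.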
OpenFace — open source face recognition software using deep neural networks.
Berkeley’s Intro-to-AI Materials — We designed these projects with three goals in mind. The projects allow students to visualize the results of the techniques they implement. They also contain code examples and clear directions, but do not force students to wade through undue amounts of scaffolding. Finally, Pac-Man provides a challenging problem environment that demands creative solutions; real-world AI problems are challenging, and Pac-Man is, too.
How to Hire (Henry Ward) — this isn’t holy writ for everyone, but the clear way in which he lays out how he thinks about hiring should be a model to all managers, even those who disagree with his specific recommendations.
From the Ground Up: Reasoning About Distributed Systems in the Real World (Tyler Treat) — When we try to provide semantics like guaranteed, exactly-once, and ordered message delivery, we usually end up with something that’s over-engineered, difficult to deploy and operate, fragile, and slow. What is the upside to all of this? Something that makes your life easier as a developer when things go perfectly well, but the reality is things don’t go perfectly well most of the time. Instead, you end up getting paged at 1 a.m. trying to figure out why RabbitMQ told your monitoring everything is awesome while proceeding to take a dump in your front yard. An approachable argument for shifting some consistency checks to application layer so the infrastructure can be simpler.
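One common way to move that check into the application is an idempotent consumer: accept at-least-once, unordered delivery from the infrastructure and deduplicate by message ID yourself. A minimal sketch, assuming every message carries a unique ID and that the operation commutes; the `IdempotentConsumer` class is hypothetical and keeps its seen-ID set in memory, where a real service would persist it.

```python
class IdempotentConsumer:
    """Applies each message at most once, however often it is delivered."""

    def __init__(self) -> None:
        self.seen_ids = set()  # IDs of messages already applied
        self.balance = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Apply a credit; return False if it was a duplicate delivery."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.balance += amount
        return True


consumer = IdempotentConsumer()
# The broker redelivers "m1" and delivers "m3" before "m2".
for message_id, amount in [("m1", 10), ("m3", 5), ("m1", 10), ("m2", 20)]:
    consumer.handle(message_id, amount)
print(consumer.balance)  # 35
```

Ordering is sidestepped rather than solved: addition commutes, so the arrival order of the credits does not change the result.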
3D Printed Ceramics to 1700°C (Ars Technica) — The key step used in the new work is to replace the standard polymers used to create ceramics with a chemical that polymerizes when exposed to UV light. (These can have a variety of chemistries; the authors list thiol, vinyl, acrylate, methacrylate, and epoxy groups.) This means they’re able to be polymerized using a fairly standard 3D printer setup. In fact, the paper lists the model number of the version the authors bought from a different company.
Guesstimate — spreadsheet for things that aren’t certain.
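The idea behind a spreadsheet for uncertain things can be sketched with plain Monte Carlo sampling: each cell holds a range rather than a number, and a formula is evaluated many times with fresh draws. This is a rough illustration, not Guesstimate's implementation. The uniform ranges, the `monthly_revenue` model, and its numbers are all made up.

```python
import random
import statistics
from typing import Callable, Dict, List


def simulate(model: Callable[[random.Random], float],
             samples: int = 5000, seed: int = 0) -> List[float]:
    """Evaluate the model `samples` times and return the sorted results."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    return sorted(model(rng) for _ in range(samples))


def summarize(results: List[float]) -> Dict[str, float]:
    """Report a rough 90% interval and the median of sorted results."""
    n = len(results)
    return {
        "p5": results[int(0.05 * n)],
        "median": statistics.median(results),
        "p95": results[int(0.95 * n)],
    }


def monthly_revenue(rng: random.Random) -> float:
    # Each "cell" is a range, not a point estimate (hypothetical numbers).
    visitors = rng.uniform(8000, 12000)
    conversion = rng.uniform(0.01, 0.03)
    price = rng.uniform(20, 40)
    return visitors * conversion * price


print(summarize(simulate(monthly_revenue)))
```

The output is a distribution instead of a single figure, which is the point: the answer to "how much revenue?" becomes "probably between these two numbers."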
Distributed Reactive Programming (A Paper a Day) — this week’s focus on reactive programming has been eye-opening for me. I find the implementation details less interesting than the simple notion that we can define different consistency models for reactive programs and reason about them.
Attacking HTTP/2 Implementations — Our talk focused on threats, attack vectors, and vulnerabilities found during the course of our research. Two Firefox, two Apache Traffic Server (ATS), and four Node-http2 vulnerabilities will be discussed alongside the release of the first public HTTP/2 fuzzer. We showed how these bugs were found, their root cause, why they occur, and how to trigger them.
The Autonomous Winter is Coming — The future of any given manufacturer will be determined by how successfully they manage their brands in a market split between Mobility customers and Driving customers.
Toxic Workers (PDF) — In comparing the two costs, even if a firm could replace an average worker with one who performs in the top 1%, it would still be better off by replacing a toxic worker with an average worker by more than two-to-one. Harvard Business School research. (via Fortune)
Replacing Sawzall (Google) — At Google, most Sawzall analysis has been replaced by Go […] we’ve developed a set of Go libraries that we call Lingo (for Logs in Go). Lingo includes a table aggregation library that brings the powerful features of Sawzall aggregation tables to Go, using reflection to support user-defined types for table keys and values. It also provides default behavior for setting up and running a MapReduce that reads data from the logs proxy. The result is that Lingo analysis code is often as concise and simple as (and sometimes simpler than) the Sawzall equivalent.
Hospital Hacking (Bloomberg) — interesting for both lax regulation (“The FDA seems to literally be waiting for someone to be killed before they can say, ‘OK, yeah, this is something we need to worry about,’ ” Rios says.) and the extent of the problem (Last fall, analysts with TrapX Security, a firm based in San Mateo, Calif., began installing software in more than 60 hospitals to trace medical device hacks. […] After six months, TrapX concluded that all of the hospitals contained medical devices that had been infected by malware.). It may take a Vice President’s defibrillator being hacked for things to change. Or would anybody notice?
Anti-Caching (PDF) — paper outlining a clever reframing of the database strategy of keeping frequently accessed things in-memory, namely pushing to disk the things that won’t be accessed … aka, “anti-caching.”
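A toy sketch of the reframing: memory is the primary home for data, and when it fills up, the coldest entries are pushed out to disk and pulled back on access. This is not the paper's design. The `AntiCacheStore` class is hypothetical, eviction is simple least-recently-used, and a dictionary stands in for the disk.

```python
from collections import OrderedDict


class AntiCacheStore:
    """In-memory store that evicts its coldest entries to 'disk'."""

    def __init__(self, max_in_memory: int) -> None:
        self.max_in_memory = max_in_memory
        self.memory = OrderedDict()  # hot entries, least recently used first
        self.disk = {}               # stand-in for the on-disk anti-cache

    def put(self, key, value) -> None:
        self.disk.pop(key, None)     # an entry lives in exactly one place
        self.memory[key] = value
        self.memory.move_to_end(key)
        self._evict_cold()

    def get(self, key):
        if key not in self.memory:
            # Un-evict: pull the entry back from disk (KeyError if unknown).
            self.memory[key] = self.disk.pop(key)
        self.memory.move_to_end(key)
        value = self.memory[key]
        self._evict_cold()
        return value

    def _evict_cold(self) -> None:
        # Push least recently used entries to disk until memory fits.
        while len(self.memory) > self.max_in_memory:
            cold_key, cold_value = self.memory.popitem(last=False)
            self.disk[cold_key] = cold_value


store = AntiCacheStore(max_in_memory=2)
for key in ["a", "b", "c"]:
    store.put(key, key.upper())
print(list(store.memory), list(store.disk))  # ['b', 'c'] ['a']
```

Unlike a cache in front of a disk-resident database, nothing is stored twice: an entry is either in memory or on disk, never both.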
The Rating Game (Verge) — Until companies release ratings data, we can’t know for certain whether this is true, but a study of Airbnb users found that black hosts get less money for similar listings than white hosts, and another study found that white taxi drivers get higher tips than black ones. There’s no reason such biases wouldn’t carry over to ratings.
Singa — Apache distributed deep learning platform turns 1.0.