Pyro (Usenix) — This paper presents Pyro, a spatial-temporal big data storage system tailored for high-resolution geometry queries and dynamic hotspots. Pyro understands geometries internally, which allows range scans of a geometry query to be aggregately optimized. Moreover, Pyro employs a novel replica placement policy in the DFS layer that allows Pyro to split a region without losing data locality benefits.
Inside Mark Zuckerberg’s Bold Plan for Facebook (FastCompany) — “One of our goals for the next five to 10 years,” Zuckerberg tells me, “is to basically get better than human level at all of the primary human senses: vision, hearing, language, general cognition.”
TensorFlow — Google released, as open source, their distributed machine learning system. The DataFlow programming framework is sweet, and the documentation is gorgeous. AMAZINGLY high-quality, sets the bar for any project. This may be 2015’s most important software release.
TensorFlow White Paper (PDF) — Compared to DistBelief [G’s first scalable distributed inference and training system], TensorFlow’s programming model is more flexible, its performance is significantly better, and it supports training and using a broader range of models on a wider variety of heterogeneous hardware platforms.
Neural Networks With Few Multiplications — paper with a method to eliminate most of the time-consuming floating point multiplications needed to update the intermediate virtual neurons as they learn. Speed has been one of the bugbears of deep neural networks.
Cybersecurity as RealPolitik — Dan Geer’s excellent talk from 2014 BlackHat. When younger people ask my advice on what they should do or study to make a career in cyber security, I can only advise specialization. Those of us who were in the game early enough and who have managed to retain an over-arching generalist knowledge can’t be replaced very easily because while absorbing most new information most of the time may have been possible when we began practice, no person starting from scratch can do that now. Serial specialization is now all that can be done in any practical way. Just looking at the Black Hat program will confirm that being really good at any one of the many topics presented here all but requires shutting out the demands of being good at any others.
The Wild Wild East (The Economist) — Fung Retailing Limited, a related firm, has over 3,000 outlets, a third of them in China. Victor Fung, its honorary chairman, sees the era of mass production giving way to one of mass customization. Markets are fragmenting and smartphones are empowering consumers to get “directly involved in what they buy, where it is made and how they buy it.” Zhao Xiande of CEIBS in Shanghai points to Red Collar, a firm that used simply to make and export garments. Now it lets customers the world over design their own shirts online and makes them to order. Another outfit, Home Koo, offers custom-built furniture online.
Motivation for a Monolithic Codebase (YouTube) — interesting talk about Google’s codebase, the first time I know of that Google’s strategy for source code management was discussed in public.
China Extracting Pledge of Compliance from US Firms (NY Times) — The letter also asks the American companies to ensure their products are “secure and controllable,” a catchphrase that industry groups said could be used to force companies to build so-called back doors — which allow third-party access to systems — provide encryption keys or even hand over source code.
Toyota’s Robot Car Plans (IEEE Spectrum) — Toyota hired Gill Pratt, the former head of DARPA’s Robotics Challenge. Pratt explained that a U.S. $50 million R&D collaboration with MIT and Stanford is just the beginning of a large and ambitious program whose goal is developing intelligent vehicles that can make roads safer and robot helpers that can improve people’s lives at home.
Denver Broncos Testing In-Game Analytics — their newly hired director of analytics, Mitch Tanney, is working with head coach Gary Kubiak. With Tanney nearby, Kubiak can receive a quick report on the statistical probabilities of almost any situation. Say that you have fourth-and-3 from the opponent’s 45-yard-line with four minutes to go. Do the large-sample-size percentages make the risk-reward ratio acceptable enough to go for it? Tanney’s analytics can provide insight to aid Kubiak’s decision-making. (via Flowing Data)
Visual Review (GitHub) — Apache-licensed productive and human-friendly workflow for testing and reviewing your Web application’s layout for any regressions.
Large-scale Cluster Management at Google with Borg — Google’s Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters, each with up to tens of thousands of machines. […] We present a summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it.
Georgia Sues Carl Malamud (TechDirt) — for copyright infringement… for publishing an official annotated copy of the state’s laws. […] the state points directly to the annotated version as the official laws of the state.
Andrew Ng (Wired) — I think self-driving cars are a little further out than most people think. There’s a debate about which one of two universes we’re in. In the first universe it’s an incremental path to self-driving cars, meaning you have cruise control, adaptive cruise control, then self-driving cars only on the highways, and you keep adding stuff until 20 years from now you have a self-driving car. In universe two you have one organization, maybe Carnegie Mellon or Google, that invents a self-driving car and bam! You have self-driving cars. It wasn’t available Tuesday but it’s on sale on Wednesday. I’m in universe one. I think there’s a lot of confusion about how easy it is to do self-driving cars. There’s a big difference between being able to drive a thousand miles, versus being able to drive anywhere. And it turns out that machine-learning technology is good at pushing performance from 90 to 99 percent accuracy. But it’s challenging to get to four nines (99.99 percent). I’ll give you this: we’re firmly on our way to being safer than a drunk driver.
Google Cloud BigTable — Google’s BigTable, with Apache HBase API, single-digit millisecond latency, and “fully managed”. G are hell-bent on catching up with Amazon and Microsoft at this cloud serving thing.
Call Me Maybe: Aerospike — We’re setting a timeout of 500ms here, and operations still time out every time a partition between nodes occurs. In these tests we aren’t interfering with client-server traffic at all. Aerospike may claim “100% uptime”, but this is only meaningful with respect to particular latency bounds. Given Aerospike claims millisecond-scale latencies, you may want to reconsider whether you consider this “uptime”.
Decoding Jeff Jonas (National Geographic) — “He thinks in three—no, four dimensions,” Nathan says. “He has a data warehouse in his head.” And that’s where the work takes place—in his head. Not on paper. Not on a computer. He resorts to paper only to work the details out. When asked about his thought process, Jonas reaches for words, then says: “It’s like a Rubik’s Cube. It all clicks into place.” “The solution,” he says, is “simply there to find.” Jeff’s a genius and has his own language for explaining what he does. This quote goes a long way to explaining it.
How Apple Uses Mesos for Siri — great to see not only some details of the tooling that Apple built, but also their acknowledgement of the open source foundations and ongoing engagement with those open source communities. There have been times in the past when Apple felt like a parasite on the commons rather than a participant.
Cheaper Bandwidth or Bust: How Google Saved YouTube (ArsTechnica) — Remember YouTube’s $2 million-a-month bandwidth bill before the Google acquisition? While it wasn’t an overnight transition, apply Google’s data center expertise, and this cost drops to about $666,000 a month.
AWS Business Numbers — Amazon Web Services generated $5.2 billion over the past four quarters, and almost $700 million in operating income. During the first quarter of 2015, AWS sales reached $1.6 billion, up 49% year-over-year, and roughly 7% of Amazon’s overall sales.
The Web’s Grain (Frank Chimero) — What would happen if we stopped treating the web like a blank canvas to paint on, and instead like a material to build with?
Bruce Sterling on Convergence of Humans and Machines — I like to use the terms “cognition” and “computation”. Cognition is something that happens in brains, physical, biological brains. Computation is a thing that happens with software strings on electronic tracks that are inscribed out of silicon and put on fibre board. They are not the same thing, and saying that makes the same mistake as in earlier times, when people said that human thought was like a steam engine.
Smart Pocket Watch — I love to see people trying different design experiences. This is beautiful. And built on Firefox OS!
Knowledge-Based Trust (PDF) — Google research paper on how to assess factual accuracy of web page content. It was bad enough when Google incentivised people to make content-free pages. Next there’ll be a reward for scamming bogus facts into Google’s facts database.