- MillWheel: Fault-Tolerant Stream Processing at Internet Scale — Google Research paper on the tech underlying the new cloud DataFlow tool. Watch the video. Yow.
- The Integer Overflow Bug That Went to Mars — long-standing (20 year old!) bug in a compression library prompts a wave of new releases. No word yet on whether NASA will upgrade the rover to avoid being pwned by Martian script kiddies. (update: I fell for a self-promoter. The Martians will need to find another attack vector. Huzzah!)
- epoch (github) — Fastly-produced open source general purpose real-time charting library for building beautiful, smooth, and high performance visualizations.
- Achieving Rapid Response Times in Large Online Services (YouTube) — Jeff Dean‘s keynote at Velocity. He wrote … a lot of things for this. And now he’s into deep learning ….
ENTRIES TAGGED "viz"
Google MillWheel, 20yo Bug, Fast Real-Time Visualizations, and Google's Speed King
- China’s $122BB Boom in Shadow Banking is Happening on Phones (Quartz) — Tencent’s recently launched online money market fund (MMF), Licai Tong, drew in 10 billion yuan ($1.7 billion) in just six days in the last week of January.
- The Weight of Rain — lovely talk about the thought processes behind coming up with a truly insightful visualisation.
- Data on Video Streaming Starting to Emerge (Giga Om) — M-Lab, which gathers broadband performance data and distributes that data to the FCC, has uncovered significant slowdowns in throughput on Comcast, Time Warner Cable and AT&T. Such slowdowns could be indicative of deliberate actions taken at interconnection points by ISPs.
Remote Working, Google Visualizations, Sensing Gamma Rays, and Cheap GPS For Your Arduino
- Making Remote Work — The reality of a remote workplace is that the connections are largely artificial constructs. People can be very, very isolated. A person’s default behavior when they go into a funk is to avoid seeking out interactions, which is effectively the same as actively withdrawing in a remote work environment. It takes a tremendous effort to get on video chats, use our text based communication tools, or even call someone during a dark time. Very good to see this addressed in a post about remote work.
- Google Big Picture Group — public output from the visualization research group at Google.
- Using CMOS Sensors in a Cellphone for Gamma Detection and Classification (Arxiv) — another sense in your pocket. The CMOS camera found in many cellphones is sensitive to ionized electrons. Gamma rays penetrate into the phone and produce ionized electrons that are then detected by the camera. Thermal noise and other noise needs to be removed on the phone, which requires an algorithm that has relatively low memory and computational requirements. The continuous high-delta algorithm described fits those requirements. (via Medium)
- Affordable Arduino-Compatible Centimeter-Level GPS Accuracy (IndieGogo) — for less than $20. (via DIY Drones)
Dinosaur Tries to Suckle, Dashboard Design, Massive Visualizations, Massive Machine Learning
- Behind the Scenes of a Dashboard Design — the design decisions that go into displaying complex info.
- Superconductor — a web framework for creating data visualizations that scale to real-time interactions with up to 1,000,000 data points. It compiles to WebCL, WebGL, and web workers. (via Ben Lorica)
- BIDMach: Large-scale Learning with Zero Memory Allocation (PDF) — GPU-accelerated machine learning. In this paper we describe a caching approach that allows code with complex matrix (graph) expressions at massive scale, i.e. multi-terabyte data, with zero memory allocation after the initial setup. (via Siah)
- SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
- madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
- Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
- Disguise Detection — using Raspberry Pi, Arduino, and Python.
HTML DRM, Visualizing Medical Sciences, Lifelong Learning, and Hardware Hackery
- What Tim Berners-Lee Doesn’t Know About HTML DRM (Guardian) — Cory Doctorow lays it out straight. HTML DRM is a bad idea, no two ways. The future of the Web is the future of the world, because everything we do today involves the net and everything we’ll do tomorrow will require it. Now it proposes to sell out that trust, on the grounds that Big Content will lock up its “content” in Flash if it doesn’t get a veto over Web-innovation. [...] The W3C has a duty to send the DRM-peddlers packing, just as the US courts did in the case of digital TV.
- Visualizing the Topical Structure of the Medical Sciences: A Self-Organizing Map Approach (PLOSone) — a high-resolution visualization of the medical knowledge domain using the self-organizing map (SOM) method, based on a corpus of over two million publications.
- What Teens Get About The Internet That Parents Don’t (The Atlantic) — the Internet has been a lifeline for self-directed learning and connection to peers. In our research, we found that parents more often than not have a negative view of the role of the Internet in learning, but young people almost always have a positive one. (via Clive Thompson)
- Portable C64 — beautiful piece of C64 hardware hacking to embed a screen and battery in it. (via Hackaday)
- RebelMouse — aggregates FB, Twitter, Instagram, G+ content w/Pinboard-like aesthetics. It’s like aggregators we’ve had since 2004, but in this Brave New World we have to authenticate to a blogging service to get our own public posts out in a machine-readable form. 2012: it’s like 2000 but now we have FOUR AOLs! We’ve traded paywalls for graywalls, but the walls are still there. (via Poynter)
- Data Visualization Course Wiki — wiki for Stanford course cs448b, covering visualization with examples and critiques.
- Peristaltic Pump — for your Arduino medical projects, a pump that doesn’t touch the liquid it moves so the liquid can stay sterile.