- Behind the Scenes of a Dashboard Design — the design decisions that go into displaying complex info.
- Superconductor — a web framework for creating data visualizations that scale to real-time interactions with up to 1,000,000 data points. It compiles to WebCL, WebGL, and web workers. (via Ben Lorica)
- BIDMach: Large-scale Learning with Zero Memory Allocation (PDF) — GPU-accelerated machine learning. In this paper we describe a caching approach that allows code with complex matrix (graph) expressions at massive scale, i.e. multi-terabyte data, with zero memory allocation after the initial setup. (via Siah)
ENTRIES TAGGED "open source"
Dinosaur Tries to Suckle, Dashboard Design, Massive Visualizations, Massive Machine Learning
Inside the Nest Protect, Log Structures, Predictions, and In-Memory Data Cubes
- Nest Protect Teardown (Sparkfun) — initial teardown of another piece of domestic industrial Internet.
- Logs — The distributed log can be seen as the data structure which models the problem of consensus. Not kidding when he calls it “real-time data’s unifying abstraction”.
- Mining the Web to Predict Future Events (PDF) — Mining 22 years of news stories to predict future events. (via Ben Lorica)
- Nanocubes — a fast datastructure for in-memory data cubes developed at the Information Visualization department at AT&T Labs – Research. Nanocubes can be used to explore datasets with billions of elements at interactive rates in a web browser, and in some cases it uses sufficiently little memory that you can run a nanocube in a modern-day laptop. (via Ben Lorica)
Netflix Culture, Science Longreads, Open Source SPDY, and Internet of Invisible Buttons
- Inside Netflix’s HR (HBR) — Which idea in the culture deck was the hardest sell with employees? “Adequate performance gets a generous severance package.” It’s a pretty blunt statement of our hunger for excellence. They talk about how those conversations play out in practice.
- Top Science Longreads for 2013 (Ed Yong) — for your Christmas reading.
- CocoaSPDY — open source library for SPDY (fast HTTP replacement, supported in Chrome) for iOS and OS X.
- The Internet of Things Will Replace the Web — invisible buttons loaded with anticipatory actions keyed from mined sensor data. And we’ll complain it’s slow and doesn’t know that I don’t like The Beatles before my coffee and who wrote this crap anyway?
Bluetooth LE, Keyboard Design, Dataset API, and State Machines
- iBeacons — Bluetooth LE enabling tighter coupling of physical world with digital. I’m enamoured with the interaction possibilities: The latest Apple TV software brought a fantastically clever workaround. You just tap your iPhone to the Apple TV itself, and it passes your Wi-Fi and iTunes credentials over and sets everything up instantaneously.
- Better and Better Keyboards (Jesse Vincent) — It suffered from the same problem as every other 3D-printed keyboard I’d made to date – When I showed it to someone, they got really excited about the fact that I had a 3D printer. In contrast, whenever I showed someone one of the layered acrylic prototype keyboards I’d built, they got excited about the keyboard.
- Bamboo.io — open source modular web service for dataset storage and retrieval.
A review of my discussion with Free Software Foundation's Zak Rogoff.
Flexible Data, Google's Bottery, GPU Assist Deep Learning, and Open Sourcing
- Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating technologies needed to build a mobile, dexterous robot. Mr. Rubin said he was pursuing additional acquisitions. Rundown of those seven companies.
- Hebel (Github) — GPU-Accelerated Deep Learning Library in Python.
- What We Learned Open Sourcing — my eye was caught by the way they offered APIs to closed source code, found and solved performance problems, then open sourced the fixed code.
Surveillance Demarcation, NYT Data Scientist, 2D Dart, and Bayesian Database
- Reform Government Surveillance — hard not to view this as a demarcation dispute. “Ruthlessly collecting every detail of online behaviour is something we do clandestinely for advertising purposes, it shouldn’t be corrupted because of your obsession over national security!”
- Brian Abelson — Data Scientist at the New York Times, blogging what he finds. He tackles questions like what makes a news app “successful” and how might we measure it. Found via this engaging interview at the quease-makingly named Content Strategist.
- StageXL — Flash-like 2D package for Dart.
- BayesDB — lets users query the probable implications of their data as easily as a SQL database lets them query the data itself. Using the built-in Bayesian Query Language (BQL), users with no statistics training can solve basic data science problems, such as detecting predictive relationships between variables, inferring missing values, simulating probable observations, and identifying statistically similar database entries. Open source.
Zombie Drones, Algebra Through Code, Data Toolkit, and Crowdsourcing Antibiotic Discovery
- Skyjack — drone that takes over other drones. Welcome to the Malware of Things.
- Bootstrap World — a curricular module for students ages 12-16, which teaches algebraic and geometric concepts through computer programming. (via Esther Wojicki)
- Harvest — open source BSD-licensed toolkit for building web applications for integrating, discovering, and reporting data. Designed for biomedical data first. (via Mozilla Science Lab)
- Project ILIAD — crowdsourced antibiotic discovery.
- SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
- madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
- Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
- Disguise Detection — using Raspberry Pi, Arduino, and Python.