"visualization" entries

Use data or be data

Trina Chiasson argues that data has arrived at the same threshold as coding: code or be coded; learn to use data or be data.

Trina Chiasson

Arguments from all sides have surrounded the question of whether or not everyone should learn to code. Trina Chiasson, co-founder and CEO of Infoactive, says learning to code changed her life for the better. “These days I don’t spend a lot of time writing code,” she says, “but it’s incredibly helpful for me to be able to communicate with our engineers and communicate with other people in the industry.”

Though helpful for her personally, she admits that it takes quite a lot of time and commitment to learn to code to any level of proficiency, and that it might not be the best use of time for everyone. What should people commit time to learn? How to use data. Read more…

Four short links: 15 September 2014

Weird Machines, Libraries May Scan, Causal Effects, and Crappy Dashboards

  1. The Care and Feeding of Weird Machines Found in Executable Metadata (YouTube) — talk from 29th Chaos Communication Congress, on tricking the ELF linker/loader into arbitrary computation from the metadata supplied. Yes, there’s a brainfuck compiler that turns code into metadata which is then, through a supernatural mix of pixies, steam engines, and binary, executed. This will make your brain leak. Weird machines are everywhere.
  2. European Libraries May Digitise Books Without Permission — “The right of libraries to communicate, by dedicated terminals, the works they hold in their collections would risk being rendered largely meaningless, or indeed ineffective, if they did not have an ancillary right to digitize the works in question,” the court said. Even if the rights holder offers a library the possibility of licensing his works on appropriate terms, the library can use the exception to publish works on electronic terminals, the court ruled. “Otherwise, the library could not realize its core mission or promote the public interest in promoting research and private study,” it said.
  3. CausalImpact (GitHub) — Google’s R package for estimating the causal effect of a designed intervention on a time series. (via Google Open Source Blog)
  4. Laws of Crappy Dashboards — (caution, NSFW language … “crappy” is my paraphrase) so true. Not talking to users will result in a [crappy] dashboard. You don’t know if the dashboard is going to be useful. But you don’t talk to the users to figure it out. Or you just show it to them for a minute (with someone else’s data), never giving them a chance to figure out what the hell they could do with it if you gave it to them.

Building pipelines to facilitate data analysis

A new operator from the magrittr package makes it easier to use R for data analysis.

In every data analysis, you have to string together many tools. You need tools for data wrangling, visualisation, and modelling to understand what’s going on in your data. To use these tools effectively, you need to be able to easily flow from one tool to the next, focusing on asking and answering questions of the data, not struggling to jam the output from one function into the format needed for the next. Wouldn’t it be nice if the world worked this way! I spend a lot of my time thinking about this problem, and how to make the process of data analysis as fast, effective, and expressive as possible. Today, I want to show you a new technique that I’m particularly excited about.

R, at its heart, is a functional programming language: you do data analysis in R by composing functions. However, the problem with function composition is that a lot of it makes for hard-to-read code. For example, here’s some R code that wrangles flight delay data from New York City in 2013. What does it do? Read more…
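
As a rough illustration of the readability problem described above (written in Python rather than R, and not the magrittr operator itself), the sketch below runs the same three steps first as nested calls and then as a left-to-right pipeline. The flight records, the step functions, and the `pipe` helper are all made up for this example; none of them come from the article.

```python
from functools import reduce

# Hypothetical records: (destination, arrival delay in minutes, or None if missing).
flights = [
    ("ORD", 12.0), ("ORD", None), ("LAX", -3.0),
    ("LAX", 9.0), ("SFO", 30.0),
]

def drop_missing(rows):
    """Keep only rows that have a recorded delay."""
    return [row for row in rows if row[1] is not None]

def group_by_dest(rows):
    """Collect the delays for each destination."""
    groups = {}
    for dest, delay in rows:
        groups.setdefault(dest, []).append(delay)
    return groups

def mean_delay(groups):
    """Average the delays within each destination."""
    return {dest: sum(delays) / len(delays) for dest, delays in groups.items()}

def pipe(value, *steps):
    """Hypothetical helper: feed value through each step, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)

# Nested composition: the steps have to be read from the inside out.
nested = mean_delay(group_by_dest(drop_missing(flights)))

# Pipeline: the same steps, read in the order they run.
piped = pipe(flights, drop_missing, group_by_dest, mean_delay)

print(piped)
```

Both forms compute the same result; the pipeline only changes the order in which a reader meets the steps.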

5 reasons to learn D3

D3 doesn’t stand for data-design dictator

Designers and developers making data visualizations on the web are buzzing about d3.js. But why? Read more…

Four short links: 11 February 2014

Shadow Banking, Visualization Thoughts, Streaming Video Data, and JavaScript Puzzlers

  1. China’s $122B Boom in Shadow Banking is Happening on Phones (Quartz) — Tencent’s recently launched online money market fund (MMF), Licai Tong, drew in 10 billion yuan ($1.7 billion) in just six days in the last week of January.
  2. The Weight of Rain — lovely talk about the thought processes behind coming up with a truly insightful visualisation.
  3. Data on Video Streaming Starting to Emerge (GigaOm) — M-Lab, which gathers broadband performance data and distributes that data to the FCC, has uncovered significant slowdowns in throughput on Comcast, Time Warner Cable and AT&T. Such slowdowns could be indicative of deliberate actions taken at interconnection points by ISPs.
  4. JavaScript Puzzlers — how well do you know JavaScript?
Four short links: 17 January 2014

Remote Working, Google Visualizations, Sensing Gamma Rays, and Cheap GPS For Your Arduino

  1. Making Remote Work — The reality of a remote workplace is that the connections are largely artificial constructs. People can be very, very isolated. A person’s default behavior when they go into a funk is to avoid seeking out interactions, which is effectively the same as actively withdrawing in a remote work environment. It takes a tremendous effort to get on video chats, use our text based communication tools, or even call someone during a dark time. Very good to see this addressed in a post about remote work.
  2. Google Big Picture Group — public output from the visualization research group at Google.
  3. Using CMOS Sensors in a Cellphone for Gamma Detection and Classification (arXiv) — another sense in your pocket. The CMOS camera found in many cellphones is sensitive to ionized electrons. Gamma rays penetrate into the phone and produce ionized electrons that are then detected by the camera. Thermal noise and other noise needs to be removed on the phone, which requires an algorithm that has relatively low memory and computational requirements. The continuous high-delta algorithm described fits those requirements. (via Medium)
  4. Affordable Arduino-Compatible Centimeter-Level GPS Accuracy (Indiegogo) — for less than $20. (via DIY Drones)
Four short links: 3 December 2013

  1. SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
  2. madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
  3. Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
  4. Disguise Detection — using Raspberry Pi, Arduino, and Python.
Four short links: 27 November 2013

3D Fossils, Changing Drone Uses, High Scalability, and Sim Redux

  1. CT Scanning and 3D Printing for Paleo (Scientific American) — using CT scanners to identify bones still in rock, then using 3D printers to recreate them. (via BoingBoing)
  2. Growing the Use of Drones in Agriculture (Forbes) — According to Sue Rosenstock, 3D Robotics spokesperson, a third of their customers consist of hobbyists, another third of enterprise users, and a third use their drones as consumer tools. “Over time, we expect that to change as we make more enterprise-focused products, such as mapping applications,” she explains. (via Chris Anderson)
  3. Serving 1M Load-Balanced Requests/Second (Google Cloud Platform blog) — 7 minutes from empty project to serving 1M requests/second. I remember when 1 request/second was considered insanely busy. (via Forbes)
  4. Boil Up — behind the scenes for the design and coding of a real-time simulation for a museum’s science exhibit. (via Courtney Johnston)
Four short links: 25 October 2013

Disk Over Ethernet, Inside Elite, Polar Charts, and R Videos

  1. Seagate Kinetic Storage — In the words of Geoff Arnold: The physical interconnect to the disk drive is now Ethernet. The interface is a simple key-value object oriented access scheme, implemented using Google Protocol Buffers. It supports key-based CRUD (create, read, update and delete); it also implements third-party transfers (“transfer the objects with keys X, Y and Z to the drive with IP address 1.2.3.4”). Configuration is based on DHCP, and everything can be authenticated and encrypted. The system supports a variety of key schemas to make it easy for various storage services to shard the data across multiple drives.
  2. Masters of Their Universe (Guardian) — well-written and fascinating story of the creation of the Elite game (one founder of which went on to make the Raspberry Pi). The classic action game of the early 1980s – Defender, Pac Man – was set in a perpetual present tense, a sort of arcade Eden in which there were always enemies to zap or gobble, but nothing ever changed apart from the score. By letting the player tool up with better guns, Bell and Braben were introducing a whole new dimension, the dimension of time.
  3. Micropolar (GitHub) — A tiny polar charts library made with D3.js.
  4. Introduction to R (YouTube) — 21 short videos from Google.
Four short links: 24 October 2013

Visual Arduino Coding, Hardware Iteration, Segmenting Images, and Client-Side Adjustable Data View

  1. Visually Programming Arduino — good for little minds.
  2. Rapid Hardware Iteration at Scale (Forbes) — It’s part of the unique way that Xiaomi operates, closely analyzing the user feedback it gets on its smartphones and following the suggestions it likes for the next batch of 100,000 phones. It releases them every Tuesday at noon Beijing time.
  3. Machine Learning of Hierarchical Clustering to Segment 2D and 3D Images (PLoS One) — We propose an active learning approach for performing hierarchical agglomerative segmentation from superpixels. Our method combines multiple features at all scales of the agglomerative process, works for data with an arbitrary number of dimensions, and scales to very large datasets.
  4. Kratu — an Open Source client-side analysis framework to create simple yet powerful renditions of data. It allows you to dynamically adjust your view of the data to highlight issues, opportunities and correlations in the data.