- Dataflow Computers: Their History and Future (PDF) — an entry from the 2008 Wiley Encyclopedia of Computer Science and Engineering.
- Mirador — open source tool for visual exploration of complex data sets. It enables users to discover correlation patterns and derive new hypotheses from the data.
- How 23andMe Got Regulatory Approval Back (Fast Company) — In order to meet FDA requirements, the design team had to prove that the reports provided on the website would be comprehensible to any American consumer, regardless of their background or education level. And you thought YOUR design brief was hard.
- Getting Comfortable with Uncertainty (The Atlantic) — We have this natural distaste for things that are unfamiliar to us, things that are ambiguous. It goes up from situational stressors, on an individual level and a group level. And we’re stuck with it simply because we have to be ambiguity-reducers.
Trina Chiasson argues that data has arrived at the same threshold as coding: code or be coded; learn to use data or be data.
Arguments from all sides have surrounded the question of whether or not everyone should learn to code. Trina Chiasson, co-founder and CEO of Infoactive, says learning to code changed her life for the better. “These days I don’t spend a lot of time writing code,” she says, “but it’s incredibly helpful for me to be able to communicate with our engineers and communicate with other people in the industry.”
Though learning to code helped her personally, she admits that reaching any level of proficiency takes a lot of time and commitment, and that it might not be the best use of everyone’s time. What should people commit time to learning instead? How to use data. Read more…
A new operator from the magrittr package makes it easier to use R for data analysis.
In every data analysis, you have to string together many tools. You need tools for data wrangling, visualisation, and modelling to understand what’s going on in your data. To use these tools effectively, you need to be able to easily flow from one tool to the next, focusing on asking and answering questions of the data, not struggling to jam the output from one function into the format needed for the next. Wouldn’t it be nice if the world worked this way! I spend a lot of my time thinking about this problem, and how to make the process of data analysis as fast, effective, and expressive as possible. Today, I want to show you a new technique that I’m particularly excited about.
R, at its heart, is a functional programming language: you do data analysis in R by composing functions. The problem is that heavily nested function composition makes for hard-to-read code. For example, here’s some R code that wrangles flight delay data from New York City in 2013. What does it do? Read more…
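The readability problem with nested composition is not specific to R. As a rough illustration only, here is a small Python sketch that contrasts a nested call with a left-to-right pipeline. It is not the R code from the article and not how magrittr is implemented. The records are invented toy data, and `drop_missing`, `mean_delay_by_dest`, `worst_first`, and `pipe` are hypothetical helpers written for this example.

```python
from functools import reduce

# Invented toy records standing in for flight delay data (not real figures).
flights = [
    {"dest": "BOS", "delay": 12},
    {"dest": "BOS", "delay": 30},
    {"dest": "SFO", "delay": 5},
    {"dest": "SFO", "delay": None},
]


def drop_missing(rows):
    """Keep only rows that have a recorded delay."""
    return [r for r in rows if r["delay"] is not None]


def mean_delay_by_dest(rows):
    """Group delays by destination and average each group."""
    groups = {}
    for r in rows:
        groups.setdefault(r["dest"], []).append(r["delay"])
    return {dest: sum(d) / len(d) for dest, d in groups.items()}


def worst_first(means):
    """Sort (destination, mean delay) pairs, largest delay first."""
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


def pipe(value, *steps):
    """Hypothetical helper: feed value through each step, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Nested composition: the steps must be read from the inside out.
nested = worst_first(mean_delay_by_dest(drop_missing(flights)))

# Pipeline style: the same steps, read in the order they run.
piped = pipe(flights, drop_missing, mean_delay_by_dest, worst_first)

print(piped)
```

Both forms compute the same result. The pipeline version simply lists the steps in execution order, which is the kind of readability gain a pipe operator aims for.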