- Disinformation Visualisation: How to Lie with Datavis — We don’t spread visual lies by presenting false data. That would be lying. We lie by misrepresenting the data to tell the very specific story we’re interested in telling. If this is making you slightly uncomfortable, that’s a good thing; it should. If you’re concerned about adopting this new and scary habit, well, don’t worry; it’s not new. Just open your CV to be reminded you’ve lied with truthful data before. This time, however, it will be explicit and visual. (via Regine Debatty)
- Microtugs — a new type of small robot that can apply orders of magnitude more force than it weighs. This is in stark contrast to previous small robots that have become progressively better at moving and sensing, but lacked the ability to change the world through the application of human-scale loads.
- Vault — a tool for securely managing secrets and encrypting data in-transit.
- iSAX: Indexing and Mining Terabyte Sized Time Series (PDF) — Our approach allows both fast exact search and ultra-fast approximate search. We show how to exploit the combination of both types of search as sub-routines in data mining algorithms, allowing for the exact mining of truly massive real-world data sets, containing millions of time series. (via Benjamin Black)
"data visualization" entries
Human judgment is at the center of successful data analysis. This statement might initially seem at odds with the current Big Data frenzy and its focus on data management and machine learning methods. But while these tools provide immense value, it is important to remember that they are just that: tools. A hammer does not a carpenter make — though it certainly helps.
Consider the words of John Tukey, possibly the greatest statistician of the last half-century: “Nothing — not the careful logic of mathematics, not statistical models and theories, not the awesome arithmetic power of modern computers — nothing can substitute here for the flexibility of the informed human mind. Accordingly, both approaches and techniques need to be structured so as to facilitate human involvement and intervention.” Tukey goes on to write: “Some implications for effective data analysis are: (1) that it is essential to have convenience of interaction of people and intermediate results and (2) that at all stages of data analysis the nature and detail of output need to be matched to the capabilities of the people who use it and want it.” Though Tukey and colleagues voiced these sentiments nearly 50 years ago, they ring even more true today. The interested analyst is at the heart of the Big Data question: how well do our tools help users ask better questions, formulate hypotheses, spot anomalies, correct errors and create improved models and visualizations? To “facilitate human involvement” across “all stages of data analysis” is a grand challenge for our age.
A much needed break away from data transparency and privacy issues
I could have focused on the Governments Search for Google Data visualization from Chris Canipe and Madeline Farbman of the Wall Street Journal. Or, I could have focused on Neal Ungerleider’s piece that covers Eric Fisher and MapBox for Gnip’s Twitter metadata visualizations. Yet, my curiosity took over once I came across The Economist’s High Spirits graphic. Not only do I make my own bitters, which qualifies me for preliminary booze nerd status, but I also needed a brief break from the transparency issues currently dominating data-oriented conversations. Following my booze nerd curiosity led me to this interactive data visualization of common cocktail ingredients:
Notes and links from the data journalism beat
Data journalism is becoming a truly global practice. Data journalists from the UK, China, and the US are sharing data-oriented best practices, insights, and tools. Journalists in Latin America are meeting this week to push for more transparency and access to data in the region. At the same time, recent revelations about NSA domestic surveillance programs have pushed big data stories to the front pages of US papers. Here are a few links from the past week:
Transparency…or Lack Thereof
- OpenData Latinoamérica: Driving the demand side of data and scraping towards transparency (Nieman Journalism Lab)
“There’s a saying here, and I’ll translate, because it’s very much how we work,” Miguel Paz said to me over a Skype call from Chile. “But that doesn’t mean that it’s illegal. Here, it’s ‘It’s better to ask forgiveness than to ask permission.’” Paz is a veteran of the digital news business. The saying has to do with his approach to scraping public data from governments that may be slow to share it.
- The real story in the NSA scandal is the collapse of journalism (zdnet.com)
On Thursday, June 6, the Washington Post published a bombshell of a story, alleging that nine giants of the tech industry had “knowingly participated” in a widespread program by the United States National Security Agency (NSA). One day later, with no acknowledgment except for a change in the timestamp, the Post revised the story, backing down from sensational claims it made originally. But the damage was already done.
- We are shocked, shocked… (davidsimon.com)
Having labored as a police reporter in the days before the Patriot Act, I can assure all there has always been a stage before the wiretap, a preliminary process involving the capture, retention and analysis of raw data. It has been so for decades now in this country. The only thing new here, from a legal standpoint, is the scale on which the FBI and NSA are apparently attempting to cull anti-terrorism leads from that data. But the legal and moral principles? Same old stuff.
- Big Data Has Big Stage at Personal Democracy Forum (pbs.org)
Engaging News Project’s Talia Stroud tackled the issue of public engagement in news organizations. Polls on websites don’t yield scientifically accurate results, nor do they get people to address difficult issues, she said. “These data are junk. We know they’re junk,” Stroud said. “City council representatives know they’re junk. Even news organizations know that the results of these data are junk. The only reason that this poll is being included on the news organization’s site is to increase interactivity and increase your time on page.”
An interview with Scott Murray, author of Interactive Data Visualization for the Web
Scott Murray, a code artist, has written Interactive Data Visualization for the Web for nonprogrammers. In this interview, Scott provides some insights on what inspired him to write an introduction to D3 for artists, graphic designers, journalists, researchers, or anyone who is looking to begin programming data visualizations.
What inspired you to become a code artist?
Scott Murray: I had designed websites for a long time, but several years ago was frustrated by web browsers’ limitations. I went back to school for an MFA to force myself to explore interactive options beyond the browser. At MassArt, I was introduced to Processing, the free programming environment for artists. It opened up a whole new world of programmatic means of manipulating and interacting with data — and not just traditional data sets, but also live “data” such as from input devices or dynamic APIs, which can then be used to manipulate the output. Processing let me start prototyping ideas immediately; it is so enjoyable to be able to build something that really works, rather than designing static mockups first, and then hopefully, one day, invest the time to program it. Something about that shift in process is both empowering and liberating — being able to express your ideas quickly in code, and watch the system carry out your instructions, ultimately creating images and experiences that are beyond what you had originally envisioned.
The Wikipedia Recent Changes Map visualizes Wikipedia edits around the world in real-time.
Stephen LaPorte and Mahmoud Hashemi have put together an addictive visualization of real-time edits on Wikipedia, mapped across the world. Every time an edit is made, the user’s location and the entry they edited are listed along with a corresponding dot on the map.
Visual analysis tools are adding advanced analytics for big data
After recently playing with SAS Visual Analytics, I’ve been thinking about tools for visual analysis. By visual analysis I mean the type of analysis most recently popularized by Tableau, QlikView, and Spotfire: you encounter a data set for the first time and conduct exploratory data analysis with the goal of discovering interesting patterns and associations. Having used a few visualization tools myself, here’s a quick wish-list of features (culled from tools I’ve used or have seen in action).
Requires little (to no) coding
The viz tools I currently use require programming skills. Coding means switching back and forth between a visual (chart) and text (code). It’s nice to be able to customize charts via code, but when you’re in the exploratory phase, not having to think about code syntax is ideal. Plus, GUI-based tools allow you to collaborate with many more users.