- The Joy of Stats — Hans Rosling’s BBC documentary on statistics, available to watch online.
- Best Tech Writing of 2010 — I need a mass “add these to Instapaper” button. (via Hacker News)
- Google Shared Spaces: Why We Made It (Pamela Fox) — came out of what people were trying to do with Google Wave.
- The Great Delicious Exodus — traffic graph as experienced by Pinboard.
ENTRIES TAGGED "visualizations"
GAE Datastore, Datamining Books, Processing Word Clouds, and URL Design
- datastore — implementation of Google App Engine Datastore in Java, running on HBase and Hadoop. (via Hacker News)
- Mining of Massive Datasets — 340 page book from Stanford with the best copyright cautionary cover letter: we expect that you will acknowledge our authorship if you republish parts or all of it. We are sorry to have to mention this point, but we have evidence that other items we have published on the Web have been appropriated and republished under other names. It is easy to detect such misuse, by the way, as you will learn in Chapter 3. (via Delicious)
- Wordcram — generate word clouds in Processing. (via jandot on Twitter)
- URL Design — the why and how of designing your URLs. Must-read. (via kneath on Twitter)
- 30 Lessons Learned in Computing Over The Last 10 Years — Backup every day at the minimum, and test restores every week. I don’t think I’ve worked at an organisation that didn’t discover at one point that they couldn’t restore from their backups. Many other words of wisdom, and this one rang particularly true: all code turns into shit given enough time and hands. (via Hacker News)
- What Your Computer Does While You Wait — top-to-bottom understanding of your system makes you a better programmer.
- How to Visualize the Competition — elegant graphing of strategy. (via Dave Moskovitz on Twitter)
The good news: Open data is viewed positively. The bad: There's lots of room for improvement.
A new report on the attitudes, quality and use of open government data shows strong support for the release of open data among citizens and government employees.
Stack Exchange goes in-house, Netflix pays for platforms, survey data gets visualized, and Infochimps acquires Data Marketplace
In this edition of Strata Week: Stack Exchange takes their hardware and software in-house; Netflix explains their adoption of AWS and open source; the New York Times maps out survey and census data; and Infochimps acquires Data Marketplace.
Twitter Influence, Open Source Visualized, Arduino Autopilot, and Customer Respect
- The Million Follower Fallacy (PDF) — We found that indegree represents a user’s popularity, but is not related to other important notions of influence such as engaging audience, i.e., retweets and mentions. Retweets are driven by the content value of a tweet, while mentions are driven by the name value of the user. Such subtle differences lead to dissimilar groups of the top Twitter users; users who have high indegree do not necessarily spawn many retweets or mentions. This finding suggests that indegree alone reveals very little about the influence of a user. Research confirms what we all knew, that idiots who chase follower numbers have the influence they deserve. (via Steve O’Grady on Twitter, indirectly)
- Geocoding Github: Visualizing Distributed Open-Source Development — work for the Stanford visualization class, plotting open source commits on maps over time. See this page for the interactive explorer. (via Michael Driscoll on Twitter)
- ArduPilotMega 1.0 Launched — autopilot built on the Arduino platform. (via Chris Anderson on Twitter)
- Lessons of the Gawker Security Mess (Forbes blog) — nice deconstruction of what happened. In the chat, Gawker’s Hamilton Nolan, after hearing that it is just Gawker users who have been compromised, remarks “oh, well. unimportant”. Gawker’s Richard Lawson wants to know if the breach is limited to “just the peasants?” Don’t trash talk about your users in company channels. The business that forgets it lives and dies on its customers is a business that will eventually be hated by its customers. (via Nahum Wild on Twitter)
Use Gephi and Python to find your personal communities
Using a bit of Python and the Gephi graph tool, exploring your own Twitter network is a great way to learn about analyzing networks, and the results definitely have a "wow" factor.
Powerful open source graph manipulation
A Photoshop for data, Gephi is a powerful tool for exploring and presenting data as a graph. It's easy to get started with sample data sets, then import your own by generating files in a standard graph format.
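As a rough illustration of "generating files in a standard graph format", here is a minimal sketch that turns a list of edges into GraphML, one of the formats Gephi can import. It uses only the Python standard library. The function name `edges_to_graphml` and the sample follower pairs are hypothetical, chosen for this example rather than taken from the linked article.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def edges_to_graphml(edges, directed=True):
    """Build a GraphML document (as a string) from (source, target) pairs."""
    # Emit GraphML as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    graph = ET.SubElement(
        root,
        f"{{{GRAPHML_NS}}}graph",
        id="G",
        edgedefault="directed" if directed else "undirected",
    )

    # Declare every node once, in first-seen order.
    nodes = []
    for source, target in edges:
        for name in (source, target):
            if name not in nodes:
                nodes.append(name)
    for name in nodes:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", id=name)

    # One <edge> element per pair.
    for i, (source, target) in enumerate(edges):
        ET.SubElement(
            graph,
            f"{{{GRAPHML_NS}}}edge",
            id=f"e{i}",
            source=source,
            target=target,
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data: (follower, followed) pairs.
sample_edges = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]
graphml_text = edges_to_graphml(sample_edges)
```

Writing `graphml_text` to a file with a `.graphml` extension should give something that can be opened from Gephi's file menu. Real data would normally carry extra attributes (labels, weights), which this sketch leaves out.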
Data geekery, visualization and journalism
From deep-diving startup founders to national newspapers, there's a rich vein of wisdom and information in blogs about data. Here are five to get your reading list started.
Easy-to-use timelines catch on with consumers and publishers.
Dipity is making it easier for businesses, media outlets and individual users to create interactive timelines. In the following interview, Dipity co-founder and CEO Derek Dukes discusses the company's business model and the opportunities that come when rich datasets are matched with user-friendly interfaces.