Visualizing and Categorizing the 911 Wikileaks Data Set

On November 25th, Wikileaks released 500,000 text pager intercepts from the 24 hours surrounding the horrific 9/11 attacks. The personal, corporate and governmental messages come from the Washington D.C. and New York City areas. They can be found on their own subdomain at http://911.wikileaks.org/ and are released under the CC-BY-SA license.

As with the AOL search logs and the Enron email archives, this data set will be examined and visualized. I expect the hope is to gain an understanding of the thoughts and feelings of the people on the ground. Two applications have already been created.

911pagers is a site devoted to searching and community annotation of the corpus. You can see the theoretical communications with Giuliani (New York’s mayor at the time), random messages or the recommended ones. Built on Google App Engine, 911pagers will add ratings, timelines and keywords. You can track their progress via @911pagers.

911 text intercepts word viz

The second project is an analysis of the frequency of 100 phrases such as “flights cancelled” and “call home”. Jeff Clark selected only the content from 8AM to 8PM on 9/11. He created a set of timeline graphs for each phrase (above). After the jump I’ve embedded a time-series video of these phrases.
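For readers who want to try a similar count themselves, here is a minimal Python sketch of tallying phrases per hour inside the 8AM to 8PM window. It is not Jeff Clark's code. The line layout (a timestamp followed by the message text), the sample messages and the function name are all my own assumptions for illustration, so check them against the real files before relying on it.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample lines. The "timestamp, then message text" layout is an
# assumption made for this sketch, not a documented format of the archive.
SAMPLE = [
    "2001-09-11 08:05:12 please call home asap",
    "2001-09-11 09:40:03 all flights cancelled until further notice",
    "2001-09-11 09:55:47 Call home when you can",
    "2001-09-11 21:10:00 call home",  # after 8PM, so it is skipped
]
PHRASES = ["flights cancelled", "call home"]


def phrase_counts_by_hour(lines, phrases, day="2001-09-11",
                          start_hour=8, end_hour=20):
    """Count case-insensitive phrase occurrences per hour of `day`."""
    counts = {phrase: Counter() for phrase in phrases}
    for line in lines:
        stamp, text = line[:19], line[19:].lower()
        try:
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if stamp[:10] != day or not (start_hour <= when.hour < end_hour):
            continue
        for phrase in phrases:
            hits = text.count(phrase)
            if hits:
                counts[phrase][when.hour] += hits
    return counts


counts = phrase_counts_by_hour(SAMPLE, PHRASES)
for phrase, by_hour in counts.items():
    print(phrase, dict(by_hour))
```

Each phrase ends up with a small hour-to-count table, which is the kind of series a timeline graph can be drawn from.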

Here’s the video:

Pager Data from 9/11 – Phrase Cloud Visualization from Jeff Clark on Vimeo.

Jeff says this about the video:

This is a visualization of text phrases taken from pager data during September 11th, 2001 from 8am until 8pm. The larger the text the more frequently it was used during the 12 hour period. Text appears bright during the times of high usage and fades away otherwise. Color hues are cosmetic.
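To make that encoding concrete, here is a rough Python sketch of how per-hour counts could be turned into a text size and a brightness. The linear scaling, the size range, the sample counts and the function name are my assumptions; Jeff's description does not say which mapping he actually used.

```python
# Hypothetical hour -> count tables for two phrases (made-up numbers).
COUNTS = {
    "call home": {8: 2, 9: 10, 10: 4},
    "flights cancelled": {9: 1, 10: 7},
}


def text_style(hourly_counts, hour, max_total, min_size=10, max_size=60):
    """Return (font_size, brightness) for one phrase at a given hour.

    Size grows linearly with the phrase's total over the whole period.
    Brightness is the count in this hour relative to the phrase's own
    busiest hour. Both choices are assumptions made for this sketch.
    """
    total = sum(hourly_counts.values())
    size = min_size + (max_size - min_size) * total / max_total
    peak = max(hourly_counts.values(), default=0)
    brightness = hourly_counts.get(hour, 0) / peak if peak else 0.0
    return size, brightness


# The most frequent phrase overall sets the top of the size scale.
max_total = max(sum(by_hour.values()) for by_hour in COUNTS.values())

for phrase, by_hour in COUNTS.items():
    size, brightness = text_style(by_hour, 9, max_total)
    print(f"{phrase}: size={size:.0f}, brightness at 9AM={brightness:.2f}")
```

With these made-up numbers, "call home" is drawn at the largest size and is fully bright at 9AM, while "flights cancelled" is smaller and still dim at that hour.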