Welcome to the second weekly edition of our data blog! You may have noticed that we’ve decided to change our name to “Strata Week.” This is in keeping with the name of our upcoming data conference as well as our ideas about where the world of data is headed (think myriad layers that can be mined, and large, cloud-filled heavens to explore). Thanks to those of you who have sent in suggestions so far; we hope you’ll keep ’em coming.
Mint goes on a data spree
Intuit’s Mint personal finance assistant has begun to share the large quantities of consumer spending data it has collected from its 3 million users. You can look up the most popular spending spots, the average purchase total, the average monthly spending in your city, and more. In an attempt to anonymize the data, Mint is sharing information only about venues with 50 or more Mint-using customers. Here’s hoping for an API to come.
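For the curious, here is roughly what that kind of minimum-count filter could look like. This is a minimal sketch, not Mint's actual pipeline: the function name `venue_averages` and the `(user_id, venue, amount)` record shape are assumptions made for illustration, and only the 50-customer threshold comes from the announcement.

```python
from collections import defaultdict

MIN_CUSTOMERS = 50  # threshold mentioned in the post; everything else is hypothetical


def venue_averages(transactions, min_customers=MIN_CUSTOMERS):
    """Return the average purchase amount per venue, keeping only venues
    that have at least `min_customers` distinct customers.

    `transactions` is an iterable of (user_id, venue, amount) tuples,
    an assumed record shape for this sketch.
    """
    users = defaultdict(set)     # venue -> distinct user ids seen there
    totals = defaultdict(float)  # venue -> sum of purchase amounts
    counts = defaultdict(int)    # venue -> number of purchases

    for user_id, venue, amount in transactions:
        users[venue].add(user_id)
        totals[venue] += amount
        counts[venue] += 1

    # Venues below the threshold are dropped entirely rather than reported.
    return {
        venue: totals[venue] / counts[venue]
        for venue in users
        if len(users[venue]) >= min_customers
    }
```

Note that the filter counts distinct customers, not purchases, so one big spender visiting a venue 50 times would not be enough to expose that venue's numbers.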
All the storage money can buy
Speaking of money, if you’ve got some spare change lying around, you might be interested in bidding on some compression technology patents coming up for auction on Nov. 11.
Stopping crime before it starts
Don’t have your own money? Thinking of stealing some? Think again: “predictive policing” is not just for the movies anymore. The Los Angeles Police Department is leading the way in using data to be proactive, instead of reactive, about crime. “Much as an earthquake sets off aftershocks, some types of crimes have a contagious quality to them,” writes Joel Rubin of the Los Angeles Times. So mathematicians at UCLA are adapting seismological algorithms that calculate the probability of aftershocks to fit crime patterns and decision-making processes.
The LAPD is competing for a $3 million U.S. Justice Department grant that would allow it to conduct a large-scale experiment with predictive techniques.
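To get a feel for the aftershock idea, consider a self-exciting point process, a family of models often used for earthquakes, in which every event temporarily raises the rate of further events. The sketch below is only an illustration of that general idea. The article does not say which model the UCLA team uses, and the function name `intensity`, the exponential decay, and the parameter values `mu`, `alpha`, and `beta` are all assumptions.

```python
import math


def intensity(t, past_events, mu=0.1, alpha=0.5, beta=1.0):
    """Expected event rate at time `t` under a simple self-exciting model.

    mu    -- background rate (events that happen regardless of history)
    alpha -- size of the bump each past event adds to the rate
    beta  -- how quickly that bump fades over time
    All three values are illustrative, not fitted to any real data.
    """
    # Each event strictly before `t` contributes an exponentially decaying boost.
    boost = sum(
        alpha * math.exp(-beta * (t - s))
        for s in past_events
        if s < t
    )
    return mu + boost
```

With no history, `intensity` just returns the background rate. Right after an event the rate jumps, then decays back toward `mu`, which is the "contagious" pattern Rubin describes.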
New data viz tool for journalists: TimeFlow
For all those journalists who have to report on the crimes that do happen, visualization power duo Martin Wattenberg and Fernanda B. Viégas have created yet another fabulous viz tool, this one commissioned by Sarah Cohen, Knight Professor of the Practice of Journalism and Public Policy at Duke University’s Sanford School of Public Policy. TimeFlow is an open source analytical timeline that lets users filter information by date, location, tags, and several other parameters.
Many of you will already be familiar with Wattenberg and Viégas’ work if you’ve used the IBM toolkit Many Eyes or seen their visualization project of Wikipedia edits (if you missed that one, check out their chapter in Beautiful Visualization). They’re both now working at Google.
Checking up on the government
At this week’s Gov 2.0 Summit in Washington, D.C., Ellen Miller of The Sunlight Foundation presented an “Open Government Scorecard” wherein she was highly critical of the data reported at USASpending.gov, a site meant to report how Americans’ tax dollars are spent.
Miller also announced a new program from the Sunlight Foundation called ClearSpending, which aggregates the numbers from the 10 million rows of the USASpending database and compares the totals for each program to those reported in the Catalog of Federal Domestic Assistance (CFDA). The program found that more than $1.3 trillion — or fully half the annual budget for 2009 — literally doesn’t add up.
Noting the multiple redesigns of USASpending.gov in recent years, Miller said, “We’re beginning to worry that the administration is more interested in style than in substance.”
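The kind of reconciliation ClearSpending performs, summing rows per program and checking the totals against a second source, can be sketched in a few lines. This is a hedged illustration rather than the Sunlight Foundation's code: the function name `find_mismatches`, the `(program_id, amount)` row shape, and the `tolerance` parameter are all hypothetical.

```python
from collections import defaultdict


def find_mismatches(spending_rows, cfda_totals, tolerance=0.0):
    """Compare per-program sums of spending rows with reported totals.

    spending_rows -- iterable of (program_id, amount) tuples (assumed shape)
    cfda_totals   -- dict mapping program_id to the total reported elsewhere
    tolerance     -- absolute difference to ignore, e.g. for rounding

    Returns a dict of program_id -> (summed amount - reported total)
    for every program whose two figures disagree.
    """
    summed = defaultdict(float)
    for program_id, amount in spending_rows:
        summed[program_id] += amount

    mismatches = {}
    # Check programs that appear in either source, so missing entries count too.
    for program_id in set(summed) | set(cfda_totals):
        actual = summed.get(program_id, 0.0)
        reported = cfda_totals.get(program_id, 0.0)
        if abs(actual - reported) > tolerance:
            mismatches[program_id] = actual - reported
    return mismatches
```

A negative difference means less spending was found in the rows than was reported for the program, and a program present in only one source shows up as a mismatch for its full amount.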