Here are a few stories from the data space that caught my attention this week.
Presidential candidates are mining your data
Data is playing an unprecedented role in the US presidential election this year. The two presidential campaigns have access to personal voter data “at a scale never before imagined,” reports Charles Duhigg at the New York Times. The candidate camps are using personal data in polling calls, accessing such details as “whether voters may have visited pornography Web sites, have homes in foreclosure, are more prone to drink Michelob Ultra than Corona or have gay friends or enjoy expensive vacations,” Duhigg writes. He reports that both campaigns emphasized they were committed to protecting voter privacy, but notes:
“Officials for both campaigns acknowledge that many of their consultants and vendors draw data from an array of sources — including some the campaigns themselves have not fully scrutinized.”
A Romney campaign official told Duhigg: “You don’t want your analytical efforts to be obvious because voters get creeped out. A lot of what we’re doing is behind the scenes.”
The “behind the scenes” may be enough in itself to creep people out. These sorts of situations are starting to tarnish the image of the consumer data-mining industry, and a Manhattan trade group, the Direct Marketing Association, is launching a public relations campaign — the “Data-Driven Marketing Institute” — to smooth things over before government regulators get involved. Natasha Singer reports at the New York Times:
“According to a statement, the trade group intends to promote such targeted marketing to lawmakers and the public ‘with the goal of preventing needless regulation or enforcement that could severely hamper consumer marketing and stifle innovation’ as well as ‘tamping down unfavorable media attention.’ As part of the campaign, the group plans to finance academic research into the industry’s economic impact, said Linda A. Woolley, the acting chief executive of the Direct Marketing Association.”
One of the biggest issues, Singer notes, is that people want control over their data. Chuck Teller, founder of Catalog Choice, told Singer that in a recent survey conducted by his company, 67% of people responded that they wanted to see the data collected about them by data brokers and 78% said they wanted the ability to opt out of the sale and distribution of that data.
Regulators also want Google to be more transparent about the kinds of data it’s collecting and how that data is being used. Pfanner and O’Brien report:
In other Google news, the company became more transparent about its data centers this week, adding photos from some of them to the data center section of its website. The new transparency coincided with a report from Steven Levy at Wired on how Google builds and operates its data centers, along with an account of his tour through the facility in Lenoir, N.C. While personal tours aren’t available to everyone, you can now take a virtual tour of the Lenoir data center, as Google has given it the Street View treatment.
Hadoop and the BI industry
“Is it a MapReduce framework for heavy-duty batch processing? Yes. But can it also be the engine of high-speed, interactive analytics products that look to do for unstructured data what massively parallel analytic databases do for structured data? As it turns out, the answer might be ‘yes’ again.”
Harris looks at several companies that made BI moves in this direction this week alone: Hadapt, Birst, Splice Machine, and Teradata. He points out that we’re only on the brink of what’s possible in aligning Hadoop and BI, and says we likely haven’t seen anything yet.
Study results from Gartner Research published this week seem to support Harris’ sentiment. The study looked at how big data will be driving billions of dollars in spending in the next few years. Alex Williams at TechCrunch highlighted several key points from the research, noting that the influx of big data “will force a change in products, practices and solutions,” and that “[m]aking big data something that has a functional use will drive $4.3 billion in software sales in 2012.”
Tip us off
News tips and suggestions are always welcome, so please send them along.