Here are a few stories from the data space that caught my attention this week.
How big data is transforming just about everything
Professor John Naughton took a look this week at how big data is transforming various industries that affect our daily lives.
He highlights finance, of course, which he says has been “pathologically mathematised”; marketing, which now has more data about human behavior than ever before; and the very broad category of science. Naughton notes that researchers used to conjure up theories and then look to data to support or refute them; now, researchers turn to data to find patterns and connections that might inspire new theories. Naughton also looks at medicine, which is just on the brink of delving into the big data realm. He writes:
“Last week’s news about how Cambridge researchers stopped an MRSA outbreak affecting 12 babies in the Rosie Hospital by rapidly sequencing the genome of the bacteria illustrates how medicine has become a data-intensive field. Even a few years ago, the resources required to achieve this would have involved a roomful of computers and upwards of a week.”
Naughton addresses the use of big data in sports as well, speculating that baseball has been the sport most transformed by data. He’ll likely find agreement there. Barry Eggers goes into depth on the dramatic effect big data is having on baseball over at TechCrunch. He notes that the simple statistical analysis baseball has embraced since its beginnings has evolved into gathering mountains of unstructured data and employing Hadoop to draw new and better insights from data that falls outside the structured game information. Eggers writes:
“By having his data scientist run a Hadoop job before every game, [San Francisco Giants manager] Bruce Bochy can not only make an informed decision about where to locate a 3-1 Matt Cain pitch to Prince Fielder, but he can also predict how and where the ball might be hit, how much ground his infielders and outfielders can cover on such a hit, and thus determine where to shift his defense. Taken one step further, it’s not hard to imagine a day where managers like Bochy have their locker room data scientist run real-time, in-game analytics using technologies like Cassandra, Hbase, Drill, and Impala.”
The far-reaching applications of traffic data
As techniques for gathering traffic data improve, all those real-time data points are proving useful to more than just commuters and traffic reporters. Derrick Harris looks this week at Inrix, a traffic data company. In addition to real-time traffic information, the company gathers weather, sensor and accident data to predict how traffic will respond, for example by forecasting how long an accident will affect traffic flow. But this is just the beginning, according to founder and CEO Bryan Mistele, who told Harris that big data and crowdsourcing are bringing a “complete transformation” to how traffic data is used. Harris reports:
“Insurance companies can use the data to determine more-accurate rates, and some hedge funds are using Inrix’s data as a means for determining economic health — more drivers during rush hour means more people working, Mistele explained. During the London Olympics, data from mobile devices helped officials monitor the movement of people, not traffic, throughout the city.”
Harris also notes that this type of data could be useful in combating pollution, something city officials in London also recognized: they employed CityScan scanners during the 2012 Olympics to measure the effect of traffic emissions on air quality.
The collision of political policy and practice
There was much ado about the role of big data in President Obama’s recent reelection campaign, but a post at AdAge this week points out that the President isn’t putting his practice where his policy is. Kate Kaye writes that the Do Not Track system the administration supports “would throw a wrench into the data-collection tactics that empowered the campaign.” And the tracking didn’t cease after Obama’s win — Kaye reports:
“More than a week after the election, BarackObama.com houses an array of third-party tags that track users for ad targeting and campaign and site analytics. Yesterday, around fifteen ad company tags were surfaced by Evidon’s Ghostery software, including tags from BlueKai, which calls itself a ‘big data activation solution,’ and Appnexus, which among other things allows advertisers to use a variety of user behavioral data to target ads to those users on Facebook.”
Kaye says even if Do Not Track legislation passes, it likely won’t apply to political organizations or campaigns anyway. She notes: “For instance, political messages are exempt from CAN-SPAM laws, and political organizations are not restricted by the Do Not Call Registry.”