Strata Week: Data brokers know more about us than we know

Data brokers, workplace sensor studies, unreported drug side effects revealed in search data, and the dark side of big data.

The lowdown on data brokers, and the use of sensor data in the workplace

ProPublica’s Lois Beckett takes a look this week at data brokers. She says that although Congress is moving to require such companies to give consumers more control over their data and what happens to it, many people don’t know these data brokers exist, let alone the extent of the data gathered and how it’s used.

Beckett takes a step-by-step look at who these companies are, how much they know, how they get our data and from where, what kind of data they’re allowed to collect, how they use the data, and much, much more. She says most of the time, consumers have no idea their data has been purchased. For instance, “When you’re checking out at a store and a cashier asks you for your Zip code, the store isn’t just getting that single piece of information,” she writes. “Acxiom and other data companies offer services that allow stores to use your Zip code and the name on your credit card to pinpoint your home address — without asking you for it directly.”

It’s possible but very, very difficult for consumers to stop the companies from collecting and sharing their data, Beckett notes. Though most data brokers have an “opt-out” policy, consumers would “need to know about all the different data brokers and where to find their opt-outs” — information that most consumers don’t have and don’t know how to find. You can find Beckett’s full report at ProPublica — it’s this week’s recommended read.

In related news, Rachel Emma Silverman at the Wall Street Journal takes a look at the use of sensors and data gathering practices in the workplace. “As big data becomes a fixture of office life, companies are turning to tracking devices to gather real-time information on how teams of employees work and interact,” she writes. “Sensors, worn on lanyards or placed on office furniture, record how often staffers get up from their desks, consult other teams and hold meetings.”

Though there’s a fine line between big data and big brother, Silverman says, “[s]ensor proponents … argue that smartphones and corporate ID badges already can transmit their owner’s location” and most companies will allow workers to opt out of the sensor studies. Silverman reviews a few real-world sensor study cases, along with the results and insights gleaned. You can read her full report at the Wall Street Journal.

Study shows search data reveals unreported drug side effects

John Markoff reports at the New York Times on a study published this week showing that, by mining Internet search data, scientists from Microsoft, Stanford and Columbia University were able to detect unreported prescription drug side effects before the FDA’s warning system flagged them. Markoff writes:

“Using automated software tools to examine queries by six million Internet users taken from web search logs in 2010, the researchers looked for searches relating to an antidepressant, paroxetine, and a cholesterol lowering drug, pravastatin. They were able to find evidence that the combination of the two drugs caused high blood sugar.”

Users who opted to participate in the study installed a browser toolbar that gathered anonymized data, Markoff reports. Using data from 82 million drug-, symptom- and condition-related searches in 2010, researchers were able to cross-reference searches for “paroxetine” and “pravastatin” with the number of times users would also search for “hyperglycemia” or any of its 80 or so symptoms.

“They determined that people who searched for both drugs during the 12-month period were significantly more likely to search for terms related to hyperglycemia than were those who searched for just one of the drugs,” Markoff reports. He also notes that the searches for symptoms relating to both drugs occurred within a short period of time — 30% the same day, 40% the same week, and 50% the same month. You can read Markoff’s full report at The New York Times.
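To make the comparison concrete, here is a minimal sketch of the kind of cross-referencing described above: group users by whether they searched for one drug or both, then compare how often each group also searched for a hyperglycemia-related term. The search log, the user IDs, the short symptom list and the function name are all hypothetical stand-ins for illustration. They are not the study's data or the researchers' actual method, which would also need a significance test on top of the raw rates.

```python
from collections import defaultdict

# Hypothetical toy search log of (user_id, query term) pairs.
# This is illustrative data only, not data from the study.
SEARCH_LOG = [
    ("u1", "paroxetine"), ("u1", "pravastatin"), ("u1", "hyperglycemia"),
    ("u2", "paroxetine"), ("u2", "pravastatin"), ("u2", "blurry vision"),
    ("u3", "paroxetine"), ("u3", "pravastatin"),
    ("u4", "paroxetine"), ("u4", "hyperglycemia"),
    ("u5", "paroxetine"),
    ("u6", "pravastatin"),
    ("u7", "pravastatin"),
    ("u8", "paroxetine"),
]

DRUGS = {"paroxetine", "pravastatin"}
# Assumed short stand-in for the roughly 80 hyperglycemia-related terms.
SYMPTOM_TERMS = {"hyperglycemia", "blurry vision", "frequent urination"}


def symptom_search_rates(log):
    """Return the share of users in each group who also searched a symptom term.

    Groups: "both" (searched both drugs) and "one" (searched exactly one).
    """
    terms_by_user = defaultdict(set)
    for user, term in log:
        terms_by_user[user].add(term)

    # For each group: [users with a symptom search, total users]
    counts = {"both": [0, 0], "one": [0, 0]}
    for terms in terms_by_user.values():
        drugs_searched = terms & DRUGS
        if not drugs_searched:
            continue  # user never searched for either drug
        group = "both" if len(drugs_searched) == 2 else "one"
        counts[group][1] += 1
        if terms & SYMPTOM_TERMS:
            counts[group][0] += 1

    return {
        group: (hits / total if total else 0.0)
        for group, (hits, total) in counts.items()
    }


rates = symptom_search_rates(SEARCH_LOG)
print(rates)
```

In this toy log, two of the three users who searched for both drugs also searched for a symptom term, against one of the five who searched for a single drug. A gap of that kind, measured across millions of users, is the signal the researchers were looking for.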

Keeping an eye on big data’s dark side

Viktor Mayer-Schönberger and Kenneth Cukier’s new book Big Data: A Revolution That Will Transform How We Live, Work, and Think was released this week. The duo addressed one of the book’s topics, predicting and punishing crime before it happens à la Minority Report, in a post at PopSci. They warn that along with all the benefits we’re reaping from big data, we need to be conscious of big data’s dark side, too:

“Already we see the seedlings of Minority Report-style predictions penalizing people. Parole boards in more than half of all U.S. states use predictions founded on data analysis as a factor in deciding whether to release somebody from prison or to keep him incarcerated. A growing number of places in the United States — from precincts in Los Angeles to cities like Richmond, Virginia — employ ‘predictive policing’: using big-data analysis to select what streets, groups, and individuals to subject to extra scrutiny, simply because an algorithm pointed to them as more likely to commit crime.”

They warn further that it won’t stop there: law enforcement will eventually attempt to predict crime at the level of individuals and, ultimately, to use big data to prevent the crime in the first place. You can read more from Mayer-Schönberger and Cukier’s piece at PopSci.

Cukier also sat down with O’Reilly Radar online managing editor Mac Slocum at the recent Strata Conference in Santa Clara to talk about government’s use of big data, regulations and restrictions, and what needs to be done to keep data open.

Tip us off

News tips and suggestions are always welcome, so please send them along.
