"strataweek" entries

Strata Week: The data divide is growing

Is data collection entering discriminatory territory? Also, big data's role in crime fighting and its debut in the NBA.

Data mining opens new doors for discrimination, marginalization

In a post at Scientific American, Michael Fertik took a look at how Internet data collection practices are beginning to create an unequal — even discriminatory — online environment. Fertik writes:

“For most of the Internet’s short history, the primary goal of this data collection was classic product marketing: for example, advertisers might want to show me Nikes and my wife Manolo Blahniks. But increasingly, data collection is leapfrogging well beyond strict advertising and enabling insurance, medical and other companies to benefit from analyzing your personal, highly detailed ‘Big Data’ record without your knowledge. Based on this analysis, these companies then make decisions about you — including whether you are even worth marketing to at all.”

The consequences of such detailed data mining run deep. Fertik notes that advances in online data mining are enabling companies to “skirt the spirit of the law” and make discriminatory choices in who receives credit or loan offers, for example, by simply not displaying online offers to less credit-attractive users. “If you live on the wrong side of the digital tracks,” he says, “you won’t even see a credit offer from leading lending institutions, and you won’t realize that loans are available to help you with your current personal or professional priorities.”

Strata Week: Data-driven politics

Big data's role in the US presidential election, trends shaping the future of data, and extenuating consequences of the Megaupload case.

Here are a few stories from the data space that caught my attention this week.

Big data, big politics

In the aftermath of the US presidential election, much attention has focused on Nate Silver’s art of predicting the election results with data. Some looked at it from a coverage angle, asking how Silver’s work in the spotlight will affect the way elections are covered in the future. John McDermott reports at AdAge that Silver’s work will help shift the “nebulous aspects” of reporting that focus on “feel” and “momentum” to reporting that is anchored in facts and statistics. ComScore analyst Andrew Lipsman said to McDermott, “Now that people have seen [statistics-driven political analysis] proven over a couple of cycles, people will be more grounded in the numbers.”

The attention Silver attracted may also help democratize big data. Tarun Wadhwa reports at Forbes that the power of big data has finally been realized in the US political process:

“Beyond just personal vindication, Silver has proven to the public the power of Big Data in transforming our electoral process. We already rely on statistical models to do everything from flying our airplanes to predicting the weather. This serves as yet another example of computers showing their ability to be better at handling the unknown than loud-talking experts. By winning ‘the nerdiest election in the history of the American Republic,’ Barack Obama has cemented the role of Big Data in every aspect of the campaigning process. His ultimate success came from the work of historic get-out-the-vote efforts dominated by targeted messaging and digital behavioral tracking.”

Michael Scherer at Time has an in-depth look at the role big data and data mining played in Obama’s campaign as well. Campaign manager Jim Messina, Scherer writes, “promised a totally different, metric-driven kind of campaign in which politics was the goal but political instincts might not be the means” and hired dozens of data crunchers to establish an analytics department. The team put together a massive database that merged information from all areas of the campaign — social media, pollsters, consumer databases, fundraisers, etc. — into one central location. Scherer reports: “The new megafile didn’t just tell the campaign how to find voters and get their attention; it also allowed the number crunchers to run tests predicting which types of people would be persuaded by certain kinds of appeals.”

Scherer’s piece is a fascinating look at how data was put to use in a successful presidential campaign. It is this week’s recommended read.

Strata Week: Add structured data, lose local flavor?

Wikidata's structure vs. diverse knowledge, and a look at the many factors behind Netflix's recommendations.

A critic says Wikidata could undermine Wikipedia's localized information. Also, Netflix explains why its recommendation engine is much more complicated than most people realize.

Strata Week: New life for an old census

The 1940 census makes its data debut, and the White House shows off its data initiative.

In this week's data news, the National Archives releases the data from the 1940 Census, the federal government outlines its big data plans, and an app uproar leads to good thinking on privacy and sharing.

Strata Week: The allure of a data haven

Wikileaks and Sealand may not be a good match, ThinkUp reboots, and Factual's CEO gets the NYT's attention.

In this week's data news, a look at Sealand as a potential data haven for Wikileaks, ThinkUp reboots, and the New York Times profiles Factual's Gil Elbaz.

Strata Week: Machine learning vs domain expertise

Debating the data skills of machines and experts, a key data move for Microsoft, and Google Analytics gets social.

This week's data news includes another look at the Strata Conference's debate about machine learning versus subject matter expertise, Raghu Ramakrishnan's move from Yahoo to Microsoft, and more social data coming to Google Analytics.

Strata Week: Infographics for all

A new infographic tool, San Francisco upgrades its open data efforts, and decades of Stephen Wolfram's data.

Visual.ly launches an infographic creation tool, San Francisco upgrades its open data initiative, and Stephen Wolfram offers a peek into more than 20 years of his personal data.

Strata Week: Profiling data journalists

The work of data journalists and a comparison of four data markets.

This week's data news includes a look at the work of various data journalists, Edd Dumbill's survey of four data marketplaces, and impressive growth at the MIT Sloan Sports Analytics Conference.

Strata Week: Datasift lets you mine two years of Twitter data

Datasift offers more access to the Twitter archive, and a proposal for a data school.

In this week's data news, Datasift will offer deeper access to old tweets, and P2PU and the Open Knowledge Foundation announce a School of Data.