What are the key data categories companies want to control?

Help us build a data layer for the Web 2.0 Summit map.

This post was originally published on John Battelle’s Searchblog (“Building A New Map And I Need Your Help: What Are The Key Categories of Data In Today’s Network Economy?”).

Many of you probably remember the “Points of Control” Web 2 Summit Map from last year. It was very well received. Hundreds of thousands of folks came to check it out, and the average engagement time was north of six minutes per visitor. It was a really fun way to make the conference theme come to life, and given the work that went into its creation, we thought it’d be a shame to retire it simply because Web 2 has moved on to a new theme:

For 2011, our theme is “The Data Frame” — focusing on the impact of data in today’s networked economy. We live in a world clothed in data, and as we interact with it, we create more — data is not only the web’s core resource, it is at once both renewable and boundless.

Consumers now create and consume extraordinary amounts of data. Hundreds of millions of mobile phones weave infinite tapestries of data, in real time. Each purchase, search, status update, and check-in layers our world with more of it. How our industries respond to this opportunity will define not only success and failure in the networked economy, but also the future texture of our culture. And as we’re already seeing, these interactions raise complicated questions of consumer privacy, corporate trust, and our governments’ approach to balancing the two.

How, I wondered, might we update the Points of Control map such that it can express this theme? Well, first of all, it’s clear the game is still afoot between the major players. Some boundaries may have moved, and progress has been made (Bing has gained search share, Facebook and Google have moved into social commerce, etc.), but the map in essence is intact as a thought piece.

Web 2.0 Summit, being held October 17-19 in San Francisco, will examine “The Data Frame” — focusing on the impact of data in today’s networked economy.


Then it struck me: each of the major players, and most of the upstarts, holds data as a core asset, often many types of it. In addition, most of them covet data that they either don’t yet have access to or are in the process of building out (think Google in social, for example, or in deals, which to my mind is a major play for local as well as purchase data). Why not apply the “Data Frame” to the map itself, as a lens of sorts that, when overlaid upon the topography, shows the data assets and aspirations of each player?

So here’s where you come in. If we’re going to add a layer of data to each player on the map, the question becomes — what kind of data? And how should we visualize it? My initial thoughts on types of data hew somewhat to my post on the Database of Intentions, so that would include:

  • Purchase Data (including credit card info)
  • Search Data (query, path taken, history)
  • Social Graph Data (identity, friend data)
  • Interest Data (likes, tweets, recommendations, links)
  • Location Data (ambient as well as declared/checked in)
  • Content Data (journey through content, likes, engagement, “behavioral”)

Those are some of the big buckets. Clearly, we can debate whether, for example, identity should be its own category, separate from social, and so on; that’s exactly the kind of argument I hope to spark. I’m sure I’ve missed huge swaths of landscape, but I’m writing this in a rush (have a meeting in five minutes!) and wanted to get the engine started, so to speak.
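For readers who think in code, here is one rough way the data layer could be modeled. It is only a sketch under stated assumptions, not the map's actual design: the names `DataCategory`, `Player`, and `data_layer` are hypothetical, the categories are simply the six buckets listed above, and the single example entry is illustrative, based on the Google example mentioned earlier in this post.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataCategory(Enum):
    """The six initial buckets proposed in the post (open to debate)."""
    PURCHASE = "Purchase Data"
    SEARCH = "Search Data"
    SOCIAL_GRAPH = "Social Graph Data"
    INTEREST = "Interest Data"
    LOCATION = "Location Data"
    CONTENT = "Content Data"


@dataclass
class Player:
    """A company on the map (hypothetical structure)."""
    name: str
    assets: set = field(default_factory=set)       # data the player already holds
    aspirations: set = field(default_factory=set)  # data the player covets


def data_layer(players):
    """Group players by category: who holds each kind of data, who covets it."""
    layer = {category: {"holds": [], "covets": []} for category in DataCategory}
    for player in players:
        for category in player.assets:
            layer[category]["holds"].append(player.name)
        for category in player.aspirations:
            layer[category]["covets"].append(player.name)
    return layer


# Illustrative entry only: search as a held asset, with social, local,
# and purchase data as aspirations, following the example in the post.
google = Player(
    name="Google",
    assets={DataCategory.SEARCH},
    aspirations={
        DataCategory.SOCIAL_GRAPH,
        DataCategory.LOCATION,
        DataCategory.PURCHASE,
    },
)

layer = data_layer([google])
```

Splitting identity out of social, as debated above, would then amount to adding one more member to `DataCategory`; nothing else in the sketch would need to change.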

I’m gathering a small group of industry folks at my home in the next week to further this debate, but I most certainly want to invite my closest collaborators, the readers here at Searchblog, to help us out as we build the next version of the map. Which, by the way, will be open sourced and ready for hacking …

So please dive into the comments and tell me, what are the key categories of data that companies are looking to control?


  • Web 2.0 was all about making the web a bidirectional engagement medium, and all this engagement resulted in an incredible accumulation of data from users and about users, and as John points out, we now need to classify and organize all this data to make use of it more easily.

    In addition to all this consumer-centric data, let’s not forget that there are lots of additional classes of data that have moved onto the web – for example, government data, public records, prices – data that was previously locked in databases and file systems behind firewalls. I expect that being able to access and normalize these additional classes of data will be a key ingredient for additional insights by correlating it with all this not-previously-available consumer-centric data that John is discussing in this blog post.

    Unexpected correlations between disparate data sets lead to unexpected insights!

    Timo Kissel
    Fetch Technologies

  • I think that rights over data are a massive area that is only just starting to be looked at. It goes beyond the high-level concepts of copyright, EULAs, public domain, etc. There are issues about what a particular datapoint can be used for. What right does one have to infer properties of a country, for example (via the multitude of satellite data), and then make that public, with the possible effects on trade outputs for that country? It ties in with the sourcing of a datapoint and the attribution of that datapoint. At the risk of sounding boring, metadata about data is an absolutely vital layer, and one that I’m sure a number of players would like to gain control over.