Data is the real business model for social

IBM's Marie Wallace on the unrealized potential of social data.

As social media websites gather ever-growing data stores, they might be better served by finding ways to make profitable use of that data instead of serving ads as their chief means of raising revenue. While the data might give them the information they need to serve more targeted ads (although in my experience they still have a ways to go with that), the real value in the site could be the data itself.

Of course, if social sites start selling data to the highest bidder, that leaves open questions about data ownership, privacy, and how to strip out personal identifiers.

Marie Wallace (@marie_wallace) is social analytics strategist for the IBM Collaboration Solutions division. She has spent more than a decade at IBM working on content analytics, and her experience uniquely positions her to address questions regarding big data, social media and analytics. Our interview follows.

Social media’s real value might not be in selling ads, but in the data the sites are collecting. Why do you think that is?

Marie Wallace: The reason ad targeting has worked so well for search is that it’s aligned with and supportive of that particular activity; when I am searching for information about products or services, I am happy to get ads that may help direct my search. Ads are somewhat analogous to a value-added service, and social search makes the ads more personalized and relevant, which is why Google has invested so heavily in Google+.

The key is that in most cases ads only work in a search-like context; however, people do not go to most social media sites to search. They are going to converse with friends and family, which makes ads interruptive and frequently invasive. This is further exacerbated by mobile, where limited real estate makes ads even more offensive as they are distracting and clutter the screen. Social search is one example of a service that sits on top of social data, but there is a whole plethora of other services that social data can drive: market research, consumer/brand engagement, social recommenders, information filtering, and expertise location.

It’s one thing to recognize the value of data, but how do you extract that value?

Marie Wallace: Extracting value from data requires a well-described set of scenarios with a clear understanding of what facts would be considered valuable for those scenarios. For example, when looking for a job there is a very specific set of questions that people want asked and answered: employee sentiment, corporate success (revenue, customers, products, growth), location, demographics, technologies, industries, skills, competitors, values, culture.

These are very different to the questions (and hence analysis) that might be pertinent to a different scenario. For example, when deciding where to go on holiday, people are likely more interested in the location, activities, accommodation, weather, cost, demographics, or visitor sentiment. The key here is that analysis has to be not only domain-specific but scenario-specific, which is why targeted specialist services like LinkedIn or Tripadvisor are always going to be able to deliver greater analytics value for the specific scenarios they support.

There are concerns on social networks about the sites abusing the data users are contributing. Is there a reliable way to anonymize data and deliver it in aggregate form that strips out individual user information?

Marie Wallace: I think the issue of privacy is a more complex problem, and while anonymizing user information is part of the solution, I don’t believe it’s at the heart of the problem. I believe the key social media challenges moving forward will be those of permission, trust, and transparency. People need to know exactly how their data is being used so that they can give permission for that use and that use only. For example, if I have a Tesco loyalty card and I trust them to respect my data, then I might be happy for them to see my Facebook Likes so they can provide me with more relevant special offers. Or if I register on LinkedIn, I know that my data is going to be provided to recruiters and hiring companies, but I most definitely don’t want them to use it for any other undisclosed purpose.

It is also likely that in the future we will see information brokers emerge, which provide a level of indirection (perhaps even obfuscation or anonymization) and act as mediators on our behalf. This simplifies the authorization model, but does assume that we trust the information brokers and the models that they use for controlling access to our information.
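To make "permission for that use and that use only" concrete, here is a minimal sketch of a purpose-bound consent check. It is an illustration only: the `ConsentRegistry` class, its method names, and the user, holder, and purpose labels are all hypothetical and do not describe any real service's authorization model.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    """Hypothetical registry of purpose-specific data-use grants."""

    # Maps (user, data_holder) to the set of purposes the user has approved.
    grants: dict = field(default_factory=dict)

    def grant(self, user, holder, purpose):
        """Record that `user` allows `holder` to use their data for `purpose`."""
        self.grants.setdefault((user, holder), set()).add(purpose)

    def revoke(self, user, holder, purpose):
        """Withdraw a previously granted purpose, if present."""
        self.grants.get((user, holder), set()).discard(purpose)

    def is_allowed(self, user, holder, purpose):
        """A use is allowed only if that exact purpose was granted."""
        return purpose in self.grants.get((user, holder), set())


registry = ConsentRegistry()
registry.grant("alice", "retailer", "special_offers")

print(registry.is_allowed("alice", "retailer", "special_offers"))
print(registry.is_allowed("alice", "retailer", "resale_to_third_parties"))
```

The first check passes and the second fails, because nothing is permitted unless the user granted that specific purpose. An information broker, in this picture, would be a trusted party that holds such a registry and answers these checks on the user's behalf.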

Have the tools caught up with the amount and variety of data so that services like social networks can begin to manipulate the data they collect?

Marie Wallace: Having spent the last decade working on content analytics and semantic technologies, I can confidently say that many of the required tools have been around for years waiting for demand to catch up with supply. The advent of social media, alongside the growth of a new generation of big data platforms, now gives them the perfect business problem, dataset, and execution platform through which to shine. However, I believe the industry does have one significant gap in this otherwise rich landscape of technologies, and it’s a gap that I believe will impact the value that we can derive from these social networks.

It’s our handling of massive-scale networks that I believe is going to become a technological challenge as we move rapidly toward massive-scale graphs with social, semantic, temporal, and geospatial characteristics and as we look to apply complex analytics across these networks. There are a number of existing technologies from the linked data world that could morph to fill this gap, or alternatively there is a new generation of graph databases and analytics algorithms emerging focused on tackling this specialized problem. Only time will tell in terms of which technologies will emerge the winners.
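As a toy illustration of what a graph with social and temporal characteristics looks like, the sketch below stores a handful of typed, dated edges and counts recent incoming "follows" links per person. The names, relation types, and dates are invented for the example, and a plain Python list stands in for the graph databases mentioned above; it says nothing about how such analysis is done at massive scale.

```python
from collections import defaultdict
from datetime import date

# Each edge: (source, target, relation type, date observed). All hypothetical.
edges = [
    ("ann", "bob", "follows", date(2012, 1, 5)),
    ("cat", "bob", "follows", date(2012, 3, 9)),
    ("bob", "ann", "mentions", date(2012, 6, 1)),
    ("dan", "bob", "follows", date(2012, 8, 20)),
]


def in_degree(edges, relation, since):
    """Count incoming edges of one relation type observed on or after `since`."""
    counts = defaultdict(int)
    for source, target, rel, seen in edges:
        if rel == relation and seen >= since:
            counts[target] += 1
    return dict(counts)


followers = in_degree(edges, "follows", date(2012, 3, 1))
print(followers)  # {'bob': 2}
```

Even this simple query has to filter on both relation type and time. The difficulty described above comes from running far richer analytics than an in-degree count across graphs too large for a single scan like this one.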

What kinds of uses could you envision social sites finding for their data?

Marie Wallace: For the medium-term, I suspect that we will continue to see social analysis being driven by the marketing, sales, and support organizations. Social data will be used for market research, to help expand sales channels, and to improve how brands interact with customers.

As we move from marketing to sales to support, the type of analysis becomes more complex, and this will put pressure on the algorithms being used to evaluate the data and derive insights: identity and entity disambiguation, micro-segmentation, influence analysis, sentiment, intent, network information flow, and community dynamics. A growing number of social applications will emerge, each delivering niche value to consumers and generating specialist data for brands. This ecosystem of social networks will drive consumer-brand engagement: everything from consumer feedback systems and customer support to product and service innovation. Brands will move away from a focus on passive listening/monitoring to one of active engagement, and this will require a broader range of analytics in order to optimize and operationalize those interactions.

Further out, I see us expanding the personalization that can be realized. Social data will become increasingly important for personalizing every search and navigation experience, from Google, Amazon, and Netflix to Expedia; however, search is only the tip of the iceberg. I anticipate that in the longer term social data will be used to personalize a whole range of experiences that cross the physical/digital divide, transforming how we shop, what we think, how we learn, and ultimately how we live.

Just imagine what will happen when we intersect the social web and the semantic web with the web of data. Then we will really see personalization take on a whole new form!

This interview was edited and condensed.

