Indexing the social signal

Charlene Li on the problems and possibilities of social search and realtime updates.

The rise of social media and real-time updates has put search engines in a bind. A web crawl that used to take a week is now irrelevant if it takes an hour, and how are engines supposed to extract meaning from 140 characters’ worth of information?

Altimeter Group founder Charlene Li (@charleneli) says these considerable problems also carry significant opportunities. In the following interview, Li explains how search will have to rise to these new challenges.


What is it about the social space that presents the most challenges for search?

Charlene Li: First of all, there’s the real-time aspects of social. Indexing is extremely difficult when you have information coming in and people responding back and forth to the information. Indexing real-time information is one of the biggest challenges because people are creating so much more content now than they did in the past.

The second thing is understanding the meaning behind the information. With PageRank, the more links that came into a piece of content the more meaningful and important it was. That works in a static web, and it tends to lean toward things that have better longevity. When things are coming in real-time, how do you determine whether content is important or relevant to a particular search query? How do you understand the social signal and all of the metadata that surrounds it? There’s very little metadata associated with a 140-character tweet. You can take information about the tweet’s author — how many followers they have, how many times they’ve been tweeting — but we don’t have much more information until people retweet that tweet. The question becomes, how do you balance social signals against other signals, like links or word repetition on a page?

Does social media search require more semantic parsing?

Charlene Li: If you look at only the intelligence in the page or the metadata of that page, semantics in and of itself isn’t enough. People use words differently, depending on their background or how they’re using it in the context of a conversation, so you have to look at it in the stream of everything that is being done and said. In many ways, the real-time and social web provide so much more context that could be used as the semantics of a page or the semantics of a word.

I think search is moving into this very interesting stage where it’s a combination of semantic web, the social web, and context. That combination will lead to a much more relevant experience.

What kinds of metadata does the social space provide that aids searches?

Charlene Li: There’s a number of areas where this applies. For example, you could look at the velocity of how quickly a Twitter user’s follower count is growing. That user may have had only 500 followers last week, but this week they have 1,000. Chances are, that person is doing something really interesting, and so it may be worthwhile to pay attention to the velocity of that user’s followers or the velocity of that user’s retweets.
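
To make the velocity idea concrete, here is a minimal sketch of how follower growth might be measured from two snapshots. The `FollowerSnapshot` record and both function names are hypothetical, invented for illustration; nothing here reflects how any search engine or Twitter actually computes this.

```python
from dataclasses import dataclass


@dataclass
class FollowerSnapshot:
    """Hypothetical record: a follower count observed on a given day."""
    day: int        # days since an arbitrary starting point
    followers: int  # follower count observed on that day


def follower_velocity(earlier: FollowerSnapshot, later: FollowerSnapshot) -> float:
    """Average number of followers gained per day between two snapshots."""
    elapsed = later.day - earlier.day
    if elapsed <= 0:
        raise ValueError("later snapshot must come after the earlier one")
    return (later.followers - earlier.followers) / elapsed


def growth_ratio(earlier: FollowerSnapshot, later: FollowerSnapshot) -> float:
    """Relative growth: 2.0 means the follower count doubled."""
    if earlier.followers <= 0:
        raise ValueError("earlier snapshot needs a positive follower count")
    return later.followers / earlier.followers


# The example from the interview: 500 followers last week, 1,000 this week.
last_week = FollowerSnapshot(day=0, followers=500)
this_week = FollowerSnapshot(day=7, followers=1000)
print(follower_velocity(last_week, this_week))
print(growth_ratio(last_week, this_week))
```

For that example the velocity works out to 500 / 7, or roughly 71 new followers a day, and the growth ratio is 2.0. The ratio may be the more useful of the two, since gaining 500 followers means far more for a small account than for a large one.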

There’s also variation within the tweets a particular person sends. When I tweet about technology and business, I get a lot of people responding to it and retweeting. When I tweet that I’m cooking dinner for my family, I don’t get a lot of retweets. People read it, but there’s not a lot of influence in those things. So you may want to tag me as an expert on certain things and give me credit and better relevancy on certain topics so you can understand the semantics of what I do.
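
One rough way to picture this kind of topic tagging is to average retweets per topic for a single author. The sketch below assumes each tweet already carries a topic label from some upstream classifier that is not shown; the function names, the threshold, and the sample numbers are all made up for illustration.

```python
from collections import defaultdict


def retweets_per_topic(tweets):
    """Average retweets per tweet for each topic.

    `tweets` is an iterable of (topic, retweet_count) pairs. The topic labels
    are assumed to come from a separate classifier (hypothetical).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for topic, retweets in tweets:
        totals[topic] += retweets
        counts[topic] += 1
    return {topic: totals[topic] / counts[topic] for topic in totals}


def likely_expert_topics(tweets, min_average=10.0):
    """Topics whose average retweet count meets an arbitrary threshold."""
    averages = retweets_per_topic(tweets)
    return sorted(topic for topic, avg in averages.items() if avg >= min_average)


# Invented sample data for one author, not real measurements.
sample = [
    ("technology", 40),
    ("technology", 25),
    ("business", 30),
    ("cooking", 1),
    ("cooking", 0),
]
print(retweets_per_topic(sample))
print(likely_expert_topics(sample))
```

With this invented data, technology and business clear the threshold while cooking does not, which mirrors the distinction Li describes. A real system would presumably need many more tweets per topic and some normalization by audience size before drawing that conclusion.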

Will the increased searchability of the social space change behavior? For example, will people think twice about letting people friend them on Facebook?

Charlene Li: As we become more public in the things that we talk about, we actually become more private. And the logic behind that is: “I used to talk about anything I wanted with my close friends, but as my ‘friend circle’ has increased, I’m going to be a little bit more circumspect about what I actually say and share.”

I think people increasingly realize that what you say and do on Facebook is not private. And, in fact, anything that you say at any time, even to a small circle of people, could probably come back to haunt you. So what do people deem safe to share? Is it the interesting things in life or the inconsequential things?

Photo credit: Photo used under Creative Commons license from Josh Hallett

  • http://www.connectme360.com Brian Hayashi

    Some have said that the distinction between online and offline is disappearing. I don’t believe this is true.

    Realtime is actually splintering communication into the public sphere, where everyone is in everyone else’s business, and a continuum of private backchannels, which are used to complain, celebrate, make arrangements, etc.

    Facebook had been on the forefront of developing privacy controls that enabled people to share what they wanted in realtime to the appropriate audience. The wheels have officially come off, and now there’s an opportunity for someone else to come in and do a better job.