Homophily in Social Software

The Washington Post has a brief article called “Why Everyone You Know Thinks The Same As You”. In short, you hang out with people who are like you, a phenomenon known as homophily. This happens online, and indeed the Internet can lower the costs of finding people like you. But homophily raises a question for social software designers: how much should they encourage it, and how much should they mix it up?

Consider sites like Findory, where machine learning techniques present news to you based on news you’ve said you liked. It’s often been asked whether this filtering just encourages people to see the news that supports their prejudices and never see news that counters them. Indeed, people were saying this about Usenet killfiles in the 80s and 90s. As social software and recommendation engines become part of the fabric of Web 2.0, the issues of homophily become important.

Designers first need to decide whether homophily is a feature or a bug. Life is easy when you’re unchallenged: this is why people read the New York Times or watch Fox News or even just watch the 5pm news (the one with the deaths taken out) instead of the 7pm (the one that’s all death). Do you accept that your audience wants to be around people like them and that your job is to make that as easy as possible? NYT and Fox News show that it can certainly be a path to financial success.

If you don’t buy into homophily completely, what can you do? Recommendations increase your pool of interest in very short steps. To break homophily, recommend something for reasons other than “this meshes very tightly with your profile”. This seems heretical at first: the whole logic behind recommendations is to guess at items the user will probably like. But it has to happen. For you to identify their complete region of interests, you necessarily have to show them things in and out of that region. If you prematurely narrow in, you’ll end up only showing them stories about melting Antarctic ice shelves without connecting to the rest of environmental, travel, or scientific stories that they’re really interested in. The best way to make those connections is to mix it up.

Doing this creates serendipity: pleasantly surprising the user. For example, don’t show just the top 10 most similar items in your recommendations list, but show the eight most similar and two from the mid-range. Or call the “less relevant but also likely to be interesting” results out like you’re advertising them: put a heading like “Take a walk on the wild side” or “Break out” on top and act like it’s a feature you’re offering, not a bug you’re fixing.
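Here is a rough sketch of the "eight plus two" idea in Python. It assumes you already have a list of candidate items sorted from most to least similar. The function name `mixed_recommendations` and its parameters are made up for illustration, and taking the two extras from the middle of the leftover ranking is only one possible reading of "mid-range".

```python
def mixed_recommendations(ranked_items, n_top=8, n_mid=2):
    """Return the n_top most similar items plus n_mid items from the mid-range.

    ranked_items is assumed to be sorted most similar first.
    All names here are hypothetical, chosen for this sketch.
    """
    top = ranked_items[:n_top]
    rest = ranked_items[n_top:]
    if not rest or n_mid <= 0:
        # Nothing left to mix in: fall back to plain top-N.
        return top
    # Take a small window around the middle of the remaining ranking.
    mid_start = max(0, len(rest) // 2 - n_mid // 2)
    mid = rest[mid_start:mid_start + n_mid]
    return top + mid


if __name__ == "__main__":
    stories = [f"story-{i}" for i in range(100)]
    for story in mixed_recommendations(stories):
        print(story)
```

You could just as easily pick the two extras at random from the mid-range so the list changes between visits; the fixed window here just keeps the sketch easy to follow.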

Breaking out of the tight circle of self-similar recommendations is a feature. I tried pandora.com and listened to all the bluegrass music that was like the music I like, until Pandora had no more that I wanted to listen to. It was briefly a bit like a poker machine–I spent another fifteen minutes trying to find new music before I finally realized there was no jackpot to be had and left. Pandora never said “look, I’m out of music to recommend to you–perhaps you’d like to head off on these related jags?” Don’t make your software an exhaustible pool of narrow recommendations.

Another strategy that works is to take a leaf from Malcolm Gladwell’s “Tipping Point” and find the connectors–people who join homophilic clusters. This is a feature a bit like “people who liked this story also liked” but it specifically eliminates your personal history and preferences–the point is to use the current object (person/photo/music genre/news story/…) as a gateway out of your shallow meme pool. Pandora could say “the genres most liked by people who like this genre are”, for example.
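To make the "genres most liked by people who like this genre" idea concrete, here is a minimal sketch in Python. It assumes you have a mapping from each user to the set of genres they like; that data shape and the name `related_genres` are my inventions for the example, not anything Pandora actually does. Note that it never looks at the current user's own history: only the genre in front of them matters.

```python
from collections import Counter


def related_genres(likes, genre, k=3):
    """Return up to k genres most often liked by people who like `genre`.

    likes is assumed to map a user name to a set of genre names.
    The caller's own profile is deliberately not an input.
    """
    counts = Counter()
    for user_genres in likes.values():
        if genre in user_genres:
            # Count every other genre this fan also likes.
            counts.update(g for g in user_genres if g != genre)
    return [g for g, _ in counts.most_common(k)]


if __name__ == "__main__":
    likes = {
        "ann": {"bluegrass", "old-time", "folk"},
        "bob": {"bluegrass", "old-time"},
        "cy": {"bluegrass", "folk", "old-time"},
        "dee": {"bluegrass", "old-time", "jazz"},
        "eve": {"jazz", "techno"},
    }
    print(related_genres(likes, "bluegrass"))
```

Raw counts like these will favor genres that are popular with everybody, so a real site would probably want to normalize by overall popularity before calling something a gateway.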

With social software, I think there’s a lot of room to exploit profiles. Think of social software as software that connects people through activities. The activities are necessarily around some common shared interest but they function as walls around those interests. Let people build out profiles to express the full range of their interests and to start conversations about other interests. Look at MySpace profiles for examples of how keen people are to shove every facet of their life into a text field, preferably with blinking orange text and accompanying 50 Cent soundtrack.

The methods I gave earlier for advertising interests (“Did you know … Joshua Schachter also competitively whittles Persian cats”) let you learn more about the people you already know. That’s an important difference from the prototypical serendipitous recommendation: “you and X should become friends!” As Liz Goodman pointed out, we’re grown-ups and have lots of friends already. What else can the software do for us besides making it even harder to keep up with all the people we know? People are conversational animals: give them more things to talk about.

Another way to build in serendipity is to have pivotal navigation: tags, top ten lists, and Flickr’s interestingness measure are all ways to break people out of whatever group they’re in and take them to something new. Links are at the heart of this: we’ve all been lost in clicking our way through a drunkard’s walk of the Internet at one point or another. Inspire that in people: build those links and the metadata behind them into your site from the get-go.

Your challenge for this week: spot the social software features of a site you use that encourage homophily, and figure out two ways to break that homophily. Post your suggestions in the comments and on Friday I’ll send a free book from our new releases list to what I think is the best idea (O’Reilly books only, sorry–my astounding freebie powers bounce off the kryptonite walls of Paraglyph, the Prags, and Syngress).