Digging The Madness of Crowds

Earlier today, O’Reilly found itself at the center of a controversy on the popular news site, digg.com. Steve Mallett, O’Reilly Network editor and blogger, was very publicly accused, via a Digg story, of stealing Digg’s CSS pages. The story was voted up rapidly and made the homepage, acquiring thousands of diggs (thumbs-up) from the Digg community along the way. There was only one problem: Steve didn’t steal Digg’s CSS pages.

The real story is that Steve’s iTunesLove.com and LinuxFilter sites are built on Pligg, an open source project that recreates the user, story, and voting backends behind Digg. Pligg in turn is based on a Spanish Digg clone, Menéame, and Menéame is where the copying originally took place. Pligg copied Digg’s CSS files, so Steve’s sites had them too. Steve had assumed the open source code didn’t violate copyrights, as we all do, and was surprised to learn otherwise. Things were muddied because Steve had been automatically [update: Steve says there were no bots involved] submitting stories from his other sites to Digg (because a Digg front-page story gets a lot of traffic), which lent credence to the claim of “spammer” made by the poster of the “Steve’s stealing Digg’s CSS” post. The main claim of stealing CSS was superficially true, but substantially false.

In the meantime, of course, there’s the small matter of hundreds of thousands of readers and thousands of active voters voting up the article about how “O’Reilly writer Steve Mallett” is a thief and a spammer. Only if you took the time to read through the hundreds of comments did you get to the intrepid readers who tracked the copying back through Pligg (kudos to Digg reader caldroun, who was the first to identify Pligg). But it was obvious from the rapidly increasing Digg count that nobody was doing research (or even reading to see whether the claim had been refuted); they were simply indicating their condemnation of someone who had transgressed against the Digg community. The anonymous and quite pointed (“negative, but apparently true”, as one person put it) article was designed to raise maximum ire in the minimum of words.

This is a classic Web 2.0 problem: it’s hard to aggregate the wisdom of the crowd without aggregating their madness as well. In this case, the situation was amplified because it wasn’t just any site that Steve was accused of ripping off; it was the very site that the community belonged to and identified with. Every news site figures out what to do when thumbs-up turns to bums-up: Slashdot has issued retractions, often updates stories, and regularly posts collections of “further details on …” notes. BoingBoing updates stories as soon as new facts come to hand, even if it means admitting “whoops, that wasn’t true at all!” It’s more complex with community sites, because editors don’t make the editorial decision to run a faulty story but nonetheless have to live with its consequences. And everyone has to deal with the situation when their site has been used to further someone else’s agenda. Digg is still learning how to deal with this, and I look forward to seeing how they tackle it in the future.