Dec 14

Andy Oram

Reputation: where the personal and the participatory meet up (installment 2 of 4)

(Please read installment 1 before this installment. Several of the comments on the first installment are directly relevant to upcoming material in later installments.)

Accessibility: the problems

Information equity is certainly a major problem today. (One audience member responded to Hoffman by suggesting, "Nobody should know more about you than you know about them.") To digress for a moment, this is one of the outrageous aspects of a recent court ruling that email users have no reasonable expectation of privacy. This apparently overrules an opinion issued four months earlier by the court.

In addition to the damage done to civil rights by this ruling, it is supremely cynical because it doesn't apply to you or me. According to the court, you have no right to hide your email from me. But I can't act on that. It's a doctrine that, conveniently, only governments (and ISPs) can benefit from.

At the conference, by and large, everybody agreed that your data should be available to you and that the heuristics used to generate reputation should be open. But participants pointed out that search engines are the only really robust reputation systems available, and proposed that they work only because they keep their heuristics secret.

Can we ever design a transparent system that resists fraud and gaming? Ashish Goel, who does Operations Research at Stanford University, says no: "It's an intractable problem to detect collusion that inflates reputation." Yet he still supports transparent reputation systems.

Darko Kirovski of Microsoft Research further pointed out that a reputation system can't predict fraud because fraud is a sudden shift in behavior: fraudsters behave honorably up to the moment when they strike their victim.

Vipul Ved Prakash described Vipul's Razor, a distributed spam-blocking system that has proven to be popular, effective, and resistant to attack. It works because everybody online can identify unsolicited bulk email, and because they mostly agree on what's spam and what's not. People simply mark mail as spam when they receive it, and when a critical mass builds up identifying a particular email as spam, other participating systems delete it.

Prakash created a reputation community using a classic technique of seeding it with trusted people. It's very hard to bootstrap a stable and trustworthy reputation online without such seeding.

New people who consistently rate email like the trusted community get added to that community. On the other hand, anyone who rates an email message differently from the trusted community gets downgraded severely. Over time, an extremely reliable set of trusted people who act very quickly to flag spam builds up. Spammers who try to break into the trusted group face a high barrier to entry (it requires many accurate ratings) and are dumped quickly when they stop rating spam correctly.
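To make the mechanics concrete, here is a minimal Python sketch of a trust-weighted scheme along the lines just described. It is not Vipul's Razor's actual implementation: the class name, the threshold, and the reward and penalty values are all assumptions chosen for illustration. It shows only the three ideas from the description: seeding with trusted raters, slow promotion of newcomers who agree with the consensus, and a severe downgrade for anyone who disagrees.

```python
from dataclasses import dataclass, field

# All three constants are hypothetical values picked for illustration.
SPAM_THRESHOLD = 3.0  # "critical mass" of trust-weighted spam reports
REWARD = 0.1          # small trust gain for agreeing with the consensus
PENALTY = 0.5         # fraction of trust lost for disagreeing


@dataclass
class ReputationPool:
    # Maps a rater's name to a trust value between 0.0 and 1.0.
    trust: dict = field(default_factory=dict)

    def seed(self, raters):
        """Bootstrap the community with people who are trusted from the start."""
        for rater in raters:
            self.trust[rater] = 1.0

    def score(self, reports):
        """Add up the trust of every rater who flagged the message as spam."""
        return sum(self.trust.get(rater, 0.0)
                   for rater, says_spam in reports.items() if says_spam)

    def settle(self, reports):
        """Decide the verdict, then adjust each rater's trust against it."""
        verdict = self.score(reports) >= SPAM_THRESHOLD
        for rater, says_spam in reports.items():
            current = self.trust.get(rater, 0.0)
            if says_spam == verdict:
                # Agreement earns trust slowly, so entry takes many ratings.
                self.trust[rater] = min(1.0, current + REWARD)
            else:
                # Disagreement is punished severely and immediately.
                self.trust[rater] = current * (1.0 - PENALTY)
        return verdict
```

Because trust grows by small additive steps but shrinks multiplicatively, a newcomer needs many correct ratings before their reports count for much, while a single wrong rating halves whatever standing they had built up. That asymmetry is the sketch's stand-in for the high barrier to entry and quick ejection described above.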

In general, panelists argued that computational systems are unlikely to create better ratings than human beings, and human beings are notoriously inconsistent in their ratings. But as Goel says, computational systems can aggregate human ratings to facilitate their perusal and application.

Changeability: the problems

It's obvious that people, hotels, web sites, etc. change over time and need to be re-examined. And those viewing the information also change, so the value of information degrades over time even if a rating is still correct. But even Hoffman's Rapleaf doesn't let you change a comment after you post it. (You can, however, add new comments to adjust your rating.)

Changing information can be hard. For example, public-key certificate systems include revocation protocols, but they're rarely used. Like any distributed information, certificates are resistant to attack by antibodies once they enter the Internet's blood stream.

There is also a social dimension to changing information. Who says what's right and wrong? Just because a professor doesn't like your assessment on doesn't mean she has a right to remove it. Jonathan Zittrain (of Oxford University and Harvard Law School's Berkman Center) pointed out that the Berkman Center's StopBadware site (used by Google to warn people away from sites infected by spyware) is a reputation engine of a sort. It's obviously one that many people would like to eliminate--not only sites being accused of infection, but the spammers and others who broke into those sites in the first place.

Debates that sprang up in the 1980s (or even earlier) about privacy--privacy versus free speech, opt-in versus opt-out--have returned as overgrown brambles now that reputation has become an issue.

Nobody at the symposium offered a great solution to the balance between privacy and free speech, which has to be rejudged repeatedly in different contexts. Rebecca Tushnet of Georgetown University Law Center pointed out the disparity between provisions for copyright holders and provisions for others who claim unfair behavior on the part of online sites. The safe-harbor provision of the DMCA requires ISPs to take down content immediately when someone claims copyright over it (and the person who put up the content rarely succeeds in getting it restored). But a well-known provision upheld as part of the Communications Decency Act (USC Title 47, Section 230) exempts ISPs from being liable for content posted by users.

So you're much better off claiming copyright on something than trying to get an ISP to take down a defamatory or threatening post. Tushnet would modify both laws to move them somewhere in between these extremes.

On the other hand, we don't always have to assume opposing and irreconcilable interests. Zittrain suggested that a lot of Internet users would respect a request to refrain from propagating material. He envisions a protocol by which someone says, "I am posting a picture of myself in drunken abandon to amuse my friends on Facebook, but please don't publish it in a news article." More generally, a lot of people enamored of the mash-up culture grab anything amusing or intriguing to incorporate into their work, but would be willing to leave something alone if they could tell the originator wanted them to.

Zittrain pointed to robots files and the Creative Commons as examples of voluntary respect for the rights of authors. He also said that the private ownership of social networking and blogging sites--and the consequent ability to enforce terms of service--can be used for good or ill, and that in this case some protocols for marking content and policies for enforcing them could be beneficial to user privacy.
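Robots files show how little machinery such a voluntary protocol needs. The Python sketch below uses the standard library's urllib.robotparser to check a request against a site's stated wishes. The robots.txt content, the bot name, and the example.com URLs are made up for illustration, and robots files govern crawling rather than republication, so treat this as an analogy for the kind of marking Zittrain has in mind, not an implementation of it.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that a site owner might publish.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawler_may_fetch(url, agent="ExampleMashupBot"):
    """Return True if the site's robots.txt permits this agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)
```

Nothing forces a crawler to call a check like this. The protocol works only because well-behaved software asks the question and honors the answer, which is the point of the comparison.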

Hoffman pointed out that privacy advocates lobby for opt-in systems, because few users care enough about privacy to opt out of data collection. ("If consumers are responsible for protecting their privacy, there is no privacy.")

The latter point was underlined by a fascinating research study presented by Alessandro Acquisti of Carnegie Mellon (Information Technology and Public Policy). When survey takers were presented with a detailed privacy policy assuring their confidentiality, they were far less likely to volunteer sensitive personal information than when they were given the survey with weak confidentiality guarantees or no guarantees at all. In other words, people didn't think about the safety of providing personal information until Acquisti's researchers forced them to confront it.

Several panelists, including Mozelle Thompson, a former commissioner on the Federal Trade Commission and an advisor to Facebook, confirmed that consumers need to be protected by privacy laws, just as they need seat-belt laws. When Thompson was on the FTC, it asked Congress to pass comprehensive privacy legislation, but of course they didn't. Even the European countries, known for their strong privacy directives and laws, "put themselves in a box" according to Thompson, because they focused on individuals' self-determination.

So an opt-in world is necessary to protect privacy, but Hoffman pointed out that opt-out is required to develop most useful databases of personal information. If search engines depended on opt-in, we wouldn't be able to search for much of value.

Nevertheless, our current opt-out regime is leading to such heights of data collection--and eventual abuse--that Hoffman believes a reaction is imminent. Either government regulation or a strong consumer movement will challenge opt-out, and we need to offer a well-thought-out combination of regulation and corporate good behavior in order to avoid a flip to a poorer opt-in world.

Comments: 2

thacker [12.14.07 09:21 PM]

Slightly off-topic but:

[...]this is one of the outrageous aspects of a recent court ruling that email users have no reasonable expectation of privacy.
So you're much better off claiming copyright on something than trying to get [...]

Place a copyright notice on e-Mail.


Very good and thought provoking series of installments, so far. Am anxiously awaiting the remainder and what conclusions and solutions you, hopefully, present.

Thank you.

Thomas Lord [12.15.07 12:20 AM]

Thacker --

Am anxiously awaiting [...] solutions

Pull the rug out from brokered advertising as a useful way to monetize web services. That's the keystone, for now. (Then you have to privatize application protocols.)

