Tue, Aug 15, 2006

Tim O'Reilly

The System is Smarter than the Raters

I was talking the other day with Peter Norvig of Google. He tells an interesting story. Google hires temp workers to evaluate search results. Often when the raters and the system disagree, there's a bug, which Google fixes by tweaking the algorithms. But sometimes, the system is smarter than the raters. A good example: Glacier Bay. The human raters were convinced that Glacier Bay National Park was the right result. But a closer examination of what John Battelle calls "the database of intentions" showed that the system was indeed right: most users were looking for the Glacier Bay Faucet Company, which none of the raters had ever heard of. Google solved the problem by splitting the result page.

(Interestingly, Google also splits the results for "flicker" because of Flickr. Apparently these split results are shown for 2-3% of searches.)
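To make the idea concrete, here is a minimal Python sketch of how a split decision might be derived from query-log data. This is not Google's method, which the story does not describe. The function name `choose_layout`, the two thresholds, and the click counts are all hypothetical, chosen only for illustration. The sketch assumes we already know how many searchers for a term wanted each interpretation.

```python
from collections import Counter


def choose_layout(intent_counts, dominance=0.8, min_share=0.15):
    """Return the interpretations to show for a query, most searched first.

    intent_counts maps an interpretation to how many searchers wanted it.
    Both thresholds are illustrative assumptions, not known Google values.
    """
    total = sum(intent_counts.values())
    if total == 0:
        return []

    ranked = Counter(intent_counts).most_common()
    top_intent, top_count = ranked[0]

    # One interpretation clearly dominates: show a single result list.
    if top_count / total >= dominance:
        return [top_intent]

    # Otherwise split the page: keep the leader plus every other
    # interpretation that still attracts a meaningful share of searchers.
    others = [
        intent for intent, count in ranked[1:] if count / total >= min_share
    ]
    return [top_intent] + others


# Invented numbers: most searchers want the faucet company,
# but a large minority want the national park.
glacier_bay = {"national park": 380, "faucet company": 620}
print(choose_layout(glacier_bay))
```

With these invented counts neither meaning reaches the dominance threshold, so both are returned and the page would be split, with the faucet company listed first. The point of the sketch is that the decision rests on what searchers actually wanted, not on which meaning a rater finds most familiar.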


Comments: 5

  Nick Carr [08.15.06 06:44 AM]

The oddest (to me) split result page I've come across recently is for "chocolate," which splits between the confection and "chocolate phone."

But while the system may be smarter than the raters in some cases, the proliferation of copycat affiliate sites in the Google results for many common terms related to consumer products leads me to believe that the raters would be smarter than the system in other cases. People are pretty smart at sniffing out an attempt to game a system, while the system itself is often totally oblivious (hence the gaming incentive).

  Kevin Farnham [08.15.06 10:28 AM]

The "Glacier Bay" example points out the separation between the degree of recognition of varying meanings of a term and the degree to which people would enter that term into a search engine considering each of those meanings. People who have a problem involving Glacier Bay faucets, even though they are a small subset of the overall population, are much more likely to type "Glacier Bay" into a web search engine than is the general person who's heard of "Glacier Bay National Park." The person with the broken faucet has a critical problem to solve, unlike the potential vacationer.

In other words, it's not the most common meaning of a term that matters; it's which meaning is likely to induce the most people to go to their computer, go to a search engine site, and type the term into the search window. That's the meaning search engines need to isolate in order to be effective.

  Xavier Cazin [08.15.06 10:09 PM]

It seems that Google also takes the user's location into account before deciding on the split: when clicking on your Glacier Bay link from a computer located in France, no split occurs (let's assume that almost no one in France has heard of the Glacier Bay Faucet Company), although "flicker" does trigger the split.

  nick [08.16.06 12:23 PM]

Re: chocolate. The LG 'Chocolate' (or KG800 if you want to be boring) is just about the hottest mobile phone in Europe right now. I can see why 'the system' would want to split the page there. It'd be interesting if they started to split the ads too, though, so Cadbury's and Hershey's could go at the top and phone dealers under the split.

  Alicia [08.16.06 04:18 PM]

I don't think the system is "smarter" but rather, the system is geared towards pushing people towards products and commercial sites at the expense of information that is not necessarily commercial, or geared to a cash transaction. The "chocolate" search in particular seems indicative of that, and if this is the direction Google is headed, I think it will be _less_ useful to the average user.
