Search Engine Spam?

I just read Phil Ringnalda’s comments claiming search engine spam by advertisers on O’Reilly sites. This was a bit of a shock to me. Since then, I’ve spent a bunch of time talking to people about Phil’s complaints, looking into what we’re doing, and thinking about what we should do. It’s clearly a complicated issue, and my opinion has changed a couple of times as I’ve gotten more information. Apologies for the length of this entry. I’m still in fact-finding mode, but wanted to share my process rather than waiting until I have a complete solution.

The facts as I have discovered them:

  • A number of O’Reilly Network sites, including www.oreillynet.com, xml.com, perl.com, and others (but not the oreilly.com corporate site) have been running a set of text ads for hotel search sites, in a block entitled “Travelling to a Tech Show?” in the lower left column of the nav bar. In addition, there is a small separate block just below that, labeled “site supported by…” or “Sponsored Links” containing links to one or two other sites, with ads such as Computer Community, or, less obviously targeted to the O’Reilly audience, Mortgage Refinancing or Health Insurance.

  • These appear to be legitimate ads, albeit not specifically targeted to the O’Reilly tech audience. There are no links to porn, drugs, gambling, or scams. In fact, as our ad manager noted, “some of those hotel prices look pretty good.” The ads do in fact point to sites that provide the advertised service. I found one exception in clicking through the links: a site labeled Web Directory that on first click appeared to be a directory, but on a second click down into any category simply contained ads for a book on search engine optimization. That one I’m clear about: it’s a deceptive ad, and needs to come off the site right away. Another so-called Web Directory is indeed a directory, but the only content when you get to the bottom of each category is a set of Google AdSense advertisements for the category. (Question: if Google is opposed to this type of site, as many of those commenting on the issue claim, why is Google providing these ads?)

  • Many of the text ads on our sites are placed by a company called 3Genius, but many others come from individual advertisers via our normal ad sales process. Our ad team apparently restricted the range of possible links to travel sites (which seemed plausibly relevant) and a couple of other areas. Affiliate sites such as Osdir.com, servlets.com, and linuxquestions.org, which are O’Reilly branded but which we do not own, have less restrictive policies, which is why you will see ads for Cuban cigars or Jack Daniels on osdir.com.

  • Phil refers to the WordPress case discussed by Andy Baio back in March. WordPress was hiding links to advertiser sites on the WordPress site in order to drive up advertisers’ PageRank without that being apparent to anyone. What we’ve been doing is different in the significant respect that the links we sell for advertisers are clearly visible on our site, with link keywords that match the content of the destination. There may be cases on other sites where hidden link farms are being used solely to game the search market, but on O’Reilly sites, these are all visible links, just like any other paid text ads.

  • That being said, it’s become clear to me on investigation that these folks are indeed paying us for our Google rank, and not just for click-throughs. We just aren’t targeted enough for their ads to be justified on a click-through basis. What’s more, using Google’s link: keyword to check for top links to these particular advertisers shows that the O’Reilly sites they advertise on are among their chief link sources. They aren’t getting independent links from users. In short, these advertisers are using O’Reilly and other highly ranked sites that accept their advertising to improve their chances of being discovered via search engines, rather than in quest of direct click-throughs (although those may also provide some value for their ad buy).

  • Google has an authorized way for people to show up arbitrarily high on searches: paying for relevant AdWords. However, nearly all of the terms used in these links are quite expensive. So advertising on a site with a high PageRank instead of via Google AdWords is a way of arbitraging the relative cost of advertising in the two places. It has a downside, though, in terms of the search engine user experience: the ad shows up as a sponsored link on the originating site, but as an ordinary, unsponsored result in the search engine.
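To make the arbitrage concrete, here is a minimal sketch of the simplified PageRank calculation as it was originally published by Brin and Page. It is only an illustration: the ranking Google actually runs is not public and surely differs, and the page names and link graph below are invented for the example. It compares an advertiser's rank with no inbound links, with a link from a little-linked blog, and with a link from a heavily linked publisher.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping page -> rank; ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: three blogs link to a publisher; the hotel site
# starts with no inbound links at all.
base = {
    "blog_a": ["publisher"],
    "blog_b": ["publisher"],
    "blog_c": ["publisher"],
    "publisher": ["blog_a"],
    "hotel_site": [],
}
# Same graph, but the hotel site buys a text link on one small blog.
small_ad = dict(base, blog_b=["publisher", "hotel_site"])
# Same graph, but the hotel site buys a text link on the publisher.
publisher_ad = dict(base, publisher=["blog_a", "hotel_site"])

no_link = pagerank(base)["hotel_site"]
from_blog = pagerank(small_ad)["hotel_site"]
from_publisher = pagerank(publisher_ad)["hotel_site"]

print(f"no inbound link:     {no_link:.3f}")
print(f"link from small blog: {from_blog:.3f}")
print(f"link from publisher:  {from_publisher:.3f}")
```

In this toy model a single link from the heavily linked page lifts the advertiser far more than a link from a page nobody links to, which is why a text ad on a highly ranked site can be worth buying even if almost nobody clicks on it.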

So there’s the heart of the question: is it appropriate for a site to monetize its PageRank as well as its page impressions?

It’s pretty clear that the practice of “cloaking” (that is, hiding links so that you’re selling only the PageRank) is illegitimate. But what if someone pays you for a real ad, even if you know that they are paying you primarily because of your PageRank rather than your targeted audience? As long as there’s no deception as to the nature of the sponsored link, and a legitimate opportunity for click-through, isn’t this still an ad?

That leads to a whole nest of hard questions: Where are the boundaries between legitimate “search engine optimization” to help people find stuff that they will appreciate, and “search engine gaming” to inflate the rank of sites that are less useful? Whose responsibility is it to solve this problem? Should web sites turn away advertisers just because they are performing arbitrage on Google and other search engines? Or is it the search engines’ responsibility to adjust their heuristics to counteract any attempts to game the system? Or both? Is it legitimate for a site to improve its own user experience by hosting small, well-paid and relatively unobtrusive text ads rather than the large banners and popups demanded by many advertisers, if those ads lead to a worse user experience on search engines?

Long term, I’m pretty sure that supporting people who game search engines is not a good thing. The result will be that search engines are less able to reach their promise as an expression of the collective intelligence of the net. However, I’m not (yet) convinced that this is an open and shut case with regard to the ads that appear on our sites.

First off: consider the terms that are being bought: Rome Hotels, Phuket Hotels, Jack Daniels, Cuban Cigars. Not terribly relevant to programmers, but certainly not completely irrelevant. What’s more, if you’re searching for any of these things on Google or Yahoo!, you may not get the same results that you’d get if there were no advertisers trying to improve their search standings, but you will in fact still get meaningful links. (jackdaniels.com is still the top link for Jack Daniels, for instance.) It would also seem relatively easy for Google or Yahoo! to adjust their algorithms to demote sites that merely appear to be travel brokers instead of actual travel destinations if they think their user experience is being damaged.

Second, advertising in general is designed to get people to pay attention to things that they might not otherwise notice. Sometimes ads are effective, and sometimes they aren’t. But we have to recognize that most forms of advertising, and not just this one, detract from the user experience. They are accepted by most people as a necessary evil because most of us recognize that developing content costs money, and we accept advertising in exchange for free content.

I do recognize that Google’s preferred form of advertising, context-relevant ads via AdWords, is a real advance in making ads useful and targeted. However, at least so far, our experience has been that AdWords revenue will not even remotely make up for the other forms of advertising we carry on our sites. So our alternatives are to: a) convert the sites from advertising to subscription, b) continue to support them via advertising, or c) shut them down.

Simply put, we pay O’Reilly Network contributors for content, and we pay our staff to develop and maintain the sites. The money to pay those people comes from advertisers. Readers get the content for free, and advertisers pay for the chance to get those readers’ attention. It’s expensive to create a quality website with original technology content; many O’Reilly Network competitors have gone by the wayside in the past few years. I can assure you that we’re not merely “a publishing empire trying to bring in a few more bucks,” as one person commenting on Phil’s blog claimed. Offering ad-supported content is not a hugely profitable business, and we’re just as much “someone literally trying to pay a bill” as the small guys to whom Phil’s commenter gave a free pass on this issue.

In business and life, however, things are rarely simple, as Phil notes in his comments on “violent ambiguity.” Net-net: I’m uncomfortable with these ads, and have tasked my team with coming up with an alternative as soon as possible. These ads are running under a long-term contract, and we’ll think hard before renewing it. We’ll also ask 3Genius to remove the links to the overtly deceptive ad that I discovered. However, if we were to shut off this type of advertising today, we’d also have to shutter many of the O’Reilly Network sites.

P.S. These ads have been running on O’Reilly Network sites for more than two years. They have not been recently added, as Phil claims. I don’t know whether it’s a good thing or a bad thing that we were among the first targets for search engine optimizers rather than recent joiners of the parade :-) I do know that now that it’s become clear that this type of ad is a long-term problem for the health of the net, we’ve got to find a way to wean ourselves from it.