SEO: Black vs. White

There is a lot of debate within the search community over what counts as black vs. white search engine optimization. Where does content optimization end and webspam begin? Tim Converse of Yahoo! Search has a good guide to the varying levels of light and dark when it comes to SEO, and Google Blogoscoped has done a good job of turning Tim’s lengthier descriptions into smaller chunks. Here I’ve listed some of the more interesting descriptions next to each other, with Tim’s longer version first and the Blogoscoped summary second.

Dark gray: The SEO collects (aka steals) random text from other sites, and uses it to create thousands (or millions) of pages targeting particular queries. The pages have nothing original of value, but do have ads.
Dark Gray Hat SEO: This SEO is e.g. a splogger stealing content from other sites. (What, that’s better than charcoal?)

Light gray: The SEO creates “original” content in bulk the old-fashioned way, thinking first of all of search engine rules, secondly of duplicate detection algorithms, and lastly of whether the text makes sense to human beings and is something anyone would ever want to read. Then the SEO experiments with all the parameters (keyword density, internal linkage) trying to move up for the queries of interest.
Light Gray Hat SEO: This SEO creates original content (lots of it), but the content is still only aimed at search engines.

Luminescent pearly white: This would be a case where the SEO designs a site to show up for relevant queries and not to show up for irrelevant queries. Do luminescent SEOs exist? Well, Jon Udell is one anyway.
Luminescent Pearly White Hat SEO: Not only does this SEO do everything the White Hat SEO does, the LPW Hat SEO also makes sure pages will not show up for irrelevant queries.

  • There is absolutely no way to “make sure pages will not show up for irrelevant queries.” It’s impossible. The words in a page can be combined in an amazing variety of ways, many of which have nothing to do with the focus of the page, which IR researchers call its “aboutness.”

    Back in the day, librarians, indexers and abstracters worked hard at crafting really fabulous descriptors for documents. Even they didn’t agree on what each document was about, and there is plenty of research on that. And even back then, we got false drops, because language and vocabulary are so nebulous and so dependent on context and personal preferences.

    It’s even worse as language changes: ATMs, AIDS, wired and cell have all taken on new meanings, but there’s no way to tell from the words alone which meaning is intended. An old page may use the old meaning while searchers use the new one, so although the words match, the page is irrelevant.
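
The false-drop problem described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the two pages and the query are invented, and the `tokens` and `matches` helpers are hypothetical stand-ins for plain keyword matching, not how any real search engine ranks pages.

```python
def tokens(text):
    """Lowercase a string and split it into a set of bare words."""
    return {word.strip(".,;:!?\"'()").lower() for word in text.split()}


def matches(query, page):
    """Return True when every query word appears somewhere in the page."""
    return tokens(query) <= tokens(page)


# Hypothetical pages, invented for this illustration.
pages = {
    "phone": "Compare cell phone plans and check coverage in your area.",
    "biology": "Each cell in the sample was stained; plans for coverage "
               "of the full slide follow.",
}

# A searcher looking for mobile phone plans.
query = "cell plans coverage"

# Both pages contain all three words, so both are returned, even though
# the biology page has nothing to do with what the searcher wants.
hits = [name for name, text in pages.items() if matches(query, text)]
print(hits)
```

Word matching alone cannot separate the two pages, because the author of the biology page had no way to anticipate which other senses of "cell" a searcher might have in mind.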