
Making Site Architecture Search-Friendly: Lessons From whitehouse.gov

Guest blogger Vanessa Fox is co-chair of the new O’Reilly conference Found: Search Acquisition and Architecture. Find more from Vanessa at ninebyblue.com and janeandrobot.com. Vanessa is also entrepreneur in residence at Ignition Partners, and Features Editor at Search Engine Land.

Yesterday, as President-elect Obama became President Obama, we geeky types filled the web with chatter about change. The change of change.gov becoming whitehouse.gov, that is. The new whitehouse.gov robots.txt file opens everything up to search engines, while the previous one had 2,400 lines! The site has a blog! The fonts are Mac-friendly! That Obama administration sure is online savvy.

Or is it?

An amazing amount of customer acquisition can come from search (a 2007 Jupiter Research study found that 92% of online Americans search monthly and over half search daily). Whitehouse.gov likely doesn’t need the kind of search visibility that most sites do, but when people search for information about today’s issues, such as the economy, the Obama administration surely wants the whitehouse.gov pages that explain its position to show up.

The site has a blog, which is awesome, but the title tag, the most important tag on the page, has only the text “blog”. Nothing else. That might help the page rank well for people doing a search for blog, but that’s probably not what they’re going for. And this doesn’t just hurt them in search, of course: the title is also what shows up in the browser tab and in bookmarks.
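
If you want to check your own site for this sort of thing, a few lines of Python will pull each page and flag weak titles. This is just a sketch: the URL list (including my guess at the blog’s address) and the fewer-than-three-words threshold are placeholders, not a real audit of whitehouse.gov.

    import re
    import urllib.request

    # Placeholder URLs for illustration only; swap in your own pages.
    PAGES = [
        "http://www.whitehouse.gov/",
        "http://www.whitehouse.gov/blog/",
    ]

    def page_title(url):
        """Fetch a page and return the text inside its <title> tag, if any."""
        html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
        match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
        return match.group(1).strip() if match else None

    for url in PAGES:
        title = page_title(url)
        if not title or len(title.split()) < 3:
            # A one-word title like "blog" tells searchers (and bookmarks) nothing.
            print("Weak title on", url, ":", repr(title))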

The site runs on IIS 6.0, where redirects are temporary (302) ones by default; returning the permanent (301) redirects that search engines treat as a real move takes extra configuration. Does the site developer know about that?

Search engines are text-based, so they can’t read text hidden in images. Some whitehouse.gov pages get around this issue well by styling the text to look image-like while leaving it as text, as in the example below.

whitehouse.gov text example

However, other pages have text in images and don’t use ALT text to describe them. (This, of course, is an accessibility issue as well, as it keeps screen readers from being able to access the text in the images.) An example of this is the home page, which may be part of why whitehouse.gov doesn’t show up on the first page in a search for President Obama.

whitehouse.gov image example
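
If you want to find these on your own pages, here’s a rough sketch using nothing but Python’s standard library. The URL is a placeholder, and this is an audit idea, not anything the whitehouse.gov team actually runs; it simply lists every image that ships without ALT text.

    import urllib.request
    from html.parser import HTMLParser

    class MissingAltFinder(HTMLParser):
        """Collect the src of every <img> tag that has no alt text."""
        def __init__(self):
            super().__init__()
            self.missing = []

        def handle_starttag(self, tag, attrs):
            if tag == "img":
                attrs = dict(attrs)
                if not attrs.get("alt"):
                    self.missing.append(attrs.get("src", "(no src)"))

    url = "http://www.whitehouse.gov/"  # any page you want to audit
    html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
    finder = MissingAltFinder()
    finder.feed(html)
    for src in finder.missing:
        print("Image with no ALT text:", src)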

There are all kinds of technical issues, big and small, that impact whether your site can be found in search results for what you want to be found for. (whitehouse.gov uses underscores rather than dashes in URLs, the meta descriptions are the same on every page…) Probably the biggest issue in this case is the lack of 301 redirects between the old site and the new site. When you change domains and move content to the new domain, you don’t want to have to rebuild the audience and links all over again. (Not that Obama or whitehouse.gov will have a problem attracting an audience, but we can’t all be president!) When you use a 301 redirect, both visitors and search engines know to replace the old page with the new one.

In the case of change.gov, it’s unclear if they intend to maintain the old site. The home page asks people to join them at whitehouse.gov, but all the old pages still exist (even the old home page at http://change.gov/content/home).

change.gov example

And in many cases, the same content exists at both change.gov and whitehouse.gov (see, for instance, http://change.gov/agenda/iraq_agenda/ and http://www.whitehouse.gov/agenda/iraq/).
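
A quick way to see whether an old URL is handing off its history properly is to request it without following redirects and look at the status code and Location header. A 301 pointing at the new page is what you want; a 200 (the duplicate content above) or a 302 is not. Here’s a sketch using the URL pair I just mentioned (right now those change.gov pages return their own content rather than redirecting):

    import http.client
    from urllib.parse import urlparse

    # Old URL mapped to where it should permanently redirect.
    OLD_TO_NEW = {
        "http://change.gov/agenda/iraq_agenda/": "http://www.whitehouse.gov/agenda/iraq/",
    }

    for old_url, new_url in OLD_TO_NEW.items():
        parts = urlparse(old_url)
        conn = http.client.HTTPConnection(parts.netloc)
        conn.request("GET", parts.path or "/")  # http.client never follows redirects itself
        resp = conn.getresponse()
        location = resp.getheader("Location", "")
        if resp.status == 301 and location == new_url:
            print("OK:", old_url, "permanently redirects to", location)
        else:
            print("Check:", old_url, "returned", resp.status, location or "(no Location header)")
        conn.close()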

As Matt Cutts, Googler extraordinaire, pointed out, give them a few days to relax before worrying so much about SEO. And I certainly think the site is an excellent step towards better communication between the president and the American people. But not everyone has the luxury of having one of the most well-known names and sites in the world, so the technical details are more important for the rest of us.

If you want to know more about technical issues that can keep your site from being found in search and tips for making sure that you don’t lose visibility in a site move, join us for the O’Reilly Found conference June 9-11 in Burlingame. And if you’re in Mountain View tomorrow night (Thursday, January 22nd), stop by Ooyala from 6pm to 9pm for our webdev/seo meetup, and get all your search questions answered. Hope to see you there! (Macon Phillips and the whitehouse.gov webmasters are welcome, but my guess is that they’re a little busy.)

  • http://galaxyspectrum.com/ AD PR

    One possibility is that the postings will have so many backlinks and be so heavily referenced in top blogs, social bookmarks, and other Web 2.0 outlets (including online newspapers) that SEO and rankings will come naturally WITHOUT optimization.

    The overall results will be quite powerful in search engines and their universal SERPs.

  • http://www.unofficialmac.com Alpay

    Wow, the level of visibility the web gives us is almost unbelievable, even for me. And of course the number of eyes and the level of scrutiny…

    Very well done…
    Alpay

  • Olaf Grandson

    Pretty changes, but Medvedev in Russia has had his own video blog since his inauguration in May.

  • Kin

    Even the robots.txt change wasn’t what it’s been hyped up to be online! They might have removed over 2000 entries, but they didn’t open up anything except duplicate content.

  • http://www.riseinteractive.com Rise Interactive

    Good post. I’ve heard relentlessly about the removal of the robots.txt entries from the site via blogs and numerous tweets. Yet no one has stated why they were there in the first place. What was their original intent besides the obvious, “Hey Google, don’t index these pages”?

    Thanks,

    Rise

  • http://www.ninebyblue.com Vanessa Fox

    AD PR, absolutely in this case, whitehouse.gov has a level of visibility and ability to attract links that helps it rise above some of the technical site issues. Most of us don’t have that same visibility, of course, so it makes sense to build technical site architecture in a search-friendly way.

    I do think that even with the high visibility/links that whitehouse.gov has, it will still have trouble ranking for some particular queries with the keywords missing from the title tag and locked in images (unless there’s a lot of anchor text from external links for those keywords).

    Kin and Rise Interactive: while I think it’s great that they’re opening up robots.txt, I agree that there’s more to it. There are lots of good reasons for using robots.txt beyond trying to keep content from search engines (as you mention, duplicate content, as well as keeping search results from being indexed, etc.).

  • http://webtechman.com Daniel Hudson

    Hello Vanessa,

    Thanks for putting this Search Engine Optimization (SEO) for WhiteHouse.gov article together.

    I’m seeing more Social Media efforts in the Government Sector and I like it. It’s interesting to see how the Government is re-branding itself and more importantly, how people are interacting with Government 2.0.

    I work in the world of Enterprise 2.0 using Social Media in the business environment and SEO techniques count there too. We have incorporated a solid SEO strategy and are now hearing less & less about how people can’t find information they need. I am a big fan of your work and can testify how SEO needs to be part of the Enterprise 2.0 Strategy. You can have some of the greatest minds sharing the most intelligent information, but if no one can find it, then what’s the point?

    You make a good point about how some sites don’t really need massive visibility in search engines. I would like to hear more about your thoughts on SEO in the Enterprise. Why is it important to have a solid SEO Strategy for internal platforms?

    Thank you,
    Daniel Hudson
    Enterprise 2.0 with Web 2.0 Web Strategy

  • http://www.100dollarseo.com Carlos del Rio

    I think the biggest issue with whitehouse.gov is that the page design is huge, 500k compressed, and 45% of US homes don’t have broadband. This means the web-savviness leaves many people at home waiting a full minute for the site to load.

    If they really want to be accessible, they should redesign to be reasonable for 56k users to navigate. After all, wasn’t the whole point to increase communication?

  • Phaedrus

    “That Obama administration sure is online savvy.
    Or is it?”

    Of course it is. The mistakes or negligence of developers should not be seen as mistakes/negligence of the administration and certainly not of Mr. Obama. He or his New Media secretary will not be testing the site to see if it is accessible or not. There are bigger problems to worry about.

    Having said that, the real problem seems to be not the intention but the implementation and the process of implementation. This is where the devil lies, and it will serve the New Media secretary well to set up robust quality-check processes so as not to end up in an embarrassing situation.

  • http://blog.searchmarketme.com Jenny Dibble

    Vanessa, great article on the architectural issues of the new whitehouse.gov site. You made some great points: yes, they have some issues, but they will probably be OK because of their high level of visibility. However, I would hope Obama’s team began working on this site months ago (1.5 months, to be exact) and would have their stuff more ‘together’ than this…

    I wrote a blog post about the marketing aspects of the new whitehouse.gov site that were lacking (such as the inability to sign up for email updates…) on my Small Business – Big Marketing blog. After all, I’d love to consult for the new Obama administration to fix the issues… (are you listening, Obama???) :)

  • http://muabanperfume.com/main nuoc hoa

    thank you for sharing