Transforming the Relationship Between Citizens and Government: Making Content Findable Online

Thursday on this blog, Congressman Honda asked, “how can congress take advantage of web 2.0 technologies to transform the relationship between citizens and government?” He noted that “A dramatic shift in perspective is needed before that need can be met. Instead of databases becoming available as a result of Freedom Of Information Act requests, government officials should be required to justify why any public data should not be freely available to the taxpayers who paid for its creation.” He asked for input on what web 2.0 features he should add to his website to take advantage of today’s online world.

The most important feature government web sites can add isn’t really a feature at all. But it would absolutely transform the relationship between citizens and government and make an amazing array of public data available. What’s this magic feature?

Make government web sites search engine friendly.

How we look for information
Search is the primary navigation point for the web. Often when citizens look for government information, they start at a major search engine. They don’t think to themselves, I need some information on vitamins, so I’ll just go on over to the Office of Dietary Supplements at http://dietary-supplements.info.nih.gov. And then I need to make sure I’m eating a balanced diet, so I’ll just check out http://www.nutrition.gov from the National Agricultural Library. And before I head to the grocery store, I’ll make sure I understand how nutrition labels work from the information provided by the Center For Food Safety and Applied Nutrition at http://www.cfsan.fda.gov. Mostly, they go to Google and type in [food labels]. And in some cases, this works perfectly and the information appears.

food_labels.jpg

But when information from government web sites doesn’t show up on the first page of results for those searches, the information may as well not exist at all. For instance, an amazing amount of data exists from the U.S. Census Bureau, but it’s inaccessible from search engines because it’s locked behind JavaScript forms and the content itself doesn’t use language that searchers would use. If I search for [98116 census data], results from census.gov are nowhere to be found.

census-search.jpg

Obstacles to being found in search engines
One problem is that the U.S. Census Bureau pages don’t use zip codes to denote regions. They use tract numbers. Even if the pages were written in plain language searchers might use, search engine crawlers couldn’t get past the JavaScript forms to access the pages.

census2.jpg
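One common way to point crawlers at pages they can't reach through a form is an XML sitemap that lists a plain, stable URL for each page. The Python sketch below is only an illustration of that idea: the example.gov addresses and the ZIP-code URL scheme are hypothetical, and nothing here reflects how census.gov is actually built.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the public sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawlable addresses keyed by ZIP code rather than tract number.
zip_codes = ["98116", "98117"]
pages = [f"https://www.example.gov/census/zip/{z}" for z in zip_codes]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A sitemap only helps if each listed URL returns the content directly, without requiring a form submission or a live session.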

Try doing a search using the same terminology as the U.S. Census Bureau, and you start to see the problem with the site’s findability. Take [census tract 97.02]:

None of those results lead to these handy details:

census4.jpg

In addition to being buried behind JavaScript and containing little language people would actually search for, the information is hidden in a popup with a URL like this: http://factfinder.census.gov/servlet/IdentifyResultServlet?_mapX=281&_mapY=216&_latitude=&_longitude=&_pageX=442&_pageY=554&_dBy=100&_jsessionId=0001cv7n8rWxjslrmI9aRw5nr-V:134a7lbrs

The server appends a session ID to the end of the URL (the portion beginning with “_jsessionId”), which is tied to an individual visitor session and times out after 60 minutes. If I share that URL on a social media site, in an email, or in this blog post, anyone who tries to visit it just gets a “session has expired” message. It goes without saying that this kind of URL can’t be indexed by search engines, no matter how sophisticated they become.
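To make the session ID problem concrete, here is a minimal Python sketch that separates the per-visitor part of a URL from the part that could, in principle, be stable. The list of session parameter names is my assumption, not something the Census Bureau documents, and stripping the parameter does not make the page load. It only shows what a shareable, indexable address would have to look like.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names for parameters that carry per-visitor session state.
SESSION_PARAMS = {"_jsessionid", "jsessionid", "sessionid", "sid"}


def strip_session_params(url: str) -> str:
    """Return the URL with any session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def has_session_param(url: str) -> bool:
    """True if the URL changes when session parameters are stripped."""
    return strip_session_params(url) != url


popup_url = (
    "http://factfinder.census.gov/servlet/IdentifyResultServlet"
    "?_mapX=281&_mapY=216&_jsessionId=0001cv7n8rWxjslrmI9aRw5nr-V:134a7lbrs"
)
print(has_session_param(popup_url))
print(strip_session_params(popup_url))
```

A site owner could run a check like this over server logs or internal links to find pages whose addresses only work for one visitor.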

Is government data appearing in search results really all that important?
The data tells us that search engines are part of our everyday lives, and when we search for health information, or tax data, or details on the latest legislation, we start with a search, likely on Google.

Consider:

Last year, I spoke at the eMetrics Marketing Optimization Summit in Washington DC. I talked to a number of people who manage government web sites, and the overriding concern was that while the government spends considerable resources ensuring its web sites have valuable, accurate information, little time is spent ensuring those web sites can be found in major search engines. That’s like building a brick-and-mortar store with shiny new marble floors and high-quality, low-priced merchandise and keeping the front door locked.

Search engines and the government are working together… sort of
In 2007, Google estimated that about half of the content government agencies make available online doesn’t appear in search results at all due to how the web sites are constructed. When I was at Google, I worked on a program to help government agencies open their web sites to search engines. In 2007, Google posted details of how this program was helping the state of Arizona make its web sites more visible to search engine users. The Google program site is still live, but there’s no word on whether the program continues or whether government web sites have found continued success with it.

In 2006, Google relaunched “Google Uncle Sam”, which restricts searches to government sites. This product has been around since 1999 in some form, and makes it easier to search only government sites. However, it contains the same subset of government pages as Google’s standard web search. If the pages aren’t accessible to search engine crawlers, searchers can’t find them, regardless of where they search.

USA.gov is the government’s attempt to provide a single interface for searching government web sites. However, don’t expect that you’ll get a fuller set of results here than from the major search engines. The search engine at USA.gov is powered by Microsoft’s Live Search, so if Live Search can’t crawl and index a site for its web index, that site won’t appear in results for USA.gov either.

Our health is at stake
A 2008 Pew Internet study found that:

  • Between 75% and 80% of online Americans have turned to the internet for medical advice, and most start at a major search engine.
  • 75% of e-patients with a chronic condition say their last health search affected a decision about how to treat an illness or condition.
  • 59% say the information they found online led them to ask a doctor new questions or get a second opinion.

A study on how different generations use the internet found that “older internet users are significantly more likely than younger generations to look online for health information. Health questions drive internet users age 73 and older to the internet just as frequently as they drive Generation Y users, outpacing teens by a significant margin. Researching health information is the third most popular online activity with the most senior age group, after email and online search.”

In 2007, comScore concluded that “One of the key drivers of traffic to the online health information category is health-related search. Specifically, many consumers begin their navigation by first conducting a search using keywords or phrases for specific conditions or ailments.”

While at eMetrics last year, I talked to someone who works on the NIH.gov website (the National Institutes of Health). She told me about the rigorous review process used to vet content that appears on the site. It’s very important to the NIH that the health information they provide is accurate. They devote so many resources to that content. Does it benefit Americans? In 2006, comScore found that only four percent of visitors to nih.gov got there by typing the address. Most of the rest arrived via a search on a major search engine such as Google. That means that if the nih.gov web site doesn’t appear on the first page of results for a health query, Americans are unlikely to ever know that the valuable and carefully vetted information on the site exists.

How can the government make data more accessible?
Why is some government content so hard to find in search engines? A key reason is likely that commercial sites have a monetary interest in appearing in search engines, so they seek out best practices for developing a search-friendly site. Government sites may have content goals, but may not have monetary or traffic goals, so there’s less incentive to make findability a key component of the site architecture.

That has been changing. As we become more of a searching culture, those working on websites, including government-based ones, realize that being found for relevant queries in search engines is vital. Katie Stanton recently left Google to head Obama’s citizen participation efforts, and improving the ability of Americans to find the government data they need through the major search engines is high on her list.

The government seems to want to make data more accessible to its citizens. Vivek Kundra, the new Chief Information Officer, plans to create a new web site, data.gov, as a repository for all government data. He said, “There is a lot of data the federal government has and we need to make sure that all the data that is not private, or restricted for national security reasons, can be made public.” It’s vital that any plan to make data public includes making it findable in search engines.

Certainly, I’m not the first person to call for better findability for government web sites. In 2007, a bill called for the “Office of Management and Budget to create guidance and best practices for federal agencies to make their websites more accessible to search engine crawlers, and thus to citizens who rely on search engines to access information provided by their government”. However, this bill was never passed, and I’m not sure if the directive made it into another bill.

A 2005 OMB memorandum already covers some of this, noting “when disseminating information to the public-at-large, publish your information directly to the Internet. This procedure exposes information to freely available and other search functions and adequately organizes and categorizes your information.”

OMB Policy 5: Search Public Websites requires government web sites to provide search functionality within the sites to enable citizens easy access to the content. But to truly make content available, it has to appear on the first page of search results for relevant queries on major search engines.

What can government web sites do to be more findable? Findability is an in-depth process that involves both technical site architecture (ensuring search engines can crawl the pages) and content (ensuring the text is in the language of the searcher). As Kundra acknowledged, “A two-way interaction between the government and its citizens will require a massive transformation by the government, on the back end, to ensure the government can deal with this new reality.” But while a long-term plan to make government content findable in search engines is necessary, there are some basic steps that government sites can take to make content more visible in the short term.

To be fair, many government sites are doing a fairly good job at architecting their content, particularly related to accessibility. And they’ve even got a search engine optimization 101 page on their website management portal (although, unfortunately, it hasn’t been updated since March 2007 and provides resources written in 2003, so updating that might be a good place to start). There’s interest, but so far they’ve just scratched the surface. They need a comprehensive plan and set of best practices that are integral parts of the web development process.

Check back later this week as I dive into some of the details of how they can set up short-term and long-term improvements. The first step? Understanding that making government sites search engine-friendly is key to improved transparency, increased public data accessibility, and a “web 2.0” relationship between citizens and government that brings positive change.
