Your Search Activity Predicts Flu Outbreaks

[Image: Google Flu Trends data vs. CDC data]

Google has released Flu Trends, an online reporting tool for flu-related search activity. It’s long been theorized that Google’s search data would be useful for predicting epidemics. This is the first time they’ve released a tool like this to the public. As they say on the main page:

We have found a close relationship between how many people search for flu-related topics and how many people actually have flu symptoms. Of course, not every person who searches for “flu” is actually sick, but a pattern emerges when all the flu-related search queries from each state and region are added together. We compared our query counts with data from a surveillance system managed by the U.S. Centers for Disease Control and Prevention (CDC) and discovered that some search queries tend to be popular exactly when flu season is happening. By counting how often we see these search queries, we can estimate how much flu is circulating in various regions of the United States.

This tool comes to us via Google.org’s Predict & Prevent initiative. You can download the data for your own analysis.
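If you do download the data, one simple first check is whether the search-based series moves ahead of the official one. The sketch below is only an illustration under stated assumptions: the weekly numbers are made up (they are not Flu Trends or CDC figures), and the function names `pearson`, `lagged_correlation`, and `best_lag` are hypothetical helpers, not part of any Google tool. It correlates the two series at several week offsets and reports the offset with the strongest match.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(search, official, lag):
    """Correlate search[t] with official[t + lag].

    A positive lag means the search series leads the official series.
    """
    if lag > 0:
        return pearson(search[:-lag], official[lag:])
    if lag < 0:
        return pearson(search[-lag:], official[:lag])
    return pearson(search, official)


def best_lag(search, official, max_lag=3):
    """Return the lag (in weeks) with the highest correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: lagged_correlation(search, official, k))


# Made-up weekly values for illustration only (NOT real data).
# The "official" series is the search series delayed by one week.
search = [1, 2, 4, 8, 12, 9, 5, 3, 2, 1, 1, 1]
official = [1] + search[:-1]

for k in range(-2, 3):
    print(k, round(lagged_correlation(search, official, k), 3))
print("best lag (weeks):", best_lag(search, official))
```

A high correlation at lag 0 would only show that the two series move together; a peak at a positive lag is what would suggest the search data runs ahead. With real data you would also want far more than a dozen weeks before reading much into the result.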

If you want to check the status of other diseases, try HealthMap, an online “Global Disease Alert Map”. The automated site uses a variety of sources, including Google News, traveler reports, and official WHO alerts, to track diseases across the world. It is another Google.org investment.

Tools like Flu Trends will work in areas where people have access to the internet and use Google. Though Google is number one in the US, it doesn’t have top status in all countries and will not necessarily have enough data to make meaningful determinations. If Flu Trends proves valuable enough, I wonder if other countries’ CDC-equivalents will pressure their top search engines to develop similar tools.

[Image: current flu analysis]

  • John P. Speno

    I wonder if we can prevent the flu outbreaks with Vitamin D3 instead of flu shots?

  • peaboy

    I wonder if the timing of these queries also coincides with outside information being pumped out about flu shots? By that I mean the causal driver for the queries could be emails from your HMO/doctor/HR and the local news stories about getting your flu shots, prompting you to learn more, rather than actual illness.

    Just a thought. It’s obviously hard to figure out predictive information from this very open multi-variable system.

    Of course Google is really going to show you a Flu Shot ad in Gmail when your state’s query rate for Flu goes up, that’s what they really want to do. ;-)

  • Josh

    I see the correlation between the lines, but I don’t see any predictive value. It does not appear that the Google Flu Trends line increased before the CDC data line. Of course when people have the flu they will search about it more. So not really seeing the big deal about this, what am I missing?

  • I also don’t see any predictive information, just coincident data. To be useful the search terms would have to lead the CDC’s data by some useful time margin.

  • Carl

    @Josh, @Alex,

    the key point around this is that they do have a 1-2 week predictive ability:

    “So why bother with estimates from aggregated search queries? It turns out that traditional flu surveillance systems take 1-2 weeks to collect and release surveillance data, but Google search queries can be automatically counted very quickly. By making our flu estimates available each day, Google Flu Trends may provide an early-warning system for outbreaks of influenza.”

  • Very creative and novel. I’m looking forward to seeing how this holds up and correlates to verified flu outbreaks. I hope this works because I just love the stuff that Google comes up with! Austin

  • Josh

    Oh I see, that is pretty cool then.


    Hmm, what else is Google search keeping track of?

  • thomas4

    This only works when the searcher uses the correct “medical semantics” and specific clinical terminology. “Muscle aches” do not equal “flu outbreak.”

    There was actually a task force sponsored by Consumers Union and HHS/Disease Prevention several years ago that studied this and found Google searching was too uncontrolled and random to actually predict an outbreak (see:

  • Alan Hartig

    I came down with the flu 4 days ago.

  • A discussion about this was on NPR this morning. The claim was that there was a 2 week lead over the epidemic although the point was made that this data could be compromised fairly easily. When asked how this could help, the CDC spokesman suggested they might put up a banner ad for flu shots…

  • I don’t see anything predictive here. This is causality at work. Search is a result of a causal event, in this case the flu. In my opinion, if you were to collate data from doctors’ offices in “real-time” or “near real-time”, you’d have a faster response. Could this be done when they file insurance claims? Most people make searches after the diagnosis has been made (again, guessing); it’s patients trying to “second-guess” or “reassure themselves” after they have the diagnosis.

  • I write a lot for eHow and I will be paying attention to this article. It looks like getting a flu shot might be a good idea this year, as we are on track, it appears, for a flu season as strong as last year’s. I got the flu, which laid me out for two weeks. Not fun. Thanks for the great article. I am tracking with FriendFeed.

  • Of course it’s causal, but that’s still super valuable given that this is just an information byproduct.

    We’re seeing incredibly immediate, accurate and relevant *medical* data for free, based on a totally unrelated technology (search).

    Makes me wonder how long it will take for more complicated predictors to emerge. We could’ve called the election months ago if we had Google’s data and an adequate sociology of search.

  • Given the recent news about swine flu, is it likely that this same tool might help for other strains of flu?