Data in use from public health to personal fitness

HHS leadership should cause other organizations to open data.

Back in 2010, the first Health Data Initiative (HDI) forum by the Department of Health and Human Services (HHS) introduced the public to the idea of an agency releasing internal data in forms easy for both casual viewers and programmers to use. The third such forum, which took place last week in Washington, DC, was so enormous (1,400 participants) that it had to be held in a major convention center. Todd Park, who as CTO made HHS a leader in the open data movement, has moved up to take a corresponding role for the entire federal government. Open data is a worldwide movement, and the developer challenges that the HDI forum likes to highlight are standard strategies for linking governments with app programmers.

Todd Park on main stage.

Following my attendance at a privacy access summit the previous day, the HDI forum made me think of a government bent on reform and an open-minded public clasping hands over the heads of the hidebound health institutions that blunder onward without the benefits of tapping their own data. I am not tossing all hospitals, doctors, and clinics into this category (in fact, I am constantly talking to institutions that work with available data to improve care), but the way information is recorded and stored in health care generally holds back anyone interested in change.

The “datapalooza” was already covered on Radar by Alex Howard, so here I’ll list some of the observations I made during the parts I attended.

Health and Human Services chooses torrents over leaks

Able to attend the forum only on the first day, I spent a lot of it in a session on HHS data sets at Healthdata.gov because I wanted to know exactly what the department has to offer and how the data is being used.

HHS staff at break-out session.

Several things impressed me about the procession of HHS staff that crossed the stage to give five- or ten-minute presentations on data sets. First was the ethos of data sharing that the department heads have instilled. Each staff person showed visible pride in finding data that could be put on the Web. A bit of competitive spirit drives different departments that may have more or fewer resources, and data that comes naturally in a more structured or less structured form. One person, for instance, said, “We’re a small division and don’t have the resources of the others, but we managed to release several data sets this year and one has an API.”

Second, the department is devoting resources to quality. I’ve heard several complaints in the field about lack of consistency and other problems in public health data. One could hardly avoid such issues when data is being collected from hundreds of agencies scattered across the country. But the people I talked to at the HHS forum had ways of dealing with it, such as by requiring the researchers who collect data to submit it (so that trained professionals do the data entry), and running it through quality checks to look for anomalies.
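To give a flavor of what an automated quality check might look like, here is a minimal Python sketch. It is not HHS's actual procedure, which the presenters did not describe in detail. It assumes a hypothetical list of counts submitted by different agencies and flags values that sit far from the median, using the median absolute deviation as a robust yardstick. The function name, the sample numbers, and the 3.5 threshold are all illustrative assumptions.

```python
import statistics

def flag_anomalies(values, threshold=3.5):
    """Return the indexes of values that look like outliers.

    Uses a modified z-score based on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than a mean-based
    score would be. The threshold of 3.5 is a common rule of thumb, not
    an HHS standard.
    """
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        # At least half the values equal the median; this method cannot rank them.
        return []
    return [i for i, d in enumerate(deviations)
            if 0.6745 * d / mad > threshold]

# Hypothetical weekly visit counts reported by seven agencies.
reported = [102, 98, 101, 97, 100, 990, 99]
suspicious = flag_anomalies(reported)
print(suspicious)
```

A flagged value is only a candidate for human review. A count of 990 could be a data entry slip or a genuine local spike, and telling those apart is where the trained professionals come in.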

Third, the department knows that outside developers coming to its site will need extra help understanding the data being collected: what the samples represent, what the scope of collection was, and so forth. In addition to a catalog powered by a Solr search engine, HHS provides direct guidance to perplexed app developers. It is also adding Linked Data elements to help developers combine data sets.

A few examples of data sets include:

  • The Centers for Medicare & Medicaid Services offers aggregate data on emergency visits, hospital readmission rates (a major source of waste in health costs), and performance measurement.

  • The Administration for Children and Families has a Head Start locator that helps parents find services, aggregate data on people who apply for Low Income Home Energy Assistance, etc.

  • The Agency for Healthcare Research and Quality has longitudinal data about spending on health care and its effect on outcomes, based on an annual survey, plus a service offering statistics on hospital treatments, morbidity, etc.

  • The Assistant Secretary for Planning and Evaluation tracks workforce development, particularly in health IT, and measures the affordability of health care reflected in costs to employers, patients, and the government.

Recently, HHS has intensified its efforts by creating a simple Web interface where its staff can enter data about new data sets. Data can be uploaded automatically from spreadsheets. And a new Data Access and Use Committee identifies data sets to release.

So now we have public health aids like the Community Indicators Data Portal, which maps the use of Medicaid services to poverty indicators, infant mortality, etc.

HealthMap, created by Children’s Hospital Boston, is used by a fascinating range of projects. They scoop in huge amounts of data–mostly from news sites, but also blogs, and social networks–in multiple languages around the world, and apply a Bayesian filter to determine what’s a possible report of a recent disease outbreak. After a successful flu-tracking program based on accepting reports from the public, they did a dengue-tracking program and, in Haiti, a cholera-tracking program.
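For readers curious what a Bayesian filter does with a news item, here is a toy Python sketch of the general technique (a naive Bayes text classifier). It is not HealthMap's code, and I don't know the details of their implementation. The class name, the two labels, and the training headlines are all invented for illustration.

```python
import math
from collections import Counter

LABELS = ("outbreak", "other")

class NaiveBayesFilter:
    """Toy naive Bayes classifier with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def log_score(self, text, label):
        # Vocabulary across both labels, needed for the smoothing denominator.
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_words = sum(self.word_counts[label].values())
        # Start from the log prior: the share of training documents with this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def classify(self, text):
        return max(LABELS, key=lambda label: self.log_score(text, label))

# Invented training headlines, three per label.
nb = NaiveBayesFilter()
nb.train("cholera cases reported in northern province", "outbreak")
nb.train("dengue fever cases rise after floods", "outbreak")
nb.train("hospital reports flu outbreak among children", "outbreak")
nb.train("city council approves new hospital budget", "other")
nb.train("football team wins regional championship", "other")
nb.train("stock market rise after earnings reports", "other")

print(nb.classify("new cholera cases reported"))
print(nb.classify("team wins championship"))
```

A production system would need far more training data, handling for multiple languages, and a way to judge whether a report is recent. The sketch shows only the core idea: words seen more often in outbreak reports push a new item toward that label.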

But valuable as HHS data is to public health, most of it is not very sexy to the ordinary patient or consumer. If you’re curious how your Medicare charges compare with average payments for your county, go ahead and mine the data. But what about something immediately practical, such as finding the best hospital for a procedure?

Recently, it turns out, HHS has been collecting and releasing data on that level, such as comparative information on the quality of care at hospitals. So a datapalooza like the HDI forum really takes on everyday significance. HHS also provides the Healthcare.gov site, with services such as finding insurance plans for individuals and small groups.

Other jurisdictions are joining the health data movement. Many countries have more centralized systems and therefore can release large amounts of data about public health. The United Kingdom’s National Health Service was featured at the HDI forum, where they boasted of posting 3,000 health indicators to their web site.

The state of Louisiana showed off a cornucopia of data, ranging from user restaurant ratings to ratings of oyster beds. Pregnancy risk factors, morbidity rates, etc. are broken down by race, sex, and other demographics. The representative freely admitted that the state has big health problems, and urgently called on developers to help it mine its data. The state recently held a “Cajun codefest” to kick off its effort. HHS also announced five upcoming local datapaloozas in other states around the U.S.

I talked to Sunnie Southern, a cofounder of a Cincinnati incubator called Innov8 for Health. They offer not only challenges for new apps, but guidance to help developers turn the apps into sustainable businesses. The organization also signs up local hospitals and other institutional users to guarantee a market to app developers. Southern describes Innov8 for Health as a community-wide initiative to support local developers and attract new ones, while maintaining deep roots among stakeholders across health care, universities, startups, investors, and employers. In the inaugural class, which just took place, eight companies were chosen to receive intensive mentoring, introductions and connections to potential customers and investors, and $20,000 to start their companies in 12 weeks. Health data is a core element.

How far can a datapalooza take the health care field?

Health apps are a fast-growing segment of mobile development, and the government can certainly take some of the credit, along with VC and developer recognition that there’s a lot of potential money to be made fixing health care. As Todd Park said, “The health innovation ecosystem is beautifully chaotic, self-propelled, and basically out of control.” That means the toothpaste can’t be put back in the tube, which is a good thing.

The HDI forum is glitzy and exciting–everybody in health care reform shows up, and the stage show is slickly coordinated–but we must remember the limits of apps in bringing about systemic change. It’s great that you can use myDrugCo$ts.com to find a discount drug store near you. Even better, if your employer hooks you up to data sets provided by your insurer, myDrugCo$ts.com can warn you about restrictions that affect costs. But none of this will change the crazy pricing in the insurance plans themselves, or the overuse of drugs in medicine, or the inefficient development and testing methods that lead to high medication prices in the first place.

Caucus of Society for Participatory Medicine and friends.

Transparency by one department on one level can lead to expectations of transparency in other places too. As pricing in health care becomes more visible, it will become less defensible. But this requires a public movement. We could do great things if we could unlock the data collected by each hospital and insurance agency, but they see that data as their competitive arsenal and we are left with a tragedy of the anti-commons. It would be nice to say, “You use plenty of public data to aid your decision-making, now reciprocate with some of your own.” This can be a campaign for reformers such as the Society for Participatory Medicine.

At the HDI forum, United Healthcare reported that they had enough data to profile patients at risk for diabetes and brought them in for a diabetes prevention program. This is only a sample of what can be done with data that is not yet public.

Aetna presenter shows CarePass on the main conference stage.

Aetna is leading the way with a service called CarePass, currently holding a developer challenge. CarePass offers Aetna’s data through an API, and they partner with other major data centers (somewhat as Microsoft does with HealthVault) to hook up data. Practice Fusion is also offering some data to researchers.

Even those bright-faced entrepreneurs launching businesses around data from HHS and elsewhere–certainly their success is one of the goals of the open data movement, but I worry that they will recreate the silos of the health care field in the area of patient data. What are they collecting on us as we obsessively enter our personal statistics into those devices? Who will be able to use the aggregate data building up on their servers?

So there are hints of a qualitative change that can come from quantitative growth in the release and reuse of health care data. The next step involves the use of personal data, which raises its own litany of issues in quality and privacy. That will be the subject of the last posting in this series.
