Report from first health care privacy conference

Strange that a conference on health privacy has never been held before, so I’m told. Privacy in health care is the first topic raised whenever someone talks about electronic health records–and dominates the discussion from then on–or, on the other hand, is dismissed as an overblown concern not worthy of criticism. But today a conference was held on the subject, prepared by Patient Privacy Rights and the University of Texas’s Lyndon B. Johnson School of Public Affairs, and held just a few blocks from the Capitol building at the Georgetown Law Center as a preconference to the august Computers, Freedom & Privacy conference.

The Goldilocks dilemma in health privacy

Policy experts seem to fall into three camps regarding health privacy. The privacy maximalists include Patient Privacy Rights, as well as the well-known Electronic Privacy Information Center and a number of world-renowned experts, including Alan Westin, Ross Anderson from Cambridge University, Canadian luminary Stephanie Perrin, and Carnegie Mellon’s indefatigable Latanya Sweeney (who couldn’t attend today but submitted a presentation via video). These people talk of the risks of re-identifying data that was supposed to be de-identified, and highlight all the points in both current and proposed health systems where intrusions can occur.

On the other side stand a lot of my closest associates in the health care area, who intensely dislike Patient Privacy Rights and accuse it of exaggerations and mistruths. The privacy minimalists assert that current systems provide pretty good protection, that attacks on the average person are unlikely (except from other people in his or her life, which are hard to fight systematically), and that an over-concern for privacy throws sand in the machinery of useful data exchange systems that can fix many of the problems in health care. (See, for instance, my blog post on last week’s Health Data Initiative Forum.)

In between the maximalists and the minimalists lie the many people trying to adapt current systems to the complex needs of modern health care with an eye toward privacy–those who want to get it “just right.” The Direct Project (discussed today by the Chief Privacy Officer of the Office of the National Coordinator, Joy Pritts) is an example of these pragmatic approaches.

It so happens that the American public can also be divided into these three camps, as Westin explained in his keynote. Some will go to great lengths to conceal their data and want no secondary uses without their express permission. Others have nothing to hide, and most of us lie in between. It is sobering, though, to hear that Americans in surveys declare that they don’t trust what insurers, employers, and marketers will do with their health data. What’s more disturbing is that Americans don’t trust researchers either. Those who take on the mantle of the brave biological explorer acting in the highest public interest must ask why ordinary people question their devotion to the public’s needs.

The dilemma of simplicity: technical solutions may not be implementable

As technologist Wes Rishel pointed out, technical solutions can often be created that solve complex social problems in theory, but prove unfeasible to deploy in practice. This dilemma turns up in two of the solutions often proposed for health privacy: patient consent and data segmentation.

It’s easy to say that no data should be used for any purpose without express consent. For instance, Jessica Rich from the FTC laid out an iron-clad program that a panel came up with for protecting data: systems must have security protections built in, should not collect or store any more data than necessary, and should ensure accuracy. It is understood that sharing may be necessary during treatment, but the data should be discarded when no longer needed. Staff who don’t need to know the data (such as receptionists and billing staff) should not have access. Indeed, Rich challenged the notion of consent, saying it is a good criterion for non-treatment sharing (such as web sites that offer data to patients) but that in treatment settings, certain things should be taken as a given.

But piercing the ground with the stake of consent reveals the quicksand below. We don’t even trace all the ways in which data is shared: reports for public health campaigns, billing, research, and so on. Privacy researchers have trouble figuring out where data goes. How can doctors do it, then, and explain it to patients? We are left with the notorious 16-page privacy policies that no one reads.

Most patients don’t want to be bothered every time their data needs to be shared, and sometimes (such as where public health is involved), we don’t want to give them the right to say no. In one break-out session about analytics, some people said that public health officials are too intrusive and that few people would opt out if they were given a choice about whether to share data. But perhaps the people likely to opt out are precisely the ones with the conditions we need to track.

Helen Nissenbaum of NYU suggested replacing the notion of “consent” with one of “appropriateness.” But another speaker said that everyone in the room has a different notion of what is appropriate to share, and when.

The general principle here–found in any security system–is that any technology that’s hard to use will not be used. The same applies to the other widely pushed innovation, segmented data.

The notion behind segmentation is that you may choose to release only a particular type of data–such as to show a school your vaccination record–or to suppress a particular type, such as HIV status or mental health records. Segmentation was a major feature of an influential report by the President’s Council of Advisors on Science and Technology.

Like consent, segmentation turns out to be complex. Who will go through a checklist of 60 items to decide what to release each time he is referred to a specialist? Furthermore, although it may be unnecessary for a doctor treating you for a broken leg to know you have a sexually transmitted disease, there may be surprising times when seemingly unrelated data is important. So patients can’t use segmentation well without a lot of education about risks.

And their attempts at segmentation may be undermined in any case. Even if you suppress a diagnosis, some other information–such as a drug you’re taking–may be used to infer that you have the condition.
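To make that inference risk concrete, here is a minimal Python sketch. The record layout, the function names `segment` and `leaked_conditions`, and the tiny drug-to-condition table are all assumptions invented for illustration, not anything presented at the conference; a real system would draw on far larger clinical vocabularies.

```python
# Toy lookup from a medication to a condition it commonly treats.
# Hypothetical and deliberately tiny; for illustration only.
DRUG_IMPLIES = {
    "zidovudine": "HIV",
    "lithium": "bipolar disorder",
}


def segment(record, hidden_conditions):
    """Return a copy of the record with the hidden diagnoses removed."""
    shared = dict(record)
    shared["diagnoses"] = [
        d for d in record["diagnoses"] if d not in hidden_conditions
    ]
    return shared


def leaked_conditions(shared, hidden_conditions, drug_implies=DRUG_IMPLIES):
    """Return hidden conditions that the shared medication list still implies."""
    implied = {
        drug_implies[m] for m in shared["medications"] if m in drug_implies
    }
    return implied & set(hidden_conditions)


record = {
    "diagnoses": ["HIV", "fractured tibia"],
    "medications": ["zidovudine", "ibuprofen"],
}
shared = segment(record, {"HIV"})
print(shared["diagnoses"])                 # the suppressed diagnosis is gone
print(leaked_conditions(shared, {"HIV"}))  # the medication list still implies it
```

In this sketch the diagnosis list is filtered as the patient asked, yet the untouched medication list is enough to recover the hidden condition, which is why segmenting by field type alone may not be enough.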

A certain fatalism sometimes hung over the conference. One speaker went so far as to suggest a “moratorium” on implementing new health record systems until we have figured out the essential outlines of solutions, but even she offered it only as a desperate speculation, knowing that the country needs new systems. And good models for handling data certainly exist.

Here is the strenuous procedure that the Centers for Medicare & Medicaid Services (CMS) engage in when they release data sets. Each set of data (a Public Use File) represents a particular use of CMS payments: inpatient, outpatient, prescription drugs, etc. The procedure, which I heard described at two conferences last week, is as follows:

1. They choose a random 5% sample of the people who use particular payments. These samples are disjoint, meaning that no person is used in more than one sample. Because they cover tens of millions of individuals, a small sample can still be a huge data set.
2. They perform standard clean-up, such as fixing obvious errors.
3. They generalize the data somewhat. A familiar way to release aggregated results in a way that makes it harder to identify people is to provide only the first three digits of a five-digit ZIP code. Other such fudge factors employed by CMS include offering only age ranges instead of exact ages, and rounding payment amounts.
4. They check certain combinations of fields to make sure these appear in numerous records. If fewer than 11 people share a certain combination of values, they drop these people.
5. If they had to drop more than 10% of the people in step 4, they go back to step 3 and try increasing the fudge factors. They iterate through steps 3 and 4 until the data is of a satisfactory size.
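
The generalize, check, and iterate loop in the procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CMS’s actual code: the thresholds of 11 people and 10% come from the description above, but the field names, the `generalize` levels (how many ZIP digits to keep, how wide the age bands are, how coarsely payments are rounded), and the choice of ZIP plus age range as the checked combination are all hypothetical. Sampling and clean-up are assumed to have happened already.

```python
from collections import Counter

MIN_GROUP_SIZE = 11       # from the text: fewer than 11 sharing a combination are dropped
MAX_DROP_FRACTION = 0.10  # from the text: dropping more than 10% triggers coarser data


def generalize(record, level):
    """Coarsen one record; a higher level means larger fudge factors (hypothetical scheme)."""
    zip_digits = max(3 - level, 0)   # level 0 keeps the first three ZIP digits
    age_width = 5 * (2 ** level)     # 5-year bands, then 10, 20, ...
    pay_unit = 100 * (10 ** level)   # round payments to 100, then 1000, ...
    low = (record["age"] // age_width) * age_width
    return {
        "zip": record["zip"][:zip_digits],
        "age_range": (low, low + age_width - 1),
        "payment": round(record["payment"] / pay_unit) * pay_unit,
    }


def suppress_small_groups(rows, key_fields, min_size=MIN_GROUP_SIZE):
    """Drop every row whose combination of key fields is shared by too few rows."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in rows)
    return [r for r in rows if counts[tuple(r[f] for f in key_fields)] >= min_size]


def release(records, key_fields=("zip", "age_range"), max_level=3):
    """Repeat generalization and suppression until few enough rows are dropped."""
    for level in range(max_level + 1):
        generalized = [generalize(r, level) for r in records]
        kept = suppress_small_groups(generalized, key_fields)
        dropped = len(records) - len(kept)
        if dropped <= MAX_DROP_FRACTION * len(records):
            return level, kept
    raise ValueError("too many rows dropped even at the coarsest level")


# Synthetic example: 12 people aged 31, 6 aged 41, and 6 aged 46, all in one ZIP code.
people = (
    [{"zip": "02139", "age": 31, "payment": 1234}] * 12
    + [{"zip": "02139", "age": 41, "payment": 560}] * 6
    + [{"zip": "02139", "age": 46, "payment": 980}] * 6
)
level, kept = release(people)
print(level, len(kept))
```

With five-year age bands, the two groups of six fall below the threshold and half the rows would be dropped, so the sketch moves to the next level, where ten-year bands merge them into one group of twelve and every row survives. The trade-off is the one the procedure implies: each retry keeps more people but releases blurrier data.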

Clearly, this procedure works only with data sets on a large scale, not with the limited samples provided by many hospitals, particularly for relatively rare diseases.

Avoidable risks and achievable rewards

As Anderson said, large systems with lots of people have leaks. “Some people will be careless and others will be crooked.” As if to illustrate the problem, one of the attendees today told me that Health Information Exchanges could well be on the hook for breaches they can’t prevent. They rely on health providers to release the right data to the right health provider. The HIE doesn’t contact the patient independently. Any mistake is likely to be the doctor’s fault, but the law holds the HIE equally liable. And given a small, rural doctor with few funds, well liked by the public, versus a large corporation, whom do you suppose the patient will sue?

I can’t summarize all the questions raised at today’s conference–which offered one of the most impressive rosters of experts I’ve seen at any one-day affair–but I’ll list some of the challenges identified by a panel on technology.

Use cases to give us concrete material for discussing solutions
Mapping the flows of data, also to inform policy discussions
Data stewardship–is the data in the hands of the patient or the doctor, and who is most trustworthy for each item?
Determining how long data needs to be stored, especially given that ways to crack de-identified data will improve over time
Reducing the fatigue mentioned earlier for consent and segmentation
Identifying different legal jurisdictions and harmonizing their privacy regulations
Identifying secondary levels of information, such as the medication that indirectly reveals the patient’s condition

Rehab

Some of the next steps urged by attendees and speakers at the conference include:

Generating educational materials for the public, for doctors, and for politicians
Making health privacy a topic for the Presidential campaign and other political debates
Offering clinicians guidelines to build privacy into procedures
Seeking some immediate, achievable goals, while also defining a long-term agenda under the recognition that change is hard
Defining a research agenda
Educating state legislatures, which are getting more involved in policy around health care