Predictive analytics and data sharing raise civil liberties concerns

Expanded rules for data sharing in the U.S. government will need more oversight as predictive algorithms are applied.

Last winter, around the same time there was a huge row in Congress over the Cyber Intelligence Sharing and Protection Act (CISPA), U.S. Attorney General Eric Holder quietly signed off on expanded rules for government data sharing. The rules allow the National Counterterrorism Center (NCTC), housed within the Office of the Director of National Intelligence, to analyze the regulatory data collected in the course of government business for patterns relevant to domestic terrorist threats.

Julia Angwin, who reported the story for the Wall Street Journal, highlighted the key tension: the rules allow the NCTC to “examine the government files of U.S. citizens for possible criminal behavior, even if there is no reason to suspect them.” 

On the one hand, this is a natural application of big data: search existing government records about citizens for suspicious patterns of behavior. The action can be justified on counter-terrorism grounds: there are advanced persistent threats. (When national security is invoked, privacy concerns are often deprecated.) The failure to "connect the dots" across existing government data on Christmas Day 2009 (remember the so-called "underwear bomber"?) added impetus to getting more data into the NCTC’s hands. It’s possible that the data retention rules were extended to five years because the agency didn’t have the capabilities it needed. Mining existing records offers unprecedented opportunities to detect terrorism plots before they happen.

On the other hand, the changes at the NCTC that were authorized back in March 2012 represent a massive data grab with far-reaching consequences. The changes received little public discussion before the WSJ broke the story, and they appear to substantially override the purpose of the Federal Privacy Act that Congress passed in 1974. The extension happened without public debate because of what amounts to a legal loophole: post the proposed changes to the Federal Register, and voila. Effectively, this looks like an end run around the Federal Privacy Act.

Here’s the rub. According to Angwin, DoJ Chief Privacy Officer Nancy Libin:

“… raised concerns about whether the guidelines could unfairly target innocent people, these people said. Some research suggests that, statistically speaking, there are too few terror attacks for predictive patterns to emerge. The risk, then, is that innocent behavior gets misunderstood — say, a man buying chemicals (for a child’s science fair) and a timer (for the sprinkler) sets off false alarms. An August government report indicates that, as of last year, NCTC wasn’t doing predictive pattern-matching.”
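The statistical point in that quote is the base-rate problem: when the behavior you are hunting for is vanishingly rare, even a highly accurate classifier flags mostly innocent people. Here is a minimal sketch of that arithmetic; every number in it (population size, number of actual plotters, classifier accuracy) is a hypothetical chosen purely for illustration, not a figure from the NCTC or any real system.

```python
# Base-rate illustration with hypothetical numbers: a rare target class
# means most flags are false alarms, no matter how accurate the model.

population = 250_000_000      # people screened (assumed)
plotters = 100                # actual plotters among them (assumed)
true_positive_rate = 0.99     # model flags 99% of real plotters (assumed)
false_positive_rate = 0.001   # model flags 0.1% of innocent people (assumed)

flagged_plotters = plotters * true_positive_rate
flagged_innocents = (population - plotters) * false_positive_rate
precision = flagged_plotters / (flagged_plotters + flagged_innocents)

print(f"Total people flagged: {flagged_plotters + flagged_innocents:,.0f}")
print(f"Share of flags that are real plotters: {precision:.4%}")
```

Under these assumed numbers, roughly a quarter of a million people get flagged, and fewer than one in two thousand flags points at an actual plotter. The man with the science-fair chemicals and the sprinkler timer is the overwhelmingly typical case.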

It’s hard to say whether predictive data analytics are now in use at the NCTC. It would be surprising if there weren’t pressure to experiment, given the expansion of “predictive policing” in cities around the U.S. There stand to be significant, long-lasting repercussions if the center builds the capacity to apply that capability at scale without great care and informed Congressional oversight.

One outcome is a dystopian scenario straight out of science fiction, from “thoughtcrime” to presumptions of guilt. Alistair Croll highlighted some of the associated issues involved with big data and civil rights last year.

As Angwin pointed out, the likelihood of a terrorist attack in the U.S. remains low compared with other risks Americans face every day from traffic, bees or lifestyle decisions. Since 9/11, however, public officials and Congress have had little risk tolerance. As a result, a vast, expensive intelligence and surveillance infrastructure has been built in the U.S., with limited oversight and very little accountability, as documented in “Top Secret America.”

When intelligence officials have gone public to the press as whistle-blowers regarding overspending, they have been prosecuted. Former National Security Agency staffer Thomas Drake spoke at the 2011 Web 2.0 Summit about his experience, and we talked about it in a subsequent interview.

The new rules have been in place now for months, with little public comment on the changes. (Even after it was relaunched, the nation doesn’t seem to be reading the Federal Register. These days, I’m not sure how many members of the DC media do, either.) I’m unsure whether it’s fair to blame the press, though I do wonder how media resources were allocated during the “horse race” of the presidential campaigns last year. Now, the public is left to hope that the government oversees itself effectively behind closed doors.

I would find a recent “commitment to privacy and civil liberties” by the Department of Homeland Security more convincing if the agency wasn’t confiscating and searching electronic devices at the border without a warrant. 

Does anyone think that the privacy officers whose objections were overruled in internal debates will provide the effective counterweight that protecting the Bill of Rights will require in the years to come?



  • Still_A_Logician

    I’ve always assumed that the NSA has been doing this kind of data mining for years. Why else would they need such a huge staff, why would they recruit data miners, and why would they assemble so much computing power? (Well, I guess the computing power could be for code-breaking.)

    • digiphile

I don’t think you’re alone there. Computing power for code breaking makes a lot of sense — and as encryption gets better, more of it is needed. One thing that struck me as notable here, though, is that the NCTC is working with this data, not the NSA.