Strata Week: Why we should care about what the NSA may or may not be doing

Responses to NSA data mining and the troubling lack of technical details, Facebook's Open Compute data center, and local police departments' growing DNA databases.

It’s a question of power, not privacy — and what is the NSA really doing?

Pew Research Center national survey

In the wake of leaks about the NSA’s data-collection programs, the Pew Research Center conducted a national survey to measure Americans’ response. The survey found that 56% of respondents think the NSA’s telephone record tracking program is an acceptable way to investigate terrorism, and 62% said the government’s investigations into possible terrorist threats are more important than personal privacy.

Rebecca J. Rosen at The Atlantic took a look at legal scholar Daniel J. Solove’s argument that we should care about the government’s collection of our data, but not for the reasons one might think — the collection itself, he argues, isn’t as troubling as the fact that they’re holding the data in perpetuity and that we don’t have access to it. Rosen quotes Solove:

“The NSA program involves a massive database of information that individuals cannot access. … This kind of information processing, which forbids people’s knowledge or involvement, resembles in some ways a kind of due process problem. It is a structural problem involving the way people are treated by government institutions. Moreover, it creates a power imbalance between individuals and the government. … This issue is not about whether the information gathered is something people want to hide, but rather about the power and the structure of government.”

In a similar vein, Moxie Marlinspike at Wired tackled the argument that the NSA’s data-gathering efforts aren’t worrisome if individuals have nothing to hide. He quotes James Duane, a professor at Regent Law School and former defense attorney, and Supreme Court Justice Breyer to make the point that federal criminal laws span 50 titles of the United States Code in 27,000 pages and that the total number of laws and regulations isn’t precisely known, which makes it hard to know what you may or may not need to hide. Marlinspike writes:

“If the federal government had access to every email you’ve ever written and every phone call you’ve ever made, it’s almost certain that they could find something you’ve done which violates a provision in the 27,000 pages of federal statues or 10,000 administrative regulations. You probably do have something to hide, you just don’t know it yet.”

Marlinspike also argues that having something to hide is an important part of our growth as a society—the recent legal victories in same-sex marriage and legalization of marijuana “would probably not have been possible without the ability to break the law,” he writes. Furthermore, he argues, a dystopian world where law enforcement is 100% effective opens wide avenues for abuse of power: “…if everyone’s every action were being monitored, and everyone technically violates some obscure law at some time, then punishment becomes purely selective,” Marlinspike writes. “Those in power will essentially have what they need to punish anyone they’d like, whenever they choose, as if there were no rules at all.” You can read his full piece at Wired.

Mark Jaquith writes on Medium that, based on news reports of the PRISM program, we don’t actually know what the NSA is doing or how it is gathering data, making it impossible for citizens to judge whether to be outraged. Jaquith notes the troubling lack of technical details in Glenn Greenwald and Ewen MacAskill’s account of the PRISM program at The Guardian and in their follow-up article on Edward Snowden, who leaked the story. Jaquith points out that the authors, not Snowden, described the program as “[allowing] the agency to directly and unilaterally seize the communications off the companies’ servers.” Likewise, he notes, the Washington Post reported that “[f]rom inside a company’s data stream the NSA is capable of pulling out anything it likes.”

All of which, if true, calls for outrage. But on the other side, a report at The New York Times tells a slightly different story. Jaquith quotes from the piece:

“But instead of adding a back door to their servers, the companies were essentially asked to erect a locked mailbox and give the government the key, people briefed on the negotiations said. … The data shared in these ways, the people said, is shared after company lawyers have reviewed the FISA request according to company practice. It is not sent automatically or in bulk, and the government does not have full access to company servers.”

Jaquith points out this is “indirect and moderated” access, in direct opposition to the access Greenwald and MacAskill described as “direct and unilateral.” “The difference between these two explanations isn’t some nuanced distinction that only tech geeks should care about,” Jaquith says. “This is the difference between companies voluntarily giving the government direct and unilateral access to arbitrary customer data and companies merely complying with the law in a technically efficient way that doesn’t change the nature of the data received by the government.” He notes that, absent further corroborating evidence from Snowden, the only way Greenwald and MacAskill can be correct at this point is if everyone (all the companies involved, the sources for The New York Times, the NSA, and the U.S. President) is lying. He allows that this isn’t impossible, but stresses that the technical details matter. “There is no aspect of this story more important,” he says, “than finding out which account is accurate.” You can read his full report at Medium.

Facebook opens an Open Compute data center

Facebook opened its first data center in Europe this week in Luleå, Sweden, equipped exclusively with its Open Compute servers. In a post announcing the launch, the company described the data center as “likely to be one of the most efficient and sustainable” centers in the world and explained that the equipment is powered by 100% renewable, locally generated hydroelectric power. That power is reliable enough, the company said, that it has been able to “reduce the number of backup generators required at the site by more than 70 percent.”

Jon Brodkin notes at Ars Technica that the data center’s power usage effectiveness (PUE) rating is an impressive 1.07 and that Facebook plans to post near real-time PUE data for this center, as it does for its US data centers. Brodkin reports that Facebook’s next goal is to provide companies with an alternative to Cisco and other network vendors by releasing an Open Compute design for a top-of-rack switch that will work with any networking software.
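
For readers unfamiliar with the metric, PUE is the ratio of a facility’s total energy use to the energy consumed by its IT equipment alone, so a value of 1.0 would mean no overhead at all. The short Python sketch below works through what a rating of 1.07 implies. The 1,000 kWh IT load and the function names are made up for illustration; they are not figures or code from Facebook or Ars Technica.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(pue_rating: float, it_equipment_kwh: float) -> float:
    """Energy spent on everything other than IT gear (cooling, power conversion, etc.)."""
    return (pue_rating - 1.0) * it_equipment_kwh


# Hypothetical IT load, chosen only to make the arithmetic easy to follow.
it_load = 1000.0
extra = overhead_kwh(1.07, it_load)
print(f"Overhead at PUE 1.07: {extra:.0f} kWh per {it_load:.0f} kWh of IT load")
print(f"Round trip: PUE = {pue(it_load + extra, it_load):.2f}")
```

In other words, at a PUE of 1.07, roughly 7% on top of the energy used by the servers themselves goes to cooling, power distribution, and other facility overhead.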

Local police now “databanking” DNA

In a post at The New York Times, Joseph Goldstein took a look at the growing DNA-gathering practices of local law enforcement agencies. Instead of waiting for state and federal agencies, local agencies are developing their own DNA databases, and their methods are causing some concern. Goldstein writes:

“These local databases operate under their own rules, providing the police much more leeway than state and federal regulations. And the police sometimes collect samples from far more than those convicted of or arrested for serious offenses — in some cases, innocent victims of crimes who do not necessarily realize their DNA will be saved for future searches.”

Barry Scheck, a co-director of the Innocence Project, told Goldstein that they’ve warned local law enforcement that the public would be “disturbed” when these “rogue, unregulated” databases came to light. Goldstein reports that DNA samples are being taken from people “on the mere suspicion of a crime” and entered into a database regardless of whether the subject was charged or found guilty. Samples are also gathered from people to rule them out of a crime (say, a homeowner whose house was burglarized) but then kept on file. Goldstein notes that the Supreme Court’s recent decision in Maryland v. King was the first to address this sort of DNA “databanking” and that it could serve to accelerate the practice. “While that decision said nothing explicit about the authority of local law enforcement to keep DNA databases,” Goldstein reports, “it could well encourage local jurisdictions to push ahead, several experts said.”

Tip us off

News tips and suggestions are always welcome, so please send them along.

