"data privacy" entries
Doug Cutting on applications of Hadoop, where "Hadoop" comes from, and the new partnership between Cloudera and O'Reilly.
Roger Magoulas, director of market research at O’Reilly and Strata co-chair, recently sat down with Doug Cutting, chief architect at Cloudera, to talk about the new partnership between Cloudera and O’Reilly, and the state of the Hadoop landscape.
Cutting shares interesting applications of Hadoop, several of which have touching human elements. For instance, he tells a story about visiting Children’s Healthcare of Atlanta and discovering the staff using Hadoop to reduce stress in babies.
Google requires quid for its quo, but it offers something many don’t: user data access.
Despite some misgivings about the company’s product course and service permanence (I was an early and fanatical user of Google Wave), my relationship with Google is symbiotic. Its “better mousetrap” approach to products and services and the breadth of its online, mobile, and behind-the-scenes offerings save me countless hours every week, in exchange for a slice of my private life, laid bare before its algorithms and analyzed for marketing purposes.
I am writing this on a Chromebook by a lake, using Google Docs and images in Google Drive. I found my way here, through the thick underbrush along a long since forgotten former fishmonger’s trail, on Google Maps after Google Now offered me a glimpse of the place as one of the recommended local attractions.
Admittedly, having my documents, my photos, my to-do lists, contacts, and much more on Google, depending on it as a research tool and mail client, map provider and domain host, is scary. And as much as I understand my dependence on Google to carry the potential for problems, the fact remains that none of those dependencies, not one shred of data, and certainly not one iota of my private life, is known to the company without my explicit, active consent.
Responses to NSA data mining and the troubling lack of technical details, Facebook's Open Compute data center, and local police growing their own DNA databases.
It’s a question of power, not privacy, and what is the NSA really doing?
In the wake of the leaks about NSA data-collection programs, the Pew Research Center conducted a national survey to measure Americans’ response. The survey found that 56% of respondents think the NSA’s telephone record tracking program is an acceptable method to investigate terrorism, and 62% said the government’s investigations into possible terrorist threats are more important than personal privacy.
Rebecca J. Rosen at The Atlantic took a look at legal scholar Daniel J. Solove’s argument that we should care about the government’s collection of our data, but not for the reasons one might think — the collection itself, he argues, isn’t as troubling as the fact that they’re holding the data in perpetuity and that we don’t have access to it. Rosen quotes Solove:
“The NSA program involves a massive database of information that individuals cannot access. … This kind of information processing, which forbids people’s knowledge or involvement, resembles in some ways a kind of due process problem. It is a structural problem involving the way people are treated by government institutions. Moreover, it creates a power imbalance between individuals and the government. … This issue is not about whether the information gathered is something people want to hide, but rather about the power and the structure of government.”
Humans as nodes, pills and electronic tattoo password authenticators, NSA surveillance leaks, and hiding data in temporal cloaks.
Collaborative sensor networks of humans, and your body may be the next two-factor authenticator
There has been much coverage recently of the Internet of Things, connecting everything from washers and dryers to thermostats to cars to the Internet. Wearable sensors — things like FitBit and health-care-related sensors that can be printed onto fabric or even onto human skin — are also in the spotlight.
Kevin Fitchard reports at GigaOm that researchers at CEA-Leti and three French universities believe these areas are not mutually exclusive and have launched a project around wireless body area networks called CORMORAN. The group believes that one day soon our bodies will be constantly connected to the Internet via sensors and transmitters that “can be used to form cooperative ad hoc networks that could be used for group indoor navigation, crowd-motion capture, health monitoring on a massive scale and especially collaborative communications,” Fitchard writes. He takes a look at some of the benefits and potential applications of such a collaborative network — location-based services would be able to direct users to proper gates or trains in busy airports and train stations, for instance — and some of the pitfalls, such as potential security and privacy issues. You can read his full report at GigaOm.
In related news, wearable sensors, and even our bodies, may be used not only to connect us to a network but also to identify us.
Facebook scraping could lead to machine-generated spam so good that it's indistinguishable from legitimate messages.
A recent blog post inquired about the incidence of Facebook-based spear phishing: the author suddenly started receiving email that appeared to be from friends (though it wasn’t posted from their usual email addresses), making the usual kinds of offers and asking him to click on the usual links. He wondered whether this was a phenomenon and how it happened — how does a phisherman get access to your Facebook friends?
The answers are “yes, it happens” and “I don’t know, but it’s going to get worse.” Seriously, my wife’s name has been used in Facebook phishing. A while ago, several of her Facebook friends said that her email account had been hacked. I was suspicious; she only uses Gmail, and hacking Google isn’t easy, particularly with two-factor authentication. So, I asked her friends to send me the offending messages. It was obvious that they hadn’t come from my wife’s account; they were Yahoo accounts with her name but an unrecognizable email address, exactly what this blogger had seen.
How does this happen? How can a phisher discover your name and your Facebook friends? I don’t know, but Facebook is such a morass of weird and conflicting security settings that it’s impossible to know just how private or how public you are. If you’ve ever friended people you don’t know (a practice that remains entirely too common), and if you’ve ever enabled visibility to friends of friends, you have no idea who has access to your conversations.
U.S. opens data, Wong tapped for U.S. chief privacy officer, FBI might read your email sans warrant, and big data spells trouble for anonymity.
U.S. government data to be machine-readable, Nicole Wong may fill new White House chief privacy officer role
The U.S. government took major steps this week to open up government data to the public. U.S. President Obama signed an executive order requiring government data to be made available in machine-readable formats, and the Office of Management and Budget and the Office of Science and Technology Policy released an Open Data Policy memo (PDF) to address the order’s implementation.
The press release announcing the actions notes the benefit the U.S. economy historically has experienced with the release of government data — GPS data, for instance, sparked a flurry of innovation that ultimately contributed “tens of billions of dollars in annual value to the American economy,” according to the release. President Obama noted in a statement that he hopes a similar result will come from this open data order: “Starting today, we’re making even more government data available online, which will help launch even more new startups. And we’re making it easier for people to find the data and use it, so that entrepreneurs can build products and services we haven’t even imagined yet.”
Reuters' Connected China, accessing Pew's datasets, Simon Rogers' move to Twitter, data privacy solutions, and Intel's shift away from chips.
Reuters launches Connected China, Pew instructs on downloading its data, and Twitter gets a data editor
Yue Qiu and Wenxiong Zhang took a look this week at a data journalism effort by Reuters, the Connected China visualization application. Qiu and Zhang report that “[o]ver the course of about 18 months, a dozen bilingual reporters based in Hong Kong dug into government websites, government reports, policy papers, Mainland major publications, English news reporting, academic texts, and think-tank reports to build up the database.”
Intrusiveness of FBI stingrays, IRS vs Fourth Amendment, Liquid Robotics' AWS of open seas, and Republicans want big data.
FBI and IRS push privacy envelope
Details about how the FBI uses stingray or IMSI-catcher technology — and how much more intrusive it is than previously known — have come to light in a tax fraud case against accused identity thief Daniel David Rigmaiden. Kim Zetter reports at Wired that the FBI, in coordination with Verizon Wireless, was able to track Rigmaiden’s location by reprogramming his air card to connect to the FBI’s fake cell tower, or stingray, when calls came to a landline controlled by the FBI. “The FBI calls, which contacted the air card silently in the background, operated as pings to force the air card into revealing its location,” Zetter explains.
The U.S. government claims it doesn’t need a warrant to use stingrays “because they don’t collect the content of phone calls and text messages and operate like pen-registers and trap-and-traces, collecting the equivalent of header information,” Zetter says, but in this particular case they got a probable-cause warrant because the stingray located and accessed the air card remotely through Rigmaiden’s apartment.
The issue at stake in this case is whether or not the court was fully informed as to the intrusiveness of the technology when it granted the warrant.
Big data and language preservation, growing data privacy concerns, and a comparison of big data to crude oil.
Preserving human language with big data
Inspired by Deb Roy’s 2011 TEDTalk, “The Birth of a Word,” Nataly Kelly at the Huffington Post’s TEDWeekends took a look at the potential effect big data could have on language — specifically, on preserving endangered and dying languages.
Celebrating Data Privacy Day, how data fits into Bill Gates' education plan, and why "long data" deserves our attention.
Data Privacy Day and the fight against “digital feudalism”
Data Privacy Day was celebrated this week. Led by the National Cyber Security Alliance, the day is meant to increase awareness of personal data protection and “to empower people to protect their privacy and control their digital footprint and escalate the protection of privacy and data as everyone’s priority,” according to the website.
Many companies used the day as an opportunity to issue transparency reports, re-informing users and customers about how their data is used and how it’s protected. Google added a new section to its transparency report, a Q&A on how the company handles personal user data requests from government agencies and courts.