"privacy" entries

Four short links: 8 July 2015

Encrypted Databases, Product Management, Patenting Machine Learning, and Programming Ethics

  1. Zero Knowledge and Homomorphic Encryption (ZDNet) — coverage of a few startups working on providing databases that don’t need to decrypt the data they store and retrieve.
  2. How Not to Suck at Making Products: Never confuse “category you’re in” with the “value you deliver.” Customers only care about the latter.
  3. Google Patenting Machine Learning Developments (Reddit) — I am afraid that Google has just started an arms race, which could do significant damage to academic research in machine learning. Now it’s likely that other companies using machine learning will rush to patent every research idea that was developed in part by their employees. We have all been in a prisoner’s dilemma situation, and Google just defected. Now researchers will guard their ideas much more combatively, given that it’s now fair game to patent these ideas, and big money is at stake.
  4. Machine Ethics (Nature) — machine learning ethics versus rule-driven ethics. Logic is the ideal choice for encoding machine ethics, argues Luís Moniz Pereira, a computer scientist at the Nova Laboratory for Computer Science and Informatics in Lisbon. “Logic is how we reason and come up with our ethical choices,” he says. I disagree with his premises.
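To make the idea in item 1 concrete, here is a toy sketch of computing on data without decrypting it. It uses textbook RSA, which happens to be multiplicatively homomorphic. This is an illustration chosen for brevity, not the scheme any of the covered startups use, and the tiny hard-coded key (N, E, D) is an assumption that offers no real security.

```python
# Toy demonstration of a homomorphic property using textbook RSA.
# Assumption: tiny hard-coded key for illustration only; never use in practice.
P, Q = 61, 53
N = P * Q   # public modulus (3233)
E = 17      # public exponent
D = 2753    # private exponent; E * D = 1 mod (P - 1) * (Q - 1)

def encrypt(message: int) -> int:
    """Encrypt an integer smaller than N with the public key."""
    return pow(message, E, N)

def decrypt(ciphertext: int) -> int:
    """Decrypt with the private key."""
    return pow(ciphertext, D, N)

# The "server" multiplies two ciphertexts without ever seeing 7 or 6.
product_ciphertext = (encrypt(7) * encrypt(6)) % N

# Only the key holder can decrypt the result, which equals 7 * 6.
print(decrypt(product_ciphertext))
```

The result is only correct while the true product stays below N. Practical systems rely on much larger keys and on schemes designed for this purpose.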

“Internet of Things” is a temporary term

The O'Reilly Radar Podcast: Pilgrim Beart on the scale, challenges, and opportunities of the IoT.

Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.

In this week’s Radar Podcast, O’Reilly’s Mary Treseler chatted with Pilgrim Beart about co-founding his company, AlertMe, and about why the scale of the Internet of Things creates as many challenges as it does opportunities. He also talked about the “gnarly problems” emerging from consumer wants and behaviors.

Four short links: 24 June 2015

Big Data Architecture, Leaving the UK, GPU-powered Queries, and Gongkai in the West

  1. 100 Big Data Architecture Papers (Anil Madan) — you’ll either find them fascinating essential reading … or a stellar cure for insomnia.
  2. Software Companies Leaving UK Because of Government’s Surveillance Plans (Ars Technica) — to Amsterdam, to NYC, and to TBD.
  3. MapD: Massive Throughput Database Queries with LLVM and GPUs (nvidia) — The most powerful GPU currently available is the NVIDIA Tesla K80 Accelerator, with up to 8.74 teraflops of compute performance and nearly 500 GB/sec of memory bandwidth. By supporting up to eight of these cards per server, we see orders-of-magnitude better performance on standard data analytics tasks, enabling a user to visually filter and aggregate billions of rows in tens of milliseconds, all without indexing.
  4. Why It’s Often Easier to Innovate in China than the US (Bunnie Huang) — We did some research into the legal frameworks and challenges around absorbing gongkai IP into the Western ecosystem, and we believe we’ve found a path to repatriate some of the IP from gongkai into proper open source.

Four short links: 9 June 2015

Parallelising Without Coordination, AR/VR IxD, Medical Insecurity, and Online Privacy Lies

  1. The Declarative Imperative (Morning Paper) — on Dataflow. …a large class of recursive programs – all of basic Datalog – can be parallelized without any need for coordination. As a side note, this insight appears to have eluded the MapReduce community, where join is necessarily a blocking operator.
  2. Consensual Reality (Alistair Croll) — Among other things we discussed what Inbar calls his three rules for augmented reality design: 1. The content you see has to emerge from the real world and relate to it. 2. Should not distract you from the real world; must add to it. 3. Don’t use it when you don’t need it. If a film is better on the TV watch the TV.
  3. X-Rays Behaving Badly: According to the report, medical devices – in particular so-called picture archive and communications systems (PACS) radiologic imaging systems – are all but invisible to security monitoring systems and provide a ready platform for malware infections to lurk on hospital networks, and for malicious actors to launch attacks on other, high value IT assets. Among the revelations contained in the report: A malware infection at a TrapX customer site spread from an unmonitored PACS system to a key nurse’s workstation. The result: confidential hospital data was secreted off the network to a server hosted in Guiyang, China. Communications went out encrypted using port 443 (SSL) and were not detected by existing cyber defense software, so TrapX said it is unsure how many records may have been stolen.
  4. The Online Privacy Lie is Unraveling (TechCrunch) — The report authors argue it’s this sense of resignation that is resulting in data tradeoffs taking place — rather than consumers performing careful cost-benefit analysis to weigh up the pros and cons of giving up their data (as marketers try to claim). They also found that where consumers were most informed about marketing practices they were also more likely to be resigned to not being able to do anything to prevent their data being harvested. Something that didn’t make me regret clicking on a TechCrunch link.

Four short links: 19 May 2015

Wrist Interactions, Kubernetes Open Source Success, Product Quality, and Value of Privacy

  1. Android Wear vs Apple Watch (Luke Wroblewski) — comparison of interactions and experiences.
  2. Eric Brewer on Kubernetes — interesting not only for insights into Google’s efforts around Kubernetes but for: There’s so much excitement we can hardly handle all the pull requests. I think we’re committing, based on the GitHub log, something like 40 per day right now, and the demand is higher than that. Each of those takes reviews and, of course, there’s a wide variety of quality on those. Some are easy to review and some are quite hard to review. It’s a success problem, and we’re happy to have it. We did scale up the team to try and improve its velocity, but also just improve our ability to interact with all of the open source world that legitimately wants to contribute and has a lot to contribute. I’m very excited that the velocity is here, but it’s moving so fast it’s hard to even know all the things that change day to day. Makes a welcome change from the code dumps that are some of Google’s other high-profile projects.
  3. We Don’t Sell Saddles Here — Stewart Butterfield, to his team, on product development and quality. Every word of this is true for every other product, too.
  4. What is Privacy Worth? (PDF) — When endowed with the $10 untrackable card, 60.0% of subjects claimed they would keep it; however, when endowed with the $12 trackable card only 33.3% of subjects claimed they would switch to the untrackable card. […] This research raises doubts about individuals’ abilities to rationally navigate issues of privacy. From choosing whether or not to join a grocery loyalty program, to posting embarrassing personal information on a public website, individuals constantly make privacy-relevant decisions which impact their well-being. The finding that non-normative factors powerfully influence individual privacy valuations may signal the appropriateness of policy interventions.

Four short links: 12 May 2015

Data Center Numbers, Utility Computing, NSA Art, and RIP CAP

  1. We Used to Build Steel Mills Near Cheap Sources of Power, but Now That’s Where We Build Datacenters: Hennessy & Patterson estimate that of the $90M cost of an example datacenter (just the facilities – not the servers), 82% is associated with power and cooling. The servers in the datacenter are estimated to only cost $70M. It’s not fair to compare those numbers directly since servers need to get replaced more often than datacenters; once you take into account the cost over the entire lifetime of the datacenter, the amortized cost of power and cooling comes out to be 33% of the total cost, when servers have a three-year lifetime and infrastructure has a 10-15 year lifetime. Going back to the Barroso and Holzle book, processors are responsible for about a third of the compute-related power draw in a datacenter (including networking), which means that just powering processors and their associated cooling and power distribution is about 11% of the total cost of operating a datacenter. By comparison, the cost of all networking equipment is 8%, and the cost of the employees that run the datacenter is 2%.
  2. Microsoft Invests in 3 Undersea Cable Projects — utility computing is an odd concept, given how quickly hardware cycles refresh. In the past, you could ask whether investors wanted to be in a high-growth, high-risk technology business or a stable blue-chip utility.
  3. Secret Power — Simon Denny’s NSA-logo-and-Snowden-inspired art makes me wish I could get to Venice. See also The Guardian piece on him.
  4. Please Stop Calling Databases CP or AP (Martin Kleppmann) — The fact that we haven’t been able to classify even one datastore as unambiguously “AP” or “CP” should be telling us something: those are simply not the right labels to describe systems. I believe that we should stop putting datastores into the “AP” or “CP” buckets. So readable!
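The 11% figure in item 1 follows from two of the quoted numbers, and the arithmetic is easy to check. This is a minimal sketch using only figures from the quote. The variable and function names are mine, and "about a third" is assumed to be exactly 1/3.

```python
# Figures quoted in item 1; "about a third" is assumed to be exactly 1/3.
power_and_cooling_share = 0.33     # amortized share of total datacenter cost
processor_power_fraction = 1 / 3   # processors' share of compute-related power draw

def processor_cost_share(power_share: float, cpu_fraction: float) -> float:
    """Share of total cost that goes to powering and cooling processors."""
    return power_share * cpu_fraction

share = processor_cost_share(power_and_cooling_share, processor_power_fraction)
print(f"{share:.0%}")
```

On these assumptions, processors account for roughly a ninth of lifetime cost, a little more than the 8% quoted for networking equipment.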

Signals from Strata + Hadoop World 2015 in London

Key insights from Strata + Hadoop World 2015 in London.

People from across the data world came together this week for Strata + Hadoop World 2015 in London. Below we’ve assembled notable keynotes, interviews, and insights from the event.

Shazam already knows the next big hit

“With relative accuracy, we can predict 33 days out what song will go to No. 1 on the Billboard charts in the U.S.,” says Cait O’Riordan, VP of product for music and platforms at Shazam. O’Riordan walks through the data points and trendlines — including the “shape of a pop song” — that give Shazam hints about hits.

Four short links: 15 April 2015

Facebook as Biometrics, Time Series Sequences, Programming Languages, and Oceanic Robots

  1. Facebook Biometrics Cache (Business Insider) — Facebook has been accused of violating the privacy of its users by collecting their facial data, according to a class-action lawsuit filed last week. This data-collection program led to its well-known automatic face-tagging service. But it also helped Facebook create “the largest privately held stash of biometric face-recognition data in the world,” the Courthouse News Service reports.
  2. The Clustering of Time Series Sequences is Meaningless (PDF) — Clustering of time series subsequences is meaningless. More concretely, clusters extracted from these time series are forced to obey a certain constraint that is pathologically unlikely to be satisfied by any data set, and because of this, the clusters extracted by any clustering algorithm are essentially random. While this constraint can be intuitively demonstrated with a simple illustration and is simple to prove, it has never appeared in the literature. We can justify calling our claim surprising since it invalidates the contribution of dozens of previously published papers. We will justify our claim with a theorem, illustrative examples, and a comprehensive set of experiments on reimplementations of previous work. From 2003, warning against sliding window techniques.
  3. Toolkits for the Mind (MIT TR) — Programming-language designer Guido van Rossum, who spent seven years at Google and now works at Dropbox, says that once a software company gets to be a certain size, the only way to stave off chaos is to use a language that requires more from the programmer up front. “It feels like it’s slowing you down because you have to say everything three times,” van Rossum says. Amen!
  4. Robots Roam Earth’s Imperiled Oceans (Wired) — It’s six feet long and shaped like an airliner, with two wings and a tail fin, and bears the message, “OCEANOGRAPHIC INSTRUMENT PLEASE DO NOT DISTURB.” All caps considered, though, it’s a more innocuous epigram than the one on a drone I saw back at the dock: “Not a weapon — Science Instrument.”

Four short links: 14 April 2015

Technical Debt, A/A Testing, NSA's Latest, and John von Neumann

  1. PyCon 2015: Technical Debt, The Monster in Your Closet (YouTube) — excellent talk from PyCon. See also slides.
  2. A/A Testing: In an A/A test, you run a test using the exact same options for both “variants” in your test. That’s right, there’s no difference between “A” and “B” in an A/A test. It sounds stupid, until you see the “results.” (via Nelson Minar)
  3. NSA Declares War on General-Purpose Computing (BoingBoing) — NSA director Michael S Rogers says his agency wants “front doors” to all cryptography used in the USA, so that no one can have secrets it can’t spy on — but what he really means is that he wants to be in charge of which software can run on any general purpose computer.
  4. John von Neumann Documentary (YouTube) — 1966 documentary from the Mathematical Association of America on the father of digital computing, who also is hailed as the father of game theory and much much more. (via Paul Walker)
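To see why the "results" of an A/A test (item 2) can be unsettling, it helps to simulate one. The sketch below is my own illustration, not code from the linked article. It runs many A/A tests in which both arms share the same true conversion rate, then counts how often a standard two-proportion z-test calls the difference significant. The function names, sample sizes, and the 10% conversion rate are all assumptions picked for the demo.

```python
import math
import random

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(trials=1000, visitors=500, rate=0.1, alpha=0.05, seed=0):
    """Fraction of simulated A/A tests that look 'significant' at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both arms draw from the same conversion rate: a true A/A test.
        a = sum(rng.random() < rate for _ in range(visitors))
        b = sum(rng.random() < rate for _ in range(visitors))
        if two_proportion_p_value(a, visitors, b, visitors) < alpha:
            hits += 1
    return hits / trials

print(false_positive_rate())
```

With a 0.05 threshold, roughly one identical-variant test in twenty should still come out "significant", which is the surprise the quote alludes to. Checking the results repeatedly while a test is running would be expected to push that rate higher.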

Four short links: 25 March 2015

Selling Customers, Classier Parsing, License Plates, and GitHub's CSS

  1. RadioShack’s Customer Data For Sale (Ars Technica) — trying to sell customer data as part of court-supervised bankruptcy.
  2. Classp: A Classier Way to Parse (Google Code) — The abstract syntax tree is what programmers typically want to work with. With class patterns, you only have two jobs: design the abstract syntax tree and write a formatter for it. (A formatter is the function that writes out the abstract syntax tree in the target language.)
  3. 4.6M License Plate Records From FOIA Request (Ars Technica) — from Oakland.
  4. Primer: the CSS toolkit and guidelines that power GitHub.