"transparency" entries

Four short links: 3 June 2014

Machine Learning Mistakes, Recommendation Bandits, Droplet Robots, and Plain English

  1. Machine Learning Done Wrong: [M]ost practitioners pick the modeling algorithm they are most familiar with rather than pick the one which best suits the data. In this post, I would like to share some common mistakes (the don’t-s).
  2. Bandits for Recommendations: A common problem for internet-based companies is: which piece of content should we display? Google has this problem (which ad to show), Facebook has this problem (which friend’s post to show), and RichRelevance has this problem (which product recommendation to show). Many of the promising solutions come from the study of the multi-armed bandit problem.
  3. Droplets: the Droplet is almost spherical, can self-right after being poured out of a bucket, and has the hardware capabilities to organize into complex shapes with its neighbors due to accurate range and bearing. Droplets are available open-source and use cheap vibration motors and a 3D printed shell. (via Robohub)
  4. Apple’s App Store Approval Guidelines — some of the plainest English I’ve seen, especially the Introduction. I can only aspire to that clarity. If your App looks like it was cobbled together in a few days, or you’re trying to get your first practice App into the store to impress your friends, please brace yourself for rejection. We have lots of serious developers who don’t want their quality Apps to be surrounded by amateur hour.
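For readers new to the multi-armed bandit problem mentioned in item 2, here is a minimal sketch of one textbook strategy, epsilon-greedy. It is not taken from the linked post; the function name, the click rates, and the parameters are all made up for illustration.

```python
import random

def epsilon_greedy(true_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy content selection over items with hidden click rates."""
    rng = random.Random(seed)
    shows = [0] * len(true_rates)   # how often each item was displayed
    clicks = [0] * len(true_rates)  # how many clicks each item received

    def estimate(i):
        # Observed click rate; unseen items get 1.0 so each is tried at least once.
        return clicks[i] / shows[i] if shows[i] else 1.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_rates))             # explore: random item
        else:
            arm = max(range(len(true_rates)), key=estimate)  # exploit: best so far
        shows[arm] += 1
        if rng.random() < true_rates[arm]:                   # simulated user click
            clicks[arm] += 1
    return shows, clicks

# Hypothetical click-through rates for three pieces of content.
shows, clicks = epsilon_greedy([0.05, 0.10, 0.30])
print(shows, clicks)
```

Most of the impressions should drift toward the item with the highest hidden click rate, while the small exploration budget keeps testing the others.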
Four short links: 30 May 2014

Video Transparency, Software Traffic, Distributed Database, and Open Source Sustainability

  1. Video Quality Report — transparency is a great way to indirectly exert leverage.
  2. Control Your Traffic Flows with Software — using BGP to balance traffic. Will be interesting to see how the more extreme traffic managers deploy SDN in the data center.
  3. Cockroach: a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services.
  4. Linux Foundation Providing for Core Infrastructure Projects — press release, but interested in how they’re tackling sustainability—they’re taking on identifying worthies (glad I’m not the one who says “you’re not worthy” to a project) and being the non-profit conduit for the dosh. Interesting: implies they think the reason companies weren’t supporting necessary open source projects was some combination of being unsure who to support (projects you use, surely?) and how to get them money (ask?). (Sustainability of open source projects is a pet interest of mine)
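To make "versioned values" from item 3 concrete, here is a toy, single-process sketch. It is not Cockroach's design or API: the class and method names are hypothetical, and it ignores distribution, transactions, and persistence entirely.

```python
class VersionedStore:
    """Toy in-memory key/value store that keeps every version of each key."""

    def __init__(self):
        self._versions = {}  # key -> list of (timestamp, value), oldest first
        self._clock = 0      # logical clock, incremented on every write

    def put(self, key, value):
        """Append a new version of `key` and return its logical timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def get(self, key, at=None):
        """Return the newest value written at or before `at` (default: latest)."""
        for ts, value in reversed(self._versions.get(key, [])):
            if at is None or ts <= at:
                return value
        return None  # key unknown, or not yet written at time `at`

store = VersionedStore()
t1 = store.put("region", "us-east")
t2 = store.put("region", "eu-west")
print(store.get("region"), store.get("region", at=t1))
```

Keeping old versions around is what lets a reader ask for the value "as of" an earlier timestamp without blocking later writers.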

Leading by example: two stories

When health care institutions are charging outrageous prices, we need to stand up and say, "That's insane."

I was struck recently by two stories in the New York Times. The first, “Bishops Follow Pope’s Example: Opulence Is Out,” tells how bishop after bishop, either inspired by the Pope’s example or afraid of being shamed for not doing so, is moving out of his expensive, newly renovated residence and emulating Pope Francis’ emphasis on living simply. “Francis has very definitely sent out a signal, and the signal is that bishops should live like the people they pastor, and they shouldn’t be in palaces.”

I contrast this in my mind with the “do as I say, but not as I do” style of leadership shown by the US Congress on health care, where the message of “bending the cost curve on health care,” and limits on “Cadillac plans” was for everyone else. Congress’ own gold-plated plan remained in place, despite posturing to pretend that members of Congress were in the same boat as everyone else.

But when the leaders themselves don’t lead, sometimes individuals stand up to be counted. Read more…

Comment: 1

Pursuing adoption of free and open source software in governments

LibrePlanet explores hopes and hurdles.

Free and open source software creates a natural — and even necessary — fit with government. I joined a panel this past weekend at the Free Software Foundation conference LibrePlanet on this topic and have covered it previously in a journal article and talk. Our panel focused on barriers to its adoption and steps that free software advocates could take to reach out to government agencies.

LibrePlanet itself is a unique conference: a techfest with a mission, an entirely serious, feasible exploration of a world that could be different. Participants constantly ask: how can we replace the current computing environment of locked-down systems, opaque interfaces, intrusive advertising-dominated services, and expensive communications systems with those that are open and free? I’ll report a bit on this unusual gathering after talking about government.
Read more…

Comment: 1

Open data can drive partnerships with government

An exploration of themes in Joel Gurin's book Open Data Now.

As governments and businesses — and increasingly, all of us who are Internet-connected — release data out in the open, we come closer to resolving the tiresomely famous and perplexing quote from Stewart Brand: “Information wants to be free. Information also wants to be expensive.” Open data brings home to us how much free information is available and how productive it is in its free state, but one subterranean thread I found in Joel Gurin’s book Open Data Now highlights an important point: information is very expensive.

In this article, I’ll explore a few themes that piqued my interest in Gurin’s book: the value of open data, the expense it entails, the questions of how much we can use and trust it, and the role the general public and the private sector play in bringing us data’s benefits. This is not meant to be a summary or a review of Gurin’s book; it is an exploration of themes that interest me, inspired by my reading of Gurin.

Open, trustworthy, and useful

“Open data” occupies hierarchies of usefulness. One way of describing its usefulness is the structure of its presentation, as Gurin and others such as Tim Berners-Lee have pointed out. Much data is still fairly unstructured, like the reviews and social media status postings that people generate by the millions and that are funneled into eager consumption by marketing analysts. Some data is more structured, existing as tables. And finally, a tiny fragment can be reached through the RESTful APIs supported by libraries in every modern programming language. Read more…

Comment: 1

Big data and privacy: an uneasy face-off for government to face

MIT workshop kicks off Obama campaign on privacy

Thrust into controversy by Edward Snowden’s first revelations last year, President Obama belatedly welcomed a “conversation” about privacy. As cynical as you may feel about US spying, that conversation with the federal government has now begun. In particular, the first of three public workshops took place Monday at MIT.

Given the locale, a focus on the technical aspects of privacy was appropriate for this discussion. Speakers cheered about the value of data (invoking the “big data” buzzword often), delineated the trade-offs between accumulating useful data and preserving privacy, and introduced technologies that could analyze encrypted data without revealing facts about individuals. Two more workshops will be held in other cities, one focusing on ethics and the other on law.

Read more…

Comment

The technical aspects of privacy

The first of three public workshops kicked off a conversation with the federal government on data privacy in the US.

Comments: 7

The public front of the free software campaign: part I

A review of my discussion with Free Software Foundation's Zak Rogoff.

At a recent meeting of the MIT Open Source Planning Tools Group, I had the pleasure of hosting Zak Rogoff — campaigns manager at the Free Software Foundation — for an open-ended discussion on the potential for free and open tools for urban planners, community development organizations, and citizen activists. The conversation ranged over broad terrain in an “exploratory mode,” perhaps uncovering more questions than answers, but we did succeed in identifying some of the more common software (and other) tools needed by planners, designers, developers, and advocates, and shared some thoughts on the current state of FOSS options and their relative levels of adoption.

Included were the usual suspects — LibreOffice for documents, spreadsheets, and presentations; QGIS and OpenStreetMap for mapping; and (my favorite) R for statistical analysis — but we began to explore other areas as well, trying to get a sense of what more advanced tools (and data) planners use for, say, regional economic forecasts, climate change modeling, or real-time transportation management. (Since the event took place in the Department of Urban Studies & Planning at MIT, we mostly centered on planning-related tasks, but we also touched on some tangential non-planning needs of public agencies, and the potential for FOSS solutions there: assessor’s databases, 911 systems, library catalogs, educational software, health care exchanges, and so on.) Read more…

Comments: 2
Four short links: 29 January 2013

Data Jurisdiction, TimBL Frowns, Google Transparency, and Secure Tools

  1. FISA Amendment Hits Non-Citizens: FISAAA essentially makes it lawful for the US to conduct purely political surveillance on foreigners’ data accessible in US Cloud providers. […] [A] US judiciary subcommittee on FISAAA in 2008 stated that the Fourth Amendment has no relevance to non-US persons. Americans, think about how you’d feel keeping your email, CRM, accounts, and presentations on Russian or Chinese servers given the trust you have in those regimes. That’s how the rest of the world feels about American-provided services. Which jurisdiction isn’t constantly into invasive snooping, yet still has great bandwidth?
  2. Tim Berners-Lee Opposes Government Snooping: “The whole thing seems to me fraught with massive dangers and I don’t think it’s a good idea,” he said in reply to a question about the Australian government’s data retention plan.
  3. Google’s Approach to Government Requests for Information (Google Blog) — they’ve raised the dialogue about civil liberties by being so open about the requests for information they receive. Telcos and banks still regard these requests as a dirty secret that can’t be talked about, whereas Google gets headlines in NPR and CBS for it.
  4. Open Internet Tools Project: supports and incubates a collection of free and open source projects that enable anonymous, secure, reliable, and unrestricted communication on the Internet. Its goal is to enable people to talk directly to each other without being censored, surveilled or restricted.
Four short links: 28 January 2013

Informed Citizenry, TCP Chaos Monkey, Photographic Forensics, Medical Trial Data

  1. Aaron’s Army: powerful words from Carl Malamud. Aaron was part of an army of citizens that believes democracy only works when the citizenry are informed, when we know about our rights, and our obligations. An army that believes we must make justice and knowledge available to all, not just the well born or those that have grabbed the reins of power, so that we may govern ourselves more wisely.
  2. Vaurien the Chaos TCP Monkey: a project at Netflix to enhance the infrastructure tolerance. The Chaos Monkey will randomly shut down some servers or block some network connections, and the system is supposed to survive these events. It’s a way to verify the high availability and tolerance of the system. (via Pete Warden)
  3. Foto Forensics — tool which uses image processing algorithms to help you identify doctoring in images. The creator’s deconstruction of Victoria’s Secret catalogue model photos is impressive. (via Nelson Minar)
  4. All Trials Registered — Ben Goldacre steps up his campaign to ensure trial data is reported and used accurately. I’m astonished that there are people who would withhold data, obfuscate results, or opt out of the system entirely, let alone that those people would vigorously assert that they are, in fact, professional scientists.
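The Chaos Monkey idea in item 2 (shut down servers at random and check that the system survives) can be tried on paper before touching real machines. The sketch below is only a simulation under assumptions of my own: the function name, the service layout, and the server names are hypothetical, and it uses neither Vaurien nor any Netflix tooling.

```python
import random

def survival_rate(replicas, kills, trials=1000, seed=0):
    """Estimate how often every service keeps at least one live host
    when `kills` randomly chosen servers are shut down.

    replicas: dict mapping service name -> set of server names hosting it.
    """
    rng = random.Random(seed)
    servers = sorted(set().union(*replicas.values()))
    survived = 0
    for _ in range(trials):
        down = set(rng.sample(servers, kills))  # servers killed in this trial
        # The system survives if each service still has a host that is up.
        if all(hosts - down for hosts in replicas.values()):
            survived += 1
    return survived / trials

# Hypothetical layout: "web" runs on three servers, "db" on two.
layout = {"web": {"a", "b", "c"}, "db": {"c", "d"}}
print(survival_rate(layout, kills=1), survival_rate(layout, kills=2))
```

In this layout any single failure is survivable, while losing two servers at once sometimes takes out both database hosts, which is the kind of weak spot random failure testing is meant to expose.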