Earlier this week I talked to writer and open source advocate Marco Fioretti, who has just announced the start of a study on open data for the European Union. Fioretti is a long-time supporter of open source software, which he wrote about in a chapter of the O'Reilly book Open Government. Fioretti also held a seminar about open and proprietary formats at Pisa's Sant'Anna School of Advanced Studies, a major European college in the field of economics.
Several problems impelled Fioretti to propose this study:
Releasing data that was collected for public use with public taxes is an appealing goal, but it faces innumerable hurdles. First, governments usually contract out both data collection and data analysis to private firms. Right away we're faced with the challenges of incompatible, proprietary, and even arbitrary formats, along with the firm's understandable preference to keep data to itself.
So government contracts must be very specific about the delivery of the data they commission--and not just the data, but the formulas and software used to calculate results. For instance, if a spreadsheet was used in calculating the cost of a project, the government should release the spreadsheet data and formulas to the public in an open format so that experts can check the calculations.
On top of these barriers lie the usual difficulties of inconsistently recorded data, missing metadata such as dates and times, etc.
Fioretti hopes to shine a bit more light through all this smoke, finding out what data is being released right now and how businesses are using it. He's concentrating on local governments, first because of their importance, and second because the data will be more consistent that way. The structure of government projects and costs is more similar from one city to another--even across national borders within the EU--than from one national government to another.
One phase of the study will be a survey asking cities to give examples of how the release of local data has enabled new business uses. For instance, he can take a region that used to sell digital road map information for thousands of dollars, but recently opened it up for free, and count the use of that data by businesses in that region before and after it was opened.
This phase concentrates on small businesses, because large ones usually can afford the fees charged for data that is not open.
Fioretti plans to use the results of this phase to demonstrate the value of his study and then launch a large follow-up phase. In that one, he'll just ask a large number of cities a few questions about which data they make open, with which licenses, and in which formats.
Knowing what data is available to the public does not in itself teach us anything about the economics of open data. But once Fioretti posts his results--in downloadable format under an open license, of course--other researchers can correlate the results with other information gathered about local businesses. So it may take several years to learn something practical, but the EU should be commended for trying to quantify the impact of what Tim O'Reilly calls government as a platform.
Two other new research projects in open government deserve publicity:
Open Source for America has begun a study to measure openness at a number of U.S. federal government agencies. O'Reilly Media was a founding member of OSFA and I volunteer for them. This survey is being carried out in close cooperation with all the major federal agencies.
First, OSFA is collecting public comment on the measures used. You can vote for the traits you consider important from now through June 15. OSFA will then send the most relevant questions to the federal agencies. Results depend, of course, on whether the agencies have collected the relevant information and how candid they are in reporting trends. But the questions will be in line with the administration's December 2009 Open Government Directive.
The Association of Health Care Journalists is asking journalists who work in health care to report their attempts to contact the Department of Health and Human Services, and how these contacts transpired.
Comments: 1
Jehnavi [11 June 2010 11:35 PM]
I would say the answer lies somewhere in between. There are clear differences between open government data and open source, but there are a slew of similarities. One of the big differences is that data doesn’t do anything, code does; this is to say, that the operational ability to modify data and “patch” it locally isn’t as important in data vs. code. But that doesn’t mean that users can’t “fork” open government data, especially if an agency, etc. is too stubborn to fix flaws in data sets themselves.
http://www.onlinenotebook.com/