White House to open source Data.gov as open government data platform

The new "Data.gov in a box" could empower countries to build their own platforms.

As 2011 comes to an end, there are 28 international open data platforms in the open government community. By the end of 2012, code from the new “Data.gov-in-a-box” may help many more countries stand up their own platforms. A partnership between the United States and India on open government has borne fruit: progress on making the open data platform Data.gov open source.

In a post this morning at the WhiteHouse.gov blog, federal CIO Steven VanRoekel (@StevenVDC) and federal CTO Aneesh Chopra (@AneeshChopra) explained more about how Data.gov is going global:

As part of a joint effort by the United States and India to build an open government platform, the U.S. team has deposited open source code — an important benchmark in developing the Open Government Platform that will enable governments around the world to stand up their own open government data sites.

The development is evidence that the U.S. and India are indeed still collaborating on open government, despite India’s withdrawal from the historic Open Government Partnership (OGP) that launched in September. Chopra and VanRoekel explicitly connected the move to open source Data.gov to the U.S. involvement in the Open Government Partnership today. While we’ll need to see more code and adoption to draw substantive conclusions on the outcomes of this part of the plan, this is clearly progress.

The U.S. National Action Plan on Open Government, which represents the U.S. commitment to the OGP, included some details about this initiative two months ago, building upon a State Department fact sheet that was released in July. Back in August, representatives from India’s National Informatics Center visited the United States for a week-long session of knowledge sharing with the U.S. Data.gov team, which is housed within the General Services Administration.

“The secretary of state and president have both spent time in India over the past 18 months,” said VanRoekel in an interview today. “There was a lot of dialogue about the power of open data to shine light upon what’s happening in the world.”

The project, which was described then as “Data.gov-in-a-box,” will include components of the Data.gov open data platform and the India.gov.in document portal. Now, the product is being called the “Open Government Platform” — not exactly creative, but quite descriptive and evocative of open government platforms that have been launched to date. The first collection of open source code, which describes a data management system, is now up on GitHub.

During the August meetings, “we agreed upon a set of things we would do around creating excellence around an open data platform,” said VanRoekel. “We owned the first deliverable: a dataset management tool. That’s the foundation of an open source data platform. It handles workflow, security and the check in of data — all of the work that goes around getting the state data needs to be in before it goes online. India owns the next phase: the presentation layer.”

If the initiative bears fruit in 2012, as planned, the international open government data movement will have a new tool to apply toward open data platforms. That could be particularly relevant to countries in the developing world, given the limited resources available to many governments.

What’s next for open government data in the United States has yet to be written. “The evolution of data.gov should be one that does things to connect to web services or an API key manager,” said VanRoekel. “We need to track usage. We’re going to double down on the things that are proving useful.”

Drupal as an open government platform?

This Open Government Data platform looks set to be built upon Drupal 6, a choice that would further solidify the inroads that the open source content management system has made into government IT. As always, code and architecture choices will have consequences down the road.

“While I’m not sure Drupal is a good choice anymore for building data sites, it is key that open source is being used to disseminate open data,” said Eric Gunderson, the founder of open source software firm Development Seed. “Using open source means we can all take ownership of the code and tune it to meet our exact needs. Even bad releases give us code to learn from.”

Jeff Miccolis, a senior developer at Development Seed, concurred about how open the collaboration around the Data.gov code has been or will be going forward. “Releasing an application like this as open source on an open collaboration platform like Github is a great step,” he said. “It still remains to be seen what the ongoing commitment to the project will be, and how collaboration will work. There is no history in the git repository they have on GitHub, no issues in the issue tracker, nor even an explicit license in the repository. These factors don’t communicate anything about their future commitment to maintaining this newly minted open source project.”

The White House is hoping to hear from more developers like Miccolis. “We’re looking forward to getting feedback and improvements from the open source community,” said VanRoekel. “How do we evolve the U.S. data.gov as it sits today?”

Open data impact

From where VanRoekel sits, investing in open source, open government and open data remains important to the administration. He told me that the fact that he was hired was a “clear indication of the importance” of these issues in the White House. “It wasn’t a coincidence that the launch of the Open Government Partnership coincided with my arrival,” he said. “There’s a lot of effort to meet the challenge of open government,” according to VanRoekel. “The president has me and other people involved meeting every week, reporting on progress.”

The open questions now, so to speak, are: Will other countries use it? And to what effect? Here in the U.S., there’s already code sharing between cities. OpenChattanooga, an open data catalog in Tennessee, is using source code from OpenDataPhilly, an open government data platform built in Philadelphia by GIS software company Azavea. By the time “Data.gov in a box” is ready to be deployed, some cities, states and countries might have decided to use that code in the meantime.

There’s good reason to be careful about celebrating the progress here. Open government analysts like Nathaniel Heller have raised concerns about the role of open data in the Open Government Partnership, specifically that:

… open data provides an easy way out for some governments to avoid the much harder, and likely more transformative, open government reforms that should probably be higher up on their lists. Instead of fetishizing open data portals for the sake of having open data portals, I’d rather see governments incorporating open data as a way to address more fundamental structural challenges around extractives (through maps and budget data), the political process (through real-time disclosure of campaign contributions), or budget priorities (through online publication of budget line-items).

Similarly, Greg Michener has made a case for getting the legal and regulatory “plumbing” for open government right in Brazil, not “boutique Gov 2.0” projects that graft technology onto flawed governance systems. Michener warned that emulating the government 2.0 initiatives of advanced countries, including open data initiatives:

… may be a premature strategy for emerging democracies. While advanced democracies are mostly tweaking and improving upon value-systems and infrastructure already in place, most countries within the OGP have only begun the adoption process.

Michener and Heller both raise bedrock issues for open government in Brazil and beyond that no technology solution in and of itself will address. They’re both right: Simply opening up data is not a replacement for a Constitution that enforces a rule of law, free and fair elections, an effective judiciary, decent schools, basic regulatory bodies or civil society, particularly if the data does not relate to meaningful aspects of society.

“Right now, the problem we are seeing is not so much the technology around how to open data but more around the culture internally of why people are opening data,” agreed Gunderson. “We are just seeing a lot of bad data in-house and thus people wanting to stay closed. At some point a lot of organizations and government agencies need to come clean and say ‘we have not been managing our decisions with good data for a long time’. We need more real projects to help make the OGP more concrete.”

Heller and Michener speak for an important part of the open government community and surely articulate concerns that exist for many people, particularly for a “good government” constituency whose long term, quiet work on government transparency and accountability may not be receiving the same attention as shinier technology initiatives. The White House consultation on open government that I attended included considerable recognition of the complexities here.

It’s worth noting that Heller called the products of open data initiatives “websites,” including Kenya’s new open government platform. He’s not alone in doing so. To rehash an old but important principle, Gov 2.0 is not about “websites” or “portals” — it’s about web services and the emerging global ecosystem of big data. In this context, Gov 2.0 isn’t simply about setting up social media accounts, moving to grid computing or adopting open standards: it’s about systems thinking, where open data is used both by, for and with the people. If you look at what the Department of Health and Human Services is trying to do to revolutionize healthcare with open government data in the United States, that approach may become a bit clearer. For that to happen, countries, states and cities have to stand up open government data platforms.

The examples of open government data being put to use that excite VanRoekel are, perhaps unsurprisingly, on the healthcare front. If you look at the healthcare community pages on Data.gov, “you see great examples of companies and providers meeting,” he said, referencing two startups from a healthcare challenge that were acquired by larger providers as a result of their involvement in the open data event.

I’m cautiously optimistic about what this news means for the world, particularly for the further validation of open source in open government. With this step forward, the prospects for stimulating more economic activity, civic utility and accountability under a global open government partnership are now brighter.

  • Dave Bucci

    To be truly open, it’s important for the whole stack to be open source. For instance, other government efforts have open-sourced portal software that relies on proprietary GIS software under the covers (notably ESRI). While that’s a perfectly valid architectural choice for a system, it limits the ability of groups to replicate and innovate using the software, because the cost is a barrier to entry.

    By simply building upon a truly open source stack (e.g., OS Geo), any group, large or small, can openly innovate upon the offering.

    • Ilkka Rinne

      When talking about open data or open access web services, even more important than the software stack being Open Source is that the interfaces are based on open standards. In the case of GIS, those standards are developed by the Open Geospatial Consortium, W3C, OASIS etc.

      There should be no problem communicating between web services provided by ESRI and the ones by OS Geo if they both follow the same OGC standards.

      Ilkka Rinne