Open Data: Small Pieces Loosely Joined

Chris Messina’s blog entry about GData, entitled Building a Better Mousetrap, shows how people are waking up to the new platform wars. Chris singles out Google in his post:

This continued amalgamation of services behind the Google Account Authentication has consequences beyond the momentary outcry over Google’s supposed steamrolling of companies.

Business is business and competition is a threat to any member of an ecosystem, which is why you’ve got to keep innovating, adapting and bettering to survive. But it’s different when it comes to setting protocols and standards and the seamless moving of data in and out of disparate systems. When those protocols are closed or locked up or can be sealed off at any time, the competitive environment becomes very different.

The problem that I see is Google’s ability to shut out third party services once you’ve imported yourself into the proverbial gLife. No doubt there are feeds and the aforementioned GData APIs but it’s not an open system; Google decides which ports it wants to open and for whom. Think you’ll ever be able to cross-post calendar items from 30boxes to your Google Calendar? Only if Narendra strikes a deal on your behalf — even though it’s your data. Think you’ll ever be able to share your Picasa Albums with your Flickr account? Don’t bet on it. Or — or — how about sharing your Google search history with your Yahoo account? Or merging your buddy list between Orkut and Flickr? Not a chance.

I don’t know that Chris is right to point the finger at Google. I think that every one of the big platform players is doing the same thing: trying to gain competitive advantage by tying their services together. And Google, like Yahoo!, has done a great deal to support open protocols, alongside its own “embraced and extended” versions. From conversations with Google management, I know that they think hard about how much to try to control, and how much to give up to keep the virtuous circle of the internet running. (I’ve had very specific conversations with them about Google Book Search, and am convinced that Eric and other top managers really do understand the value of openness and interoperability, and want to maintain the same “neutral switchboard” position that Google has tried to find with web search.)

So to Chris’ “Not a chance,” I reply: there is a good chance, if developers band together, think hard about what standardized protocols we need (and in which areas), and push for interoperability between systems in the areas of who, what, when, and where. These are some of the key data subsystems, if you like, of Web 2.0: identity (for people and objects, respectively) and calendaring (taking into account both place and time).

But like Chris, I do think that there’s a battle going on for the heart and soul of the future internet.

I’ve been saying so for years. Long before I was calling it Web 2.0, in my talks about the future “internet operating system,” I have always had a slide called “A Platform Beats an Application Every Time,” in which I predict that the first wave of web applications will be replaced by a second wave of consolidation, which weaves it all together into a new platform. And I provide a view of two alternative futures, one symbolized by Tolkien’s “one ring to rule them all,” and the other by David Weinberger’s “Small Pieces Loosely Joined.”

I point out that while many companies have sought to be the new Microsoft, discovering their equivalent to Win32 as the one ring, there is another model, exemplified by Linux, open source software, and the open standards of the internet. “Small pieces loosely joined” is a great name for this architecture. It’s the current architecture of the web, but will it remain that way? Chris’ post is a great reminder that the future may not be like the past, and that we need to work hard to maintain interoperability as Web 2.0 matures.

In a brainstorming session about Web 2.0 design patterns that we held at O’Reilly the other day, I made a statement that surprised even me (in the way that you can sometimes say something you hadn’t realized you thought). I was noting how rich the brainstorming was for patterns and examples supporting the principle of “harnessing collective intelligence,” while the brainstorming regarding “data is the Intel inside” was weak, and mainly coming from me. I said, “People really need to pay more attention to this area. Harnessing collective intelligence is the principle that has opened the Web 2.0 era, but data as the Intel inside is the one that will close it down.”

As databases built by collective action get to the point of increasing returns, one or more de facto standards will emerge, and each may well be owned by a single company. Regardless of good intentions, that company will most likely end up using its market power to limit competition and protect its position. The only defense against this is a vigorous pursuit of open standards in data interchange.

See also Tim Bray’s report on the Open Data portion of my recent OSCON keynote, as well as Four Big Ideas About Open Source, where I wrote:

4. Open Data. One day soon, tomorrow’s Richard Stallman will wake up and realize that all the software distributed in the world is free and open source, but that he still has no control to improve or change the computer tools that he relies on every day. They are services backed by collective databases too large (and controlled by their service providers) to be easily modified. Even data portability initiatives such as those starting today merely scratch the surface, because taking your own data out of the pool may let you move it somewhere else, but much of its value depends on its original context, now lost.

There’s some related background in the entry O’Reilly Radar Executive Briefing, and of course, in my papers, The Open Source Paradigm Shift and What is Web 2.0?

  • Great to see you spending time on promoting open data again.

    Would be even more interesting if it could become the core focus of the Web 2.0 conference, the Web 2.0 exhibition, etc.

    Possible, or too challenging for those audiences?

  • As someone who works for a large telco that sits on huge amounts of data that is not (currently) open, it is good to see sound discussion of this topic in the ‘web’ world, which I hope to help carry over to the ‘mobile’ world. If it takes a year or more on the web, though, I have to wonder how much longer it will take in the mobile world.

  • Thank you! I really like this article. It reminds me that the only constant is change. Closing essential systems ultimately fuels the succeeding wave of systems. I’m a technology promoter, and yet to me, the constancy of the web is people. People are always in the equation. People are resourceful and always adapt. Many attempts to control the web in the past have simply served as catalysts for great change. Just look at the way the web is mirrored. The web will never be ‘owned’ by Google or any other monolith.

  • Over a year later, Google and other big platforms are becoming more and more encompassing, and at the same time, applications which use these and other APIs are blooming all over.

    Even with the changes in the past five years I still struggle to put the small pieces together. Over the Internet things are loosely joined. On Earth, things are chained by gravity, economics, health.

    Musings. For me a platform is like my front door. I use it, close it behind me, and the world appears. To continue the analogy, moving is a pain and paying rent a necessity.

    Thanks. You are back on my reading list. (There’s a lot of competition.)