Uncommon Knowledge and Open Innovation: Building a Science Commons

The first session I attended today was John Wilbanks’ “Uncommon Knowledge and Open Innovation: Building a Science Commons” presentation. John talked about the process of establishing the Science Commons and how creating a science-oriented commons presented unique challenges. John first pointed out that Metcalfe’s Law works for both networked computers and documents. He then extended the law to more general data as well, something I’ve believed in and espoused for a number of years now.
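For anyone who hasn’t run into Metcalfe’s Law: it is usually stated as “the value of a network grows roughly with the square of the number of connected nodes”. Here is a minimal sketch of the arithmetic behind it, counting possible pairwise links. The function name is my own, and treating each possible link as equally valuable is a simplifying assumption, not something from John’s talk.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct pairs among n nodes: n * (n - 1) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    # The link count grows roughly with n squared, whether the "nodes"
    # are computers, documents, or individual data items.
    for n in (2, 10, 100, 1000):
        print(f"{n:>5} nodes -> {pairwise_links(n):>7} possible links")
```

The point is only that every additional item that can be linked raises the number of possible connections to all the existing ones.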

However, the science world hasn’t created efficient means of communicating knowledge the way the net did for more general topics. So far, the science world has taken paper metaphors and made them digital, which doesn’t really enable new models of easy data sharing. John likened the scientific community to a stable system with an “immune response” that resists change. Prohibitive license agreements and patents create chilling effects that prevent efficient means of communication from evolving.

John says that in the scientific community “there is no crowd”. On the net in general one can apply Wisdom of Crowds concepts to all sorts of problems, but the knowledge required to participate in scientific crowds is uncommon. This creates significant barriers to entry for the kinds of innovations that we commonly find on the web. Creating a science commons presents a clear goal with clear benefits: open rights provide for a multiplicity of incentives. Commons become the infrastructure of innovation, as we’ve seen on the web.

But sharing is hard, John says. If you want to share, you have to make sure you have the rights to do so. In scientific communities you need to worry not only about copyrights, but also about patents and privacy. For instance, when a company wanted to give away a new strain of rice with increased Vitamin A to prevent blindness caused by Vitamin A deficiency, it took an amazing four years to do so. In that time countless license agreements and patents had to be cleared.

And even once you have rightful access to data, you need to teach computers to understand the relationships between ideas. This is why people in the linked data space have started making knowledge addressable by assigning URIs to it. Once knowledge is addressable, you can start expressing relationships between concepts. The end goal here is to bring the benefits of the web to databases. Right now you can’t add Facebook-like apps to your data, but Freebase has started working on that.
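To make “addressable knowledge” a bit more concrete, here is a minimal sketch in plain Python. Every URI below is a hypothetical placeholder under example.org, and the helper names are mine; this is just one way to illustrate the idea, not how any particular linked data project stores its data.

```python
# Hypothetical URIs under example.org; none of these identify real resources.
BASE = "http://example.org/science/"
GENE = BASE + "gene/G1"
PROTEIN = BASE + "protein/P1"
DISEASE = BASE + "disease/D1"
ENCODES = BASE + "vocab/encodes"
ASSOCIATED_WITH = BASE + "vocab/associatedWith"

# Each statement is a (subject, predicate, object) triple of URIs.
STATEMENTS = [
    (GENE, ENCODES, PROTEIN),
    (PROTEIN, ASSOCIATED_WITH, DISEASE),
]


def related(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in STATEMENTS if s == subject and p == predicate]


def follow(start, *predicates):
    """Follow a chain of predicates from `start` and return the URIs reached."""
    frontier = [start]
    for predicate in predicates:
        frontier = [o for node in frontier for o in related(node, predicate)]
    return frontier


if __name__ == "__main__":
    # Walk from the gene to whatever is associated with the protein it encodes.
    print(follow(GENE, ENCODES, ASSOCIATED_WITH))
```

Because the concepts and the relationships are both named by URIs, statements published by different groups can refer to the same thing without any prior coordination.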

Creating an innovation commons for scientists is hard, but not impossible, John says. There are no one-click concepts. Instead, if you want to work with stem cells you need to negotiate a complicated license agreement. Once you finally have that, you need to start at the beginning and make your tools. Imagine if you wanted to cook a meal and you had to start by making your own cast-iron skillet! But some groups have made progress. For example, the open access movement has managed to get over 1000 journals released under a Creative Commons license. Some scientists have also figured out how to play with each other without lawyers and have created zones of certainty where they know they are in the clear when it comes to legal restraints. One such area that John mentioned is the Proteome Commons. And scientists who cheat the system can expect to be punished in grant committees and peer reviews.

Scientists have started working on expressing their data and knowledge in RDF. Unfortunately RDF presents many challenges and makes life quite a pain, but in the end the interconnection between the data makes it worth it. People have started to create ontologies for many scientific areas and it’s becoming possible to make structured queries of this data. This linked data allows for the construction of much more detailed queries that give more refined results than a Google search with a similar text-based query would. Structured data leads to collaborative question answering, and this can save many people vast amounts of time.
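Here is a rough sketch of what a structured query buys you over a text search. It is a toy pattern matcher over made-up triples, written in plain Python rather than a real RDF library or SPARQL; the `ex:` identifiers, the data, and the function names are all my own assumptions for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]

# Made-up statements; the "ex:" identifiers are placeholders, not real data.
TRIPLES: List[Triple] = [
    ("ex:geneA", "ex:encodes", "ex:proteinA"),
    ("ex:geneB", "ex:encodes", "ex:proteinB"),
    ("ex:proteinA", "ex:involvedIn", "ex:pathwayX"),
    ("ex:proteinB", "ex:involvedIn", "ex:pathwayY"),
    ("ex:geneA", "ex:label", "gene A"),
]


def match(pattern: Triple, bindings: Bindings) -> Iterator[Bindings]:
    """Yield extended bindings for each triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in TRIPLES:
        candidate = dict(bindings)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield candidate


def query(patterns: List[Triple]) -> List[Bindings]:
    """Join several patterns, sharing variable bindings across them."""
    results: List[Bindings] = [{}]
    for pattern in patterns:
        results = [b for r in results for b in match(pattern, r)]
    return results


if __name__ == "__main__":
    # "Which genes encode a protein that is involved in pathway X?"
    for row in query([
        ("?gene", "ex:encodes", "?protein"),
        ("?protein", "ex:involvedIn", "ex:pathwayX"),
    ]):
        print(row["?gene"], "via", row["?protein"])
```

A keyword search for “pathway X” would only find documents that mention the phrase, while the joined patterns answer the question by following the relationships themselves.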

Finally, John shared a story of how open data can lead to pleasant surprises. A group of science hackers ignored the terms of service on a web site that offered a visualization of data and screen-scraped the data. Then they mashed up the data with the Google Maps API and created a page that allowed side-by-side visualization of the data. The hackers then went back to the company whose terms of service they had ignored, showed them the mashup, and managed to convince the company to open up the data. This concept isn’t really new to anyone who believes in open systems. The Open Source world has shown us that great unexpected things can happen when you open your source code to the world.

I found John’s talk quite interesting since I’ve been thinking about linked data concepts for a number of years now. With stars like Tim Berners-Lee taking on linked data as their new mission, we’re going to see much more useful and intelligent data emerge in the coming years. The linked data killer application hasn’t appeared yet, but after hearing John’s talk, I wouldn’t be surprised if the scientific community comes up with it first.

Thanks for the great talk John!

  • A couple things:
    Despite being Creative Commons, many journals have the authors assign a non-commercial variant of the license (e.g. all the Oxford Journals: http://www.oxfordjournals.org/oxfordopen/ ). One of the greatest advances of the GPL is that you are allowed to use the code commercially, but you are required to share back. In choosing to go down the protectionist route of preventing commercial use, they are really tying down the progress of a true scientific commons.

    Many open source business models don’t currently play nice with the structure of companies designed by schools. A school owns a copyright to code you may write in a lab, so they can understand their share in a company that you spin off that sells that software. But if you GPL the software and then start a company to provide support, things become much more murky about the university’s share. So, schools become afraid to open source their software. It’s something that must be discussed.

  • Falafulu Fisi

    Robert Kaye said…
    However, the science world hasn’t created efficient means of communicating knowledge as the net did for more general topics.

    I have to disagree with that assertion here.

    First, I believe that scientists’ means of communicating knowledge have been made much, much easier by the advent of the internet. I don’t see that there is a major problem here; perhaps there is a problem, but it is more likely only a minor one.

    Second, the only time that I am aware that scientists protect their work (including data) is when the work or research is a commercial one, which is pretty much understandable.

    Last point: to the best of my limited knowledge, if a scientist requests collaboration with another one from a different institution, there is always a free exchange of ideas, data, and the software code that was used to generate the results described in one’s specific publication.