Uncommon Knowledge and Open Innovation: Building a Science Commons

The first session I attended today was John Wilbanks’ “Uncommon Knowledge and Open Innovation: Building a Science Commons” presentation. John talked about the process of establishing the Science Commons and how creating a science-oriented commons presented unique challenges. He first pointed out that Metcalfe’s Law works for both networked computers and documents. He then went on to extend the law to more general data as well, something I’ve believed in and espoused for a number of years now.
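For anyone who hasn’t run into Metcalfe’s Law before: it says the value of a network grows roughly with the square of the number of connected nodes, because that is how fast the number of possible pairwise links grows. Here is a tiny sketch of that arithmetic. The function name is my own, and treating documents or data sets as the “nodes” is my reading of John’s extension, not something taken from his slides.

```python
def pairwise_links(n):
    """Number of distinct pairs among n connected nodes: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Ten times as many nodes gives roughly a hundred times as many possible links.
for n in (10, 100, 1000):
    print(n, pairwise_links(n))
```

Whether the nodes are computers, documents, or data sets, the point is the same: each one that joins the network makes every other one a little more useful.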

However, the science world hasn’t created as efficient a means of communicating knowledge as the net has for more general topics. So far, it has taken paper metaphors and made them digital, which doesn’t really enable new models of easy data sharing. John likened the scientific community to a stable system with an “immune response” that resists change. Prohibitive license agreements and patents create chilling effects that prevent efficient means of communication from evolving.

John says that in the scientific community “there is no crowd”. On the net in general one can apply Wisdom of Crowds concepts to all sorts of problems, but the knowledge required to participate in scientific crowds is uncommon. This creates significant barriers to the kinds of innovation we commonly find on the web. Creating a science commons presents a clear goal with clear benefits: open rights provide for a multiplicity of incentives. Commons become the infrastructure of innovation, as we’ve seen on the web.

But sharing is hard, John says. If you want to share, you have to make sure you have the rights to do so. In scientific communities you need to worry not only about copyrights, but also about patents and privacy. For instance, when a company wanted to give away a new strain of rice with increased Vitamin A to prevent blindness caused by Vitamin A deficiency, it took an amazing four years to do so. In that time, countless license agreements and patents had to be cleared before the rice could be given away.

And even once you have rightful access to data, you need to teach computers to understand it: computers need to understand the relationships between ideas. This is why people in the linked data space have started making knowledge addressable by assigning URIs to it. Once knowledge is addressable, you can start expressing relationships between concepts. The end goal here is to bring the benefits of the web to databases. Right now you can’t add Facebook-like apps to your data, but Freebase has started working on that.
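To make the “addressable knowledge” idea a bit more concrete, here is a minimal sketch in plain Python. Every URI below is a made-up placeholder under example.org and the helper name is mine; none of this comes from John’s talk or from any real vocabulary. The point is only that once each concept has its own address, a relationship is just a small statement connecting two addresses.

```python
# Hypothetical URIs; example.org is a placeholder, not a real vocabulary.
GENE = "http://example.org/gene/BRCA1"
DISEASE = "http://example.org/disease/breast-cancer"
ASSOCIATED_WITH = "http://example.org/relation/associatedWith"
LABEL = "http://example.org/relation/label"

# Each relationship is a (subject, predicate, object) statement.
triples = {
    (GENE, ASSOCIATED_WITH, DISEASE),
    (GENE, LABEL, "BRCA1"),
    (DISEASE, LABEL, "breast cancer"),
}


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects_of(GENE, ASSOCIATED_WITH))
```

Because the gene and the disease are identified by URIs rather than by strings in somebody’s private database, anyone else can publish more statements about the same addresses and the two sets of statements can be merged.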

Creating an innovation commons for scientists is hard, but not impossible, John says. There are no one-click concepts. Instead, if you want to work with stem cells you need to negotiate a complicated license agreement. Once you finally have that, you need to start at the beginning and make your own tools. Imagine if you wanted to cook a meal and you had to start by making your own cast-iron skillet! But some groups have made progress. For example, the open access movement has managed to get over 1000 journals released under a Creative Commons license. Some scientists have also figured out how to play with each other without lawyers and created zones of certainty where they know they are in the clear when it comes to legal restraints. One such area that John mentioned is the Proteome Commons. And scientists who cheat the system can expect to be punished in grant committees and peer reviews.

Scientists have started working on expressing their data and knowledge in RDF. Unfortunately, RDF presents many challenges and makes life quite a pain, but in the end the interconnection between the data makes it worth it. People have started to create ontologies for many scientific areas, and it’s becoming possible to make structured queries of this data. This linked data allows for the construction of much more detailed queries that give more refined results than a Google search with a similar text-based query would. Structured data leads to collaborative question answering, and this can save many people vast amounts of time.
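Here is a rough sketch of why a structured query can beat a text search, again in plain Python rather than a real RDF toolkit. The facts, the “ex:” names, and both helper functions are invented for illustration; a real setup would use an RDF store and a query language such as SPARQL.

```python
# Hypothetical facts; the "ex:" names are placeholders, not a real ontology.
FACTS = [
    ("ex:proteinA", "ex:expressedIn", "ex:liver"),
    ("ex:proteinA", "ex:associatedWith", "ex:diseaseX"),
    ("ex:proteinB", "ex:expressedIn", "ex:liver"),
    ("ex:proteinB", "ex:associatedWith", "ex:diseaseY"),
    ("ex:proteinC", "ex:expressedIn", "ex:brain"),
    ("ex:proteinC", "ex:associatedWith", "ex:diseaseX"),
]


def match(facts, subject=None, predicate=None, obj=None):
    """Return the facts that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in facts
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def subjects_matching_all(facts, conditions):
    """Return the subjects that satisfy every (predicate, object) condition."""
    result = None
    for predicate, obj in conditions:
        found = {s for s, _, _ in match(facts, predicate=predicate, obj=obj)}
        result = found if result is None else result & found
    return sorted(result or [])


# "Which proteins are expressed in the liver AND associated with disease X?"
liver_and_disease_x = subjects_matching_all(
    FACTS,
    [("ex:expressedIn", "ex:liver"), ("ex:associatedWith", "ex:diseaseX")],
)
print(liver_and_disease_x)
```

A keyword search for “liver disease X” would happily return all three proteins, since each one mentions at least one of those terms somewhere. The structured version only returns subjects for which both relationships actually hold.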

Finally, John shared a story of how open data can lead to pleasant surprises. A group of science hackers ignored the terms of service on a web site that offered a visualization of data and screen-scraped the data. They then mashed up the data with the Google Maps API and created a page that allowed side-by-side visualization of the data. The hackers went back to the company whose terms of service they had ignored, showed them the mashup, and managed to convince the company to open up the data. But this concept isn’t really new to anyone who believes in open systems. The Open Source world has shown us that great unexpected things can happen when you open your source code to the world.

I found John’s talk quite interesting since I’ve been thinking about linked data concepts for a number of years now. With stars like Tim Berners-Lee taking on linked data as a new mission, we’re going to see much more useful and intelligent data emerge in the coming years. The linked data killer application hasn’t appeared yet, but after hearing John’s talk, I wouldn’t be surprised if the scientific community comes up with the first one.

Thanks for the great talk, John!