lca: Andrae Muys on RDF

On the first day of linux.conf.au, I ran into Andrae Muys. He hacks Java and RDF for clients who want semantic web hackery done. I have to admit that early Semantic Web hype put me off: it sounded too much like 1970s AI hype. Andrae was interesting, though, and completely free of the wide-eyed uncritical enthusiasm that characterized a lot of my early RDF engagement.

Andrae runs the Mulgara project, a Java RDF store. His goal is to be able to deal with 1E13 statements (aka tuples, facts, assertions) in three years. It’ll do 1E9 right now; the next stop is 1E11. He refers to this goal as “3 Ts: three trillion triples”. A consortium is forming around Mulgara to pursue that goal: if it coalesces, Andrae will be the coder who makes it happen.

He works as a consultant, and about 50% of his consulting involves semantic web work. One of his largest clients has a publication library, and Mulgara is the backing database for their website of journal articles. He also sees a lot of projects looking to replace WSDL-style web services with RDF. Bioinformaticians have expressed interest, but they need to be able to store more than a billion triples, and that is still a way off.

Another client is integrating enterprise databases with RDF, and this is where it got interesting. Andrae wrote a mapping layer to let you run RDF queries across a mixture of RDF and ODBC data stores. The next version of Mulgara, 1.3, will ship in February and have this relational mapping in it. A quick Google search shows a lot of RDF-relational mappings going on, but the list of other mappings he had impressed me: Lucene, RSS, mbox, ID3.
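To make the idea of a mapping layer concrete, here is a minimal sketch of one way to query across a native triple list and a relational table. It is not Mulgara's API or Andrae's design: the `ex:` names, the `people` table, and every function below are invented for illustration, and an in-memory SQLite database stands in for an ODBC source.

```python
import sqlite3

# Native RDF-style triples (subject, predicate, object). All names are invented.
NATIVE_TRIPLES = [
    ("ex:article1", "ex:writtenBy", "ex:person1"),
    ("ex:article2", "ex:writtenBy", "ex:person2"),
]


def demo_connection():
    """Build a throwaway relational source (SQLite standing in for ODBC)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
    return conn


def relational_triples(conn):
    """Expose each row of the hypothetical `people` table as a triple."""
    for person_id, name in conn.execute("SELECT id, name FROM people ORDER BY id"):
        yield (f"ex:person{person_id}", "ex:name", name)


def match(triples, pattern):
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


def authors_of(article, conn):
    """Join both sources: article -> author (native), author -> name (relational)."""
    names = []
    for _, _, person in match(NATIVE_TRIPLES, (article, "ex:writtenBy", None)):
        for _, _, name in match(relational_triples(conn), (person, "ex:name", None)):
            names.append(name)
    return names
```

Calling `authors_of("ex:article1", demo_connection())` walks from the native triples into the relational rows without the caller needing to know which store held which fact. A real store would push patterns down into SQL rather than scanning every row as this sketch does.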

I think it’s time I looked again at the world of RDF. They may yet be doing interesting things. I said as much to Andrae and he replied, “I am an engineer. In the early days it was scientists and logicians in RDF. Now the engineers have arrived, and we just want it to work and to scale.” Bold claim! If you have a favourite RDF package or practice, let me know in the comments.
