lca: Andrae Muys on RDF
On the first day of linux.conf.au, I ran into Andrae Muys. He hacks Java and RDF for clients who want semantic web hackery done. I have to admit that early Semantic Web hype put me off: it sounded too much like 1970s AI hype. Andrae was interesting, though, and completely free of the wide-eyed uncritical enthusiasm that characterized a lot of my early RDF engagement.
Andrae runs the Mulgara project, a Java RDF store. His goal is to be able to deal with 1E13 statements (aka tuples, facts, assertions) in three years. It'll do 1E9 right now; the next stop is 1E11. He refers to this goal as "3 Ts: three trillion triples". A consortium is forming around Mulgara to fund the work: if it coalesces, Andrae will be the coder to make it happen.
He works as a consultant, and about 50% of his consulting involves semantic web work. One of his largest clients has a publication library, and Mulgara is the backing database for their website of journal articles. He also sees a lot of projects looking to replace WSDL-style web services with RDF. Bioinformaticians have expressed interest, but they need to be able to store more than a billion triples, and that's still a way off.
Another client is integrating enterprise databases with RDF, and this is where it got interesting. Andrae wrote a mapping layer to let you run RDF queries across a mixture of RDF and ODBC data stores. The next version of Mulgara, 1.3, will ship in February and have this relational mapping in it. A quick Google search shows a lot of RDF-relational mappings going on, but the list of other mappings he had impressed me: Lucene, RSS, mbox, ID3.
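To make the mapping idea concrete, here is a toy sketch in Python. It is not Mulgara's API or Andrae's design: the function names, the "table/key" subject scheme, and the sample data are all invented for illustration. It only shows the general shape of the trick, where relational rows are exposed as triples so one pattern query can run over a native triple store and a mapped table together.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


def rows_to_triples(table: str, key: str,
                    rows: Iterable[Dict[str, object]]) -> Iterator[Triple]:
    """Expose each row as triples: (table/key-value, column name, cell value).

    Hypothetical mapping scheme; NULL cells (None) produce no triple.
    """
    for row in rows:
        subject = f"{table}/{row[key]}"
        for column, value in row.items():
            if column != key and value is not None:
                yield (subject, column, str(value))


def match(sources: Iterable[Iterable[Triple]],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples from any source that fit the pattern; None is a wildcard."""
    results = []
    for source in sources:
        for triple in source:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                results.append(triple)
    return results


# A "native" triple store and a relational table, both made up for this sketch.
native_store = [("article/7", "cites", "article/3")]
article_rows = [
    {"id": 7, "title": "Scaling RDF", "year": 2007},
    {"id": 3, "title": "Triples", "year": None},
]

mapped = list(rows_to_triples("article", "id", article_rows))
about_7 = match([native_store, mapped], s="article/7")
```

A real mapping layer would presumably translate the query into SQL and push it down to the database rather than materialising every row, but the view the query author sees is the same: one graph, several backing stores.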
I think it's time I looked again at the world of RDF. They may yet be doing interesting things. I said as much to Andrae and he replied, "I am an engineer. In the early days it was scientists and logicians in RDF. Now the engineers have arrived, and we just want it to work and to scale." Bold claim! If you have a favourite RDF package or practice, let me know in the comments.
tags: open source
Comments: 6
I'm the author of the Redland RDF Libraries and I work for Yahoo!'s Media Group in Sunnyvale, California.
We're using RDF at the "data + a little OWL" level for a bunch of things. It's part of the backend of a couple of real and live sites - Yahoo! Food, Yahoo! TV, with several more coming soon. I mentioned this in Semantic Web Hype on my blog back in November.
RDF is being used for its flexible and dynamic linking of data and metadata, along with graph-like querying of the result. At present this is internal, and you cannot see the RDF files on any Yahoo! site domain.
This quarter we are designing and coding for larger projects with a lot more triples, but that is not yet public.
Finally, Yahoo! has agreed to contribute some of the code changes to Redland made during this work back to the open source Redland project.
We have developed a two-factor authentication solution. We use RDF to auto-populate the one-time passcode for the user into the password field in our Firefox plugin. As an added bonus, our token client will validate the SSL cert for the user, so: no phishing sites, no fat-fingered passcodes. RDF makes it very user friendly.
For anyone who hasn't heard of Mulgara, it used to be called Kowari. We had some problems with the company that bought the "initial contributor" right under the MPL, so we forked, went to a new license, and renamed the system Mulgara. It can already do a lot more than Kowari, and more is coming.
In defense of myself and a few other people... Andrae won't be the only one doing the work! We also appreciate any new contributors!
I don't know whether it counts as a package or practice, but Planet RDF is definitely a favourite...
My adtech group at IBM has recently open-sourced an enterprise RDF store featuring authentication and permissions, distributed clients with offline caching of subsets of data, graph versioning (to provide a fully auditable trail of data updates), and more. It's known as the IBM Semantic Layered Research Platform and it can be found at: http://ibm-slrp.sourceforge.net/
I couldn't agree more with the observation that the engineers seem to be arriving at the semantic-web party. I'm a latecomer myself and an engineer, so the claim rings true to me.
Lee
RECENT COMMENTS
- Lee Feigenbaum on lca: Andrae Muys on RDF: My adtech group at IBM ...
- Danny on lca: Andrae Muys on RDF: I don't know whether it...
- Paul Gearon on lca: Andrae Muys on RDF: For anyone who hasn't h...
- Nick Owen on lca: Andrae Muys on RDF: We have developed a two...
- Dave Beckett on lca: Andrae Muys on RDF: I'm the author of the R...
- Ian Davis on lca: Andrae Muys on RDF: Hi Nat, You should come...
Ian Davis [01.15.07 06:11 AM]
Hi Nat, You should come and take a look at what we're doing at Talis with our platform. The core of this is a very large scale storage engine based on RDF and full-text indexing with open APIs. We have a few application examples such as Cenote ( http://cenote.talis.com ) which is entirely driven from the open RDF APIs such as this one http://api.talis.com/bf/stores/ukbib/items
More docs are on our developer network at http://www.talis.com/tdn/platform
and we blog on using the platform at http://blogs.talis.com/panlibus/