Tue, Aug 25, 2009

Andy Oram

World Wide Lexicon Toolbar changes the reading experience for the other 99% of web pages

by Andy Oram | @praxagora

Brian McConnell's latest coding effort, World Wide Lexicon Toolbar, meets my criterion for a piece of critical infrastructure: after two days with it I can't get along without it, and I plan to avoid any browser that doesn't have it installed.

Brian is a highly adaptive programmer. With roots in the telecom industry and several start-ups on his resume, he also wrote Beyond Contact: A Guide to SETI and Communicating with Alien Civilizations for O'Reilly. The World Wide Lexicon project he's been working on for the past several years is again something totally different.

Install the add-on (currently experimental) in Firefox 3.5 or higher and visit a page in some language other than your default. Before your eyes, headings and text change into your native language. You can get similar effects by submitting the page to a popular translator such as Google (which is one of the tools used behind the scenes by the WWL toolbar), but the instantaneous effect of the toolbar makes you feel closer to the people whose sites you visit around the world.

There are several languages that I know well enough to get the gist of a page, but where I miss some of the details and get frustrated by gaps in my vocabulary. Therefore, I set the WWL toolbar to "Bilingual view," so each block element of the original text is shown together with its translation. The bilingual view is considerably less attractive, because it swells the size of each block element, but I can tell already that it will improve my language skills quickly.

WWL is designed for volunteer translations. If it becomes more popular, people will submit translations that are much more accurate than the machine-generated ones the WWL must fall back on currently.

What's the process behind this new dimension to web browsing? McConnell let me in on some of the magic.

Volunteer translations

McConnell invented WWL several years ago with the core notion of encouraging people to translate web pages they thought should get a wider audience. When he first told me about the idea, I was skeptical that he would get many volunteers. But then I heard of other volunteer translation efforts. For instance, there's a whole subculture of people who write subtitles for popular Hollywood films. These subtitles run afoul of copyright law, of course (and so, probably, do the copies of movies they're attached to), but they show the lengths to which crowdsourcing has progressed in the translation area.

FLOSS Manuals, a project I do volunteer work for, also finds dozens of people willing to translate its open source documentation.

McConnell's first set of tools was designed to facilitate on-the-fly translations. Web designers could enhance their web sites by downloading from the WWL site some JavaScript that made each text element on the page editable. (I blogged about this in December 2007.) The paste-in displayed a little pencil icon, signaling to viewers that they could do instant translations. All they would have to do was click on an element, and a text box would pop up where they could enter their translation. The web site would then register the translation with the central WWL site.

World Wide Lexicon API

The WWL API covers the entire life cycle of a translation: registering a translation, rating translations for quality, searching for a translation of a particular page into a particular language, and retrieving a translation. Queries can specify a minimum rating.
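
To make that life cycle concrete, here is a minimal in-memory sketch in Python. It is not the WWL API: the class and method names (TranslationStore, register, rate, search, retrieve) and the 0-to-5 rating scale are assumptions made up for illustration, and a real client would talk to the WWL servers over HTTP instead.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Translation:
    """One volunteer translation of a page (hypothetical data model)."""
    url: str
    target_lang: str
    text: str
    ratings: list = field(default_factory=list)

    @property
    def average_rating(self):
        # Unrated translations score 0.0, so a minimum-rating query skips them.
        return mean(self.ratings) if self.ratings else 0.0


class TranslationStore:
    """In-memory stand-in for a translation registry; not the real WWL service."""

    def __init__(self):
        self._items = []

    def register(self, url, target_lang, text):
        translation = Translation(url, target_lang, text)
        self._items.append(translation)
        return translation

    def rate(self, translation, score):
        translation.ratings.append(score)

    def search(self, url, target_lang, min_rating=0.0):
        # Match on page and language, filter by rating, best-rated first.
        matches = [
            t for t in self._items
            if t.url == url
            and t.target_lang == target_lang
            and t.average_rating >= min_rating
        ]
        return sorted(matches, key=lambda t: t.average_rating, reverse=True)

    def retrieve(self, url, target_lang, min_rating=0.0):
        results = self.search(url, target_lang, min_rating)
        return results[0].text if results else None
```

A caller would register a translation, let readers rate it, and then ask for the best translation of a page into a given language above some minimum rating, getting None back when nothing qualifies.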

Toolbar

The latest achievement of the WWL project is the toolbar officially released yesterday. It determines the user's native language through settings in the browser. When each page is visited, the toolbar uses the domain name and various tests on the text to make a guess about its language.
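
I don't know exactly which tests the toolbar runs, but the general idea of combining a domain-name hint with tests on the text can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name guess_language, the tiny word lists, and the weight given to the domain hint.

```python
from urllib.parse import urlparse

# Hypothetical hints: a country-code domain suggests a language.
TLD_HINTS = {"fr": "fr", "de": "de", "es": "es"}

# Tiny illustrative lists of common words; a real detector would use far more data.
STOPWORDS = {
    "en": {"the", "and", "of", "is", "to"},
    "fr": {"le", "la", "et", "les", "des", "est"},
    "de": {"der", "die", "und", "das", "ist"},
    "es": {"el", "los", "y", "las", "es", "una"},
}


def guess_language(url, text):
    """Return a best-guess language code for a page, or None if there is no evidence."""
    scores = {lang: 0 for lang in STOPWORDS}

    # Test 1: the top-level domain counts as two votes (an arbitrary weight).
    host = urlparse(url).hostname or ""
    hinted = TLD_HINTS.get(host.rsplit(".", 1)[-1])
    if hinted in scores:
        scores[hinted] += 2

    # Test 2: each common word found in the text is one vote for its language.
    for word in text.lower().split():
        for lang, words in STOPWORDS.items():
            if word in words:
                scores[lang] += 1

    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

The guess would then be compared with the language the browser reports as the user's own, and translation would kick in only when the two differ.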

The toolbar then issues an API query to see whether any human translations exist. If so, it displays the translations with a light yellow or green background.

If no one has made a human translation (which is usually the case so far) the toolbar resorts to well-known machine translation services. It can make use of Google Translate, Apertium, and Moses, each of which offers an API, and will also query Babelfish when its API is ready. Machine translations are displayed with a light blue or grey background.
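
The human-first, machine-second lookup described above boils down to a fallback chain. The Python sketch below shows that pattern only; translate_block, find_human, and the engine callables are hypothetical names, not the toolbar's code, and the returned label stands in for the background color the toolbar would pick.

```python
def translate_block(text, find_human, machine_engines):
    """Return (translated_text, source) for one block of page text.

    find_human: callable returning a human translation or None (hypothetical).
    machine_engines: callables tried in order, each wrapping one machine service.
    """
    human = find_human(text)
    if human is not None:
        return human, "human"

    for engine in machine_engines:
        try:
            result = engine(text)
        except Exception:
            # In this sketch any failure just means "try the next engine".
            continue
        if result:
            return result, "machine"

    # Nothing worked: leave the original text in place.
    return text, "untranslated"
```

The caller can use the second value of the returned pair to decide how to shade the block, so that readers can tell human translations from machine ones at a glance.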

The progressive translation used by the toolbar is also interesting. It starts with the first 10 or 20 elements, then translates heading tags (<H1>, etc.), then the larger texts, and ultimately every element on a page. (I displayed one page that embedded a Google ad, and the translator recognized and translated that text too.) McConnell is working on making the various translations run in parallel. Because translation changes the sizes of elements, the toolbar makes various accommodations to display the page as attractively as it can.
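
The ordering described above can be sketched as a small scheduling function in Python. This is my own reconstruction of the passes, not McConnell's code: the name progressive_order, the (tag, text) representation of page elements, and the 200-character threshold for "larger texts" are all assumptions.

```python
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def progressive_order(elements, first_n=10, large_text_chars=200):
    """Return element indices in the order they would be translated.

    elements: list of (tag, text) pairs in document order.
    """
    order = []
    seen = set()

    def take(indices):
        # Queue each index once, keeping the order of first appearance.
        for i in indices:
            if i not in seen:
                seen.add(i)
                order.append(i)

    # Pass 1: the first few elements, so the top of the page changes quickly.
    take(range(min(first_n, len(elements))))
    # Pass 2: headings anywhere on the page.
    take(i for i, (tag, _) in enumerate(elements) if tag.lower() in HEADING_TAGS)
    # Pass 3: the larger blocks of text.
    take(i for i, (_, text) in enumerate(elements) if len(text) >= large_text_chars)
    # Pass 4: everything that is left.
    take(range(len(elements)))
    return order
```

The function returns a queue of indices, so the requests it implies could later be issued in parallel without changing the priority order.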

In short, WWL is a cool combination of mash-ups, existing services, crowdsourcing, and Ajax. I'm sure that in a year's time I'll think back to its appearance today and be shocked at how primitive it was. But it will remain a transformative tool for me.

tags: Brian McConnell, community, crowdsourcing, documentation, Firefox add-on, peer production, publishing, wealth of networks, wisdom of crowds, World Wide Lexicon, WWL

Comments: 8

Enxaqueca [2009-08-25 03:49 PM]

This is superb! Can't wait to try...

Bruce [2009-08-25 09:24 PM]

Excellent news. It's great to see this progression of the WWL project.

bowerbird [2009-08-26 02:47 AM]

it's very very rare indeed these days that
something impresses me with the smooth
-- a combination of radical, clever, useful,
thoughtful, challenging, and cool -- _but_
this project most definitely has the smooth.

brian mcconnell, get ready to receive your
macarthur, that's what the smooth can do.

-bowerbird

Ajeet [2009-08-26 03:38 AM]

Despite its obvious benefits, it will still face a steep challenge in becoming popular. I do not see this becoming a standard, by any standards.

Bertil Hatt [2009-08-26 08:38 AM]

I might have missed something, but doesn't Google Translate offer a similar feature? It's not a stand-alone browser, you have to go through their website, etc., but you can navigate with it.

Avi [2009-08-26 09:36 AM]

For those of us who are fluent in multiple languages, it would be nice to have the ability to specify a list of languages not to translate.

Kevin [2009-08-27 02:38 AM]

@avi: post a review on the Firefox addons page ;-)

@Bertil Hatt: Google only offers translation through Google translate. The WWL offers through any engine which has an API (eg. Apertium, which supports many of the minority languages not supported by Google). Additionally, the WWL shows user-created translations! Thus users can easily machine-translate a web page, and then some of them might fix a typo here or a grammar error there, just as if the whole Web were a wiki! Also, these translations will be available through the WWL API for the creators of machine translation systems, providing more data for improvement (especially important for free and open source projects like Apertium and Moses, which rely on freely available translational data and can't (easily) make deals with publishers etc...)

Chris Blow [2009-09-03 09:53 PM]

WWL is quite useful if you are running an Open Source multi-lingual news site like Meedan: http://meedan.net

I work for Meedan; for us the advantage is that you can more easily operate a translation workflow. It's quite a beautiful system IMO given that it is totally Open and distributes the translation process across multiple servers using a sensible API. The point is that we can share translations more easily than with a centralized server -- and because it parses the front end paragraph by paragraph, the sources you are translating can rewrite parts of stories without losing your translation work.


For application UI translation, we still manage private translations, but for us having an API for our news translation team to work is really great. If you use WWL you can get all of our translations (and translation metadata, including translation ratings) for free.
