Sat, Dec 22, 2007

Andy Oram

Collaborative translation for all types of web content provided by Worldwide Lexicon

Automated computer translation is decades away from producing acceptable content for most forms of communication. Truly global communication requires a combinatorial explosion of translations (e.g., if a continent has 30 languages, 900 translation efforts are needed for all people to communicate). This kind of knowledge intensity and scaling calls for a distributed, Wikipedia-like solution, and a colleague of mine--Brian McConnell--has created one in the form of the Worldwide Lexicon.
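
As a rough check on that figure, here is a minimal Python sketch (an illustration only, not part of the Worldwide Lexicon software; the function names are hypothetical). It counts the directed source-to-target pairs among n languages. Strictly, 30 languages give 30 x 29 = 870 directed pairs, so 900 is a round approximation. For comparison, it also counts the translations needed under the assumption that everything passes through a single pivot language.

```python
def directed_pairs(n: int) -> int:
    """Count (source, target) language pairs among n languages, source != target."""
    return n * (n - 1)


def pivot_pairs(n: int) -> int:
    """Count directed translations if all text passes through one pivot language.

    Each of the other n - 1 languages needs one translation into the pivot
    and one translation out of it.
    """
    return 2 * (n - 1)


if __name__ == "__main__":
    n = 30  # the number of languages used in the example above
    print(directed_pairs(n))  # 870
    print(pivot_pairs(n))     # 58
```

The direct count grows quadratically with the number of languages, while the pivot count grows linearly, which is why the scaling problem bites so quickly.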

If you are comfortable in two or more languages, you can choose any Web page you think deserves a wider audience and translate it (or begin a translation). Other people can rate your translation or jump in and redo parts of it. The site includes an extension that works with major blog sites to let users display and edit translations of blogs.

What's new this week is that Brian has added a translation plugin (open source, like the rest of the software) that can be added easily to any web page. It lets helpers translate the links and other navigational or interactive tools of a site in addition to the content. The plugin displays a pencil on the page, and clicking on the pencil presents you with a form where you can type in a translation.

I like hearing from people around the world, whether through traditional news outlets such as those Worldpress offers or through citizen journalists such as those on Global Voices. Worldwide Lexicon may help us understand one another--or at least may lead people to hold in check the natural arrogance we all have that our worldview is the only coherent one.

tags: web 2.0

Comments: 14

  Mike Linksvayer [12.22.07 01:32 PM]

"if a continent has 30 languages, 900 translation efforts are needed for all people to communicate"

!?

http://en.wikipedia.org/wiki/Pivot_language

  MLeo [12.22.07 04:07 PM]

Nice initiative.
So, I'm thinking, I'm Dutch, and Dutch is my native language, so hey, this site hasn't been translated to Dutch yet! So, let's translate.
Except that I can't.

I suspect I need to enter a password because it's the main site to prevent spam. But, why a clear text input field?

  Frenchy [12.23.07 06:13 AM]

(fr_FR)
== Traduction collaborative pour tout types de contenu Web fournie par Worldwide Lexicon ==
La traduction automatisée par ordinateur est à des décennies de produire du contenu acceptable pour la plupart des formes de communication. Une communication véritablement mondiale exige une explosion de combinaisons de traductions (par exemple, si un continent compte 30 langues, 900 efforts de traduction sont nécessaires à tous les gens pour communiquer). Ce type d'intensité de connaissances et d'appels à grande échelle pour une solution distribuée, à la manière Wikipédia, et un de mes collègues -- Brian McConnell -- en a créé une sous la forme du Worldwide Lexicon.
(to be continued...)

  Frenchy [12.23.07 12:04 PM]

(fr_FR)
Si vous êtes à l'aise dans deux ou plusieurs langues, vous pouvez choisir n'importe quelle page Web qui mériterai selon vous un public plus large et la traduire (ou commencer une traduction). D'autres personnes peuvent évaluer votre traduction ou se lancer et refaire des parties. Le site comprend une extension qui fonctionne avec les principaux sites de blogs pour permettre aux utilisateurs d'afficher et de modifier les traductions des blogs.
Ce qui est nouveau cette semaine, c'est que Brian a ajouté un module de traduction (en open source, comme le reste du logiciel) qui peut être ajoutée facilement à n'importe quelle page Web ce qui permet à des aides de traduire les liens et les autres outils de navigation ou interactifs d'un site, en plus du contenu. Le module affiche un crayon sur la page, et en cliquant sur le crayon cela vous présente un formulaire dans lequel vous pouvez entrer une traduction.
(to be continued...)

  Brian McConnell [12.23.07 01:49 PM]

Hi,

Anyone who wants to help contribute translations or help localize the service itself can contact me directly (bsmcconnell at geemail). We're busy working on the next version of our localization tool (it does support Dutch; it just does not appear in the list of choices, which is our mistake).

Thanks, and happy holidays.

Brian McConnell

  Tim O'Reilly [12.24.07 09:04 AM]

Seems to me that the ideal would eventually be "human enhanced" automatic translation. As automated translation gets good enough for a real first draft (rather than as now, when it's easier to start with a blank page), it will accelerate translation efforts like this.

  Small Business Marketing [12.25.07 06:03 AM]

This is a great idea. However, in some arenas, based on the ability of the users to communicate in writing in their own language, you wonder what kind of quality translation you'll be getting. Who'll be responsible for quality control?

  Brian McConnell [12.27.07 10:44 AM]

Tim,

We've looked closely at using machine translation to create rough drafts (we already have an interface developed for this). One of the things we learned from experts in the field is that MT really only works for closely related languages (Spanish to Catalan for example). In these cases, the form of the languages is similar enough that a computer can do word/phrase substitution and produce something readable. We will probably add this fairly soon as an experiment.

For distant languages (English-Spanish, English-Arabic for example), the form of the language is so different that automatic translation produces stilted text. It may be technically accurate in some ways, but it does not "sound" right to a native speaker.

One interesting finding was that people are forgiving of technical errors if the translation is written well. So for example, a native Spanish speaker translates a news story from English to Spanish but gets one or two things wrong. The reader will still see it as well written Spanish, and may not even notice the technical errors (or will be able to figure out what the person meant).

I do think MT will eventually improve but for now most translators prefer that we not use it, and focus on memory aids (translation memories, dictionaries, etc).

Brian McConnell

  thuan [01.22.08 01:26 PM]

Hi Andy. Traduwiki handles this job for various types of texts. It's a collective translation service that follows one simple rule: break texts, short or long, into smaller chunks so that people can pick the portions they would like to translate. Some like to start from the beginning; others prefer to jump to the conclusion or select an ultra-short sequence of phrases. The order doesn't matter, because once completed, the slices of text are put together to form the translation. Here: http://traduwiki.org

  Ana Ramire [01.24.08 07:37 AM]

I believe that this is an important idea for the future of translation. However, it is still not clear to me how these collaborative projects are going to maintain consistency, especially terminology consistency. Experience tells me that inconsistency can create misunderstandings and inaccuracies in a text. Is there a terminology database that every translator will share, so that everybody uses terminology consistently?

  danielle [02.24.08 03:30 AM]

Do you know of any other collaborative translation initiatives?
Danielle

  Spanish Translation [09.09.08 03:11 PM]

I know of Cucumis.org as an additional collaborative translation project.
