NSF Grant for translation of ASL to Speech

Sun

May 27
2007

In news on the "one day we'll all be able to talk to each other" (even if we still fail to understand each other!) front, Ross Stapleton-Gray sent in this fascinating note about an NSF grant for translating American Sign Language to speech. From the abstract:

This Small Business Innovation Research (SBIR) Phase I research project will demonstrate the feasibility of developing a bio-electronic portable device that translates American Sign Language (ASL), a gestural language that has no written representation, to spoken and written English. The development of such a device implies design and refinement of mechanics and electronics, as well as writing several computer applications to integrate with extant computer programs that train and practice ASL. Development efforts proposed for this project will substantially propel this device toward commercialization. The instrumental part of the research aims to obtain a fully portable gesture capturing system in two versions: wired and [unwired]. The proposed research has three goals: (1) To determine feasibility for two-arm translation, (2) To determine capability to interface with ASL instructional software, and (3) To determine if the electronics can be made robust enough for consumer use. Achievement of these goals will require (a) modification of hardware and software previously developed to handle finger spelling (one hand) to handle ASL translation (two-handed) by consumers, and (b) development of a series of communication protocols and conventions to integrate the ASL instructional and translation software, which are currently standalone applications.
American Sign Language (ASL) is the native language of many deaf and speech impaired people in the United States and Canada and the second language for relatives and others who provide services to them, making ASL the fourth most widely used language in the U.S. As a gestural language, based on visual principles, it has no written representation. Despite how pervasively this language is used, there is no automatic device on the market that can translate ASL to spoken or written English (or any other sound-based language) in the same way that there are electronic dictionaries to translate English to other spoken languages. Development of this bio-electronic instrumentation will enable native ASL users to communicate instantaneously with English users for commonplace purposes. It is anticipated that it will have special value to multiply disabled deaf and other disabled (e.g., autistic, mentally retarded, aphasic) individuals for whom acquisition of English is a challenge. This instrumentation also has applications for rehabilitation, gaming, and robotics. The proposed instrumentation overcomes limitations posed by previous inventions that could not interpret palm orientation, an essential component for recognizing distinct signs, by using digital accelerometers mounted on fingers and the back of the palm.
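To make the abstract's point about palm orientation concrete, here is a minimal sketch of how a single accelerometer on the back of the hand could indicate which way the palm faces while the hand is roughly still. This is not the grant team's method. The function names, the axis convention (z pointing out of the back of the hand, readings in units of g, +1 g on whichever axis points up at rest), and the 0.7 threshold are all assumptions made for illustration.

```python
import math

def tilt_from_gravity(ax, ay, az):
    """Return (pitch, roll) in degrees from a static accelerometer reading.

    Assumes the sensor is at rest, so the only acceleration it reports
    is the reaction to gravity (hypothetical axis convention, see above).
    """
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll

def palm_orientation(ax, ay, az, threshold=0.7):
    """Classify palm direction from the z component of the gravity vector.

    threshold is an illustrative cutoff on the normalized z value,
    not a figure taken from the abstract.
    """
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0:
        raise ValueError("accelerometer reading has zero magnitude")
    z = az / norm
    if z >= threshold:
        return "palm down"      # back of the hand faces up
    if z <= -threshold:
        return "palm up"        # back of the hand faces down
    return "palm sideways"

# Example: a hand held flat with the palm toward the floor
print(palm_orientation(0.0, 0.0, 1.0))
```

A real recognizer would also have to separate gravity from the hand's own motion and combine the palm sensor with the finger sensors, which this sketch does not attempt.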

One of the big questions we've been asking ourselves at Radar is when the revolution that hit gaming with the Nintendo Wii controller is going to hit other areas of computing, changing forever the way that we interact with our machines. It's pretty clear that the Minority Report UI is in our future -- that and more.

Confirming the idea that hackers are often playing around with this stuff first, Phil Torrone told me the other day that back when he was in advertising, he'd proposed to a client that they build an MP3 player controlled by an accelerometer -- just wave it around in patterns in the air to give it commands. He was ahead of his time (and probably ahead of the cost/performance/reliability of the hardware) but the point stands. And I remember when the Mac PowerBooks first came with accelerometers: there were immediately lots of accelerometer hacks. Tom Igoe taught an accelerometer class at ITP back in the fall of 2005. But now things are getting serious.

A couple of years back at D, Daniel Simpkins of Hillcrest Labs demoed an amazing TV remote ("the ring") that used an inertial controller to turn the TV guide into an interactive voyage. He told me recently that they are getting some traction, after a long period in which the TV guys were resistant to the new possibilities.

In some ways, accelerometer-based interfaces are just the tip of the iceberg. With the increasing power of speech recognition (check Nuance's trajectory, Google's interest in speech, and the recent acquisition of Tellme by Microsoft, etc.), we're heading for a number of crossover points, such that it's not out of the question that within a few years, the idea that you interact with a computer by typing at a keyboard is going to seem quaint.

Comments: 8

GerardM [05.28.07 02:35 AM]

You are flatly wrong where you state that ASL does not have a written representation. SignWriting is a script that is accepted in ISO-15924 (Sgnw).

Try http://www.signwriting.org for some more information.

When you are able to recognise the signed ASL, it should be possible to make a copy into SignWriting as well. Having parsed the meaning from ASL, you still have the task of changing the grammar. This is not trivial.

Thanks,
GerardM

Tim O'Reilly [05.28.07 08:54 AM]

Gerard -- I didn't say that. The folks who published the abstract of their research grant did. But thanks for the info. Good to know.

My main point was not about ASL, but about the coming change in interfaces, with this research as a data point.

Walt Tetschner [05.28.07 07:26 PM]

Looks like you have joined the folks that have been inappropriately hyping speech recognition for the last 30+ years. You need to tell the caller that has just been through one of the “did you mean...” torture sessions about the “increasing power of speech recognition”. With all due respect to the Nuance stock price, I would say that this has more to do with the financial community liking a monopolist and being fed a bunch of nonsense and little to do with any significant improvement in the technology. You are not doing the industry a service by dreaming that speech recognition technology is ever going to reach a point where its robustness is acceptable to users. This gives the designers a rationale for torturing callers (it’s really only until the flaws are eliminated). If they had to accept the reality that this is the best that it gets, then we would have designs that apply the technology appropriately. The designers would seriously look at architectures that eliminate the need to abuse the caller and implement them. Sure - it will get better, but it still will not be good enough. You would not tolerate this sort of performance from your car, lawnmower, refrigerator, air conditioner, etc. Why should you tolerate it from a speech self-service system?

Walt Tetschner

Tim O'Reilly [05.29.07 05:17 AM]

Walt,

Don't confuse bad speech interfaces with the improved quality of the tools. FWIW, David Pogue dictates all his books and he is far and away our most productive author. And watching him answer his email with voice macros is truly amazing.

Just because there are a lot of bad implementations of IVR hell doesn't mean that speech isn't a real option for many applications.

I do think it's approaching a tipping point. And I've dipped in and tried it every few years. It's way better today than it was 5 years ago, so your idea that there's no progress is not true, in my opinion.

I've just got some new speech enabled phones to test. I'll report back shortly.

NancyF [05.31.07 12:46 PM]

I agree that the Holy Grail of sign language processing is text-to-sign and sign-to-text and that text-to-speech is relatively straightforward after that.

GerardM's claim that, because there is a written form (SignWriting) with an ISO standard, there is a written representation ignores the sociolinguistic reality that this orthography is used in the US only by i) an extremely small group of advocates, and ii) the technical adepts within academia. SignWriting is one of at least 4 orthographies that have been proposed or used over the past 45 years for ASL or other sign languages. These have suffered slow or no adoption for a number of reasons, including the fact that their appearance is so different from Roman characters in common use, and thus signers (may) feel stigmatized in yet another way. He's correct, of course, that because ASL is not codified speech, the challenges of machine interpretation of the grammar encoded in gestures are indeed tremendous.

The statistic "ASL the fourth most widely used language in the U.S" unfortunately continues to be promulgated despite its debunking nearly 20 years ago by Mel Carter and Robbin Battison at a 1980 conference (familiarly known as "TSSLRT"). It's tempting to repeat it, as it brings greater attention to a deserving minority language and community.

The grant writers and authors of the press release instead can emphasize the challenges for their system and device to unpack many dimensions of movement (direction, speed, path, repetition and rhythm, force, or tension) all of which combine to convey syntactic content of signing messages. Successfully unraveling movement complexes, and finding the appropriate English correspondences, would be an accomplishment of monumental importance for natural language processing. Then there's the interesting issue of whether the users of this device will need to adjust their ordinary signing behavior to account for the non-capture of non-manual signals (eye gaze, brows, mouth, cheeks, head orientation) and body position, all syntactically relevant in ordinary ASL.

I do realize that Tim, you didn't write the press release. Just clarifying for your readers.

GerardM [06.03.07 11:36 AM]

SignWriting is the only one of the scripts that allows for use in day-to-day situations. It is the only script that is actually taught in schools worldwide.

There are scientific papers that indicate that deaf children who learned to write their own language first have an easier time learning English. For them it is a written language!

SignWriting is used worldwide, from the USA to Denmark to Nicaragua, Jordan and Brazil. It is nice that it is so nicely belittled. It is more relevant than we are led to believe in this way.

Thanks,
GerardM

Prindle [09.10.07 01:21 PM]

Has anyone developed a text-reader that translates into signs? There are so many in the deaf community that struggle with written English (or any written language for that matter), and this would greatly aid them, even though it wouldn't be in correct ASL syntax. As we know it's been available for spoken languages for many years.

Philippe Dreuw [09.16.07 02:34 AM]

Hi, there also exists the HamNoSys writing concept for sign language, and a few others.
http://www.sign-lang.uni-hamburg.de/Projects/HamNoSys.html

Sign-language-to-text (i.e. video-to-text) for American Sign Language is one of our research projects at RWTH Aachen University.
http://www-i6.informatik.rwth-aachen.de/~dreuw/database.html

Please have a look at the following video for an example of continuous sign language recognition for ASL:
http://www-i6.informatik.rwth-aachen.de/~dreuw/download/021.avi

We are also working on statistical text-to-sign-language translation.

Further information is available here:
http://www-i6.informatik.rwth-aachen.de/~dreuw/publications.php