"voice interface" entries
The O’Reilly Design Podcast: Moving from GUIs to VUIs.
Subscribe to the O’Reilly Design Podcast, our podcast exploring how experience design—and experience designers—are shaping business, the Internet of Things, and other domains.
In this week’s Design Podcast episode, I sit down with Tanya Kraljic, UX manager and principal designer at Nuance Communications. Kraljic recently spoke at O’Reilly’s inaugural Design Conference (you can find the complete video compilation of the event here). In this episode, we talk about the challenges of moving from graphical to voice interfaces, the voice tools ecosystem, and where she finds inspiration.
Here are a few highlights from our conversation:
We’re seeing a renewed emphasis on design at Nuance—actually, much like in the technology industry as a whole. We’ve always had great engineers who are building this very complex, very cutting-edge technology. Now, we’re augmenting that with a human-centered approach to product strategy and development, which I think is already accelerating innovation in our own company and, hopefully, will also help create better and more usable solutions as voice becomes available in all these different technologies.
How intimately we talk to our stuff depends on what it’s done for us lately.
In the first post in this series, I mentioned that we’re getting used to talking to technology. We talk to our cell phones, our cars; some of us talk to our TVs, and a lot of us talk to customer support systems. The field has yet to settle into a state of equilibrium, but I thought I would take a stab at defining some categories of conversational interfaces.
There is, of course, quite a range of intelligent assistants, but I want to consider specifically different types of conversational interactions with technology. You might have an intelligent agent that can arrange meetings, for example, figuring out attendees’ availability, and even sending meeting requests. Certainly, that’s a useful and intelligent agent, but working with it doesn’t necessarily require any conversational interaction.
Classifying conversational interfaces
As usual with these kinds of things, the boundaries can be fuzzy. So, a particular piece of technology can have aspects of multiple categories, but here’s what I propose.
Voice interfaces: Understand a few set phrases
The most basic level of speech interaction is the simple voice interface, which lets you control devices or software by speaking commands. Generally, these systems have a fixed set of actions. Saying a word or phrase is akin to using a menu system, but instead of clicking on menu items, you can speak them. You find these in cars with voice commands and Bluetooth interfaces to make phone calls or play music. It’s the same kind of system when you call into a phone tree that routes you to a particular department or person. Some of these systems allow for variations in how you say something, but for the most part, they will only understand words or phrases from a predefined list.
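At its core, this kind of fixed-phrase interface amounts to matching a recognized utterance against a predefined list. Here is a minimal hypothetical sketch in Python; the command names and trigger phrases are invented for illustration and don’t correspond to any particular product’s API:

```python
# Minimal sketch of a fixed-phrase voice command matcher.
# Assumes a speech recognizer has already produced a text transcript.
# Command names and trigger phrases are illustrative, not from a real system.

COMMANDS = {
    "call": ["call", "dial", "phone"],
    "play_music": ["play music", "play songs"],
    "navigate": ["navigate", "directions"],
}

def match_command(transcript: str):
    """Return the command whose trigger phrase starts the transcript,
    or None if the utterance is not in the predefined list."""
    text = transcript.lower().strip()
    for command, phrases in COMMANDS.items():
        if any(text.startswith(phrase) for phrase in phrases):
            return command
    return None

print(match_command("Dial home"))       # -> call
print(match_command("Tell me a joke"))  # -> None
```

Anything outside the list simply fails to match, which is exactly the limitation described above: the user must know (or guess) the allowed phrases, just as with a menu.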
A look at the underlying technology and considerations for VUI design decisions.
Download our new free report “Design for Voice Interfaces,” by Laura Klein. Editor’s note: this is an excerpt from the report.
Before we can understand how to design for voice, it’s useful to learn a little bit about the underlying technology and how it has evolved. Design is constrained by the limits of the technology, and the technology here has a few fairly significant limits.
First, when we design for voice, we’re often designing for two very different things: voice inputs and audio outputs. It’s helpful to think of voice interfaces as a conversation, and, as the designer, you’re responsible for ensuring that both sides of that conversation work well.
Voice input technology is also divided into two separate technical challenges: recognition and understanding. It’s not surprising that some of the very earliest voice technology was used only for taking dictation, given that it’s far easier to recognize words than it is to understand the meaning.
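To make the recognition/understanding split concrete, here is a hypothetical two-stage sketch in Python. The recognizer is stubbed out (real systems run acoustic and language models over audio), and the intent rules are invented for illustration:

```python
# Sketch of the two stages of voice input:
#   1. recognition: audio -> text
#   2. understanding: text -> meaning (intent + slots)
# The recognizer is a stub, and the intent rules are toy examples.

def recognize(audio: bytes) -> str:
    """Stage 1: speech recognition. Stubbed here; a real system would
    decode the audio with acoustic and language models."""
    return "set an alarm for seven"  # pretend transcript

def understand(transcript: str) -> dict:
    """Stage 2: language understanding. Maps a transcript to an
    intent plus slots, using toy keyword rules."""
    text = transcript.lower()
    if "alarm" in text:
        return {"intent": "set_alarm", "time": text.split("for")[-1].strip()}
    return {"intent": "unknown"}

result = understand(recognize(b"..."))
print(result)  # {'intent': 'set_alarm', 'time': 'seven'}
```

Dictation systems need only the first stage; an assistant needs both, which is why understanding lagged recognition historically.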
All of these things — recognition, understanding, and audio output — have progressed significantly over the past 20 years, and they’re still improving. In the ’90s, engineers and speech scientists spent thousands of hours training systems to recognize a few specific words.
When our stuff speaks to us, we exchange more than ideas.
People are really good at talking to each other. That shouldn’t be too surprising. Conversation among human beings has evolved over a very long period of time — and now we’re starting to talk to our stuff, and in some cases, it’s talking back.
Asking Siri (or Cortana or Google Now) some simple questions is just the beginning of what’s coming. In fact, we’re in the midst of a significant shift in voice and conversation technology. Companies like Amazon, Facebook, and Google are falling over each other to hire researchers and acquire related companies, and they are starting to use this talent in new and interesting ways.
This is the first post in a series of articles I’ll use to explore speech and conversational interfaces. The subject will be dialog systems in general, with a focus on the intelligent interfaces we can expect to see more of in the future. Other topics could include:
- Design considerations for spoken language systems
- Emerging research in the area
- Changes to how we interact with technology plus the social impact they might have
If you’re someone with a finely tuned hype radar, some skepticism about just how good these technologies might be is understandable. Most of the speech-to-text and automated telephone interactions available up to this point have been frustrating to use. People regularly share tips for short-circuiting interactive voice response (IVR) trees (I hear swearing helps!). And even Siri can seem clueless a lot of the time.