In today’s GigaOM, there is a great piece on the future of speech recognition. Read it below and share any comments you may have!
Is iPhone’s Voice Control the Sound of Things to Come?
Posted: 07 Jul 2009 06:00 PM PDT
When it comes to designing intuitive, compelling user interfaces, Apple is hands-down the best. Starting with the Mac but most evident with each new generation of “i” products — iMac, iPod and iPhone — the company has demonstrated time and again what so many other device makers and mobile operators have failed to understand: It’s the UI, stupid! So when Apple features Voice Control in commercials for the newest iPhone 3GS, the mobile industry should sit up and take notice.
For those who have been living under a rock for the past month, Voice Control is Apple’s VUI (voice user interface), which lets you make calls and control the iPod features on the iPhone 3GS by speaking rather than pressing numbers or navigating via the touchscreen. None of Voice Control’s functions is particularly new, and its implementation on the iPhone has met with mixed reviews. Still, Apple has an uncanny ability to recognize and deliver features that consumers find compelling; witness the incredible success of the touchscreen.
Apparently, Apple believes, as we at immr do, that speech recognition is the sound of things to come for mobile devices and applications (subscription required). Apple’s attention is a welcome development, and will undoubtedly accelerate the shift that began with the success of GOOG-411, Vlingo and other speech-enabled mobile apps. Although mobile devices are well suited to speech recognition (they do, after all, have microphones already built in), no OEM or operator to date has delivered a speech solution that is easy to use, much less promoted the feature to users as a key differentiator. Apple is changing that, and other device makers and mobile operators that fail to keep up will be left behind in the competition for users who value simpler, more intuitive UIs.
User-friendly interfaces, such as the touchscreen, have fueled adoption and use of mobile apps. So why is speech likely to be the next big innovation in mobile user interfaces? Several factors are driving developments:
- While UIs are much improved, mobile devices and apps still demand considerable user attention: viewing displays, entering text, navigating through the UI and so on. Speech-enabled solutions relieve users of these hands-on, eyes-on demands.
- Platforms such as SpinVox are opening up APIs, making it easier for developers to incorporate speech into their applications.
- Companies such as Vlingo and Google have taken advantage of sophisticated technology and an enormous base of user interactions to dramatically refine speech-recognition results.
- Synthesized speech, which once sounded “computer generated,” can now be produced in a natural-sounding way; authors and publishers recently pressed Amazon to restrict the text-to-speech feature on the latest-generation Kindle.
While the marriage of speech technologies and mobile is under way and irreversible, the transition won’t be a smooth one. First, many users undoubtedly remember past speech applications that didn’t work very well. That perception will need to be overcome; implementing speech with simple applications, as Apple has done with Voice Control, is a good way to start. Second, some applications are more compatible with speech than others. Selecting and listening to music, for instance, is a natural application: the number of songs and artists is limited, which improves the accuracy of speech recognition, and users typically listen to music in a closed environment or with a headset (ideally one with a built-in microphone), which reduces ambient noise and makes voice commands easier to recognize.
Much as RIM has carved out a loyal following by developing solutions optimized for email, there is a significant opportunity for operators and OEMs to incorporate speech into mobile devices and applications in a comprehensive way. Apple is leading the way, and others will likely follow suit.
Phil Hendrix is the founder and director of the Institute for Mobile Markets Research and a member of the GigaOM Analyst Network. His complete discussion of the impact speech technologies will have in mobile is available in the latest GigaOM Pro report, “How Speech Technologies can Transform Mobile Use” (subscription required).