
'Cool Stuff' at SpeechTek 2014

At the end of SpeechTek, I saw a lot of great uses for voice recognition, but I also saw a lot of vendors thinking they could solve any and all problems with voice recognition.

Henry Dewing

August 22, 2014



When I worked at Compaq Computer many years ago, Michael Capellas was the new CEO. His fondest wish and prime directive in product planning meetings was that we should make "cool stuff" so that our customers would love to buy from Compaq more than from our competitors. He would have had a ball at the recent SpeechTek conference in New York. It's no longer about speech recognition – a technology that has finally hobbled into the zone of market acceptance – it is now about the cool capabilities that speech embeds itself into: telematics, intelligent homes, digital assistants, telemedicine, and robotics.

Nuance was a powerhouse presence, occupying the third floor of the Marriott Marquis at Times Square with meeting rooms and whisper suites. The event was co-located with CRM Evolution and Customer Service Experience (so how many times did I hear "voice-enabled IVR"? Too many, way too many), so one of the primary traditional use cases for speech recognition was front and center. I even heard one vendor talking about "human-assisted self-service": when the voice recognition faltered, the call was routed to a call center full of agents who were paid based on how many calls they could quickly listen to (on fast forward to reduce customer lag) and forward to the correct line. Gamification comes to work as a center full of Millennials is excited to compete for the highest position on the bell-shaped curve, for a few extra bucks a day. There was an array of other vendors, from Philips to XOWi and from IBM to West Interactive, all talking to and being understood by voice recognition engines.

Some of the cooler things I heard or saw included:

Robots – Two robots stick in my mind following my visit. Both are for the home, and neither vacuums the floor (maybe a later product enhancement). Both incorporated voice and facial recognition to make sure they were only doing what the robot owner wanted. The first was an 11-inch-tall device for generic home use. It could take pictures (the speaker called them selfies, but the robot was taking pictures of people, not of itself); read bedtime stories to kids (not sure how that will fly with the kids or if they will nod off to sleep to a robotic recitation of Robin Hood); keep a family calendar; search the Web; store recipes; and lots more.

The other was tuned for home health care. It is able to interface with blood pressure systems, dispense medication, contact the doctor, adjust exercise/rehabilitation plans based on current status, etc.

Both robots were basically Internet-connected tablets with wheels, voice recognition capabilities, and some purpose-built software.

Digital Personal Assistants – There was a lot of talk about how a digital assistant resident on a smartphone (or a wearable device) can help find information, connect to people, set alarms/reminders, and simplify all the tasks we currently undertake with our phone. The most fascinating story was sandwiched between the lines of several presentations, where I heard, "Of course, I could do this by talking to my smartphone and activating an app that does [insert other voice-activated actions performed by other devices]." It occurred to me that any discussion of microphones embedded in refrigerators or door locks was mostly pointless – just talk to your phone and let it translate and talk digitally to the rest of the Internet of Things for you.

Geofencing – This is not a speech technology, but when Honeywell (and others) were talking about smart/connected homes, they talked about geofencing. When the homeowner approaches their smart home (with location reported by cellular or GPS, or detected when the phone connects to a Bluetooth or Wi-Fi signal in the house), the lights come on, the HVAC is adjusted, and mood lighting is set! I had seen geofencing in video security, and even in some business applications (like rental car tracking or field force management), but this opened a whole new realm of possibilities and pitfalls. One audience member asked, "What if the data is breached and as long as a thief knows you aren't home, they can continue to plunder your belongings?" A farfetched example perhaps, but how is that geolocation data going to be protected as it is transmitted across the network?
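For readers who want to see the idea in concrete terms, here is a minimal Python sketch of a circular geofence. It is my own illustration, not anything Honeywell or the other vendors showed: the 200-meter radius, the function names, and the action names are all assumptions, and a real product would also have to cope with GPS jitter and the Bluetooth/Wi-Fi detection mentioned above.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(home, position, radius_m=200.0):
    """True if position lies within radius_m of home (both are (lat, lon))."""
    return haversine_m(*home, *position) <= radius_m


def on_location_update(home, previous, current, radius_m=200.0):
    """Return the (hypothetical) actions to fire when the phone enters the fence.

    Actions fire only on the outside-to-inside transition, so the lights are
    not re-triggered on every location report while the owner is at home.
    """
    was_inside = previous is not None and inside_geofence(home, previous, radius_m)
    is_inside = inside_geofence(home, current, radius_m)
    if is_inside and not was_inside:
        return ["lights_on", "adjust_hvac", "set_mood_lighting"]
    return []
```

Firing only on the crossing, rather than on every report inside the circle, is the design choice that matters here. It is also a reminder that the system has to keep a running record of where you were a moment ago, which is exactly the data the audience member was worried about.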

Far Field Voice Recognition – This is just fancy talk for eliminating background noise so that a device (like a thermostat) can be instructed by voice from across the room: "Increase the temperature by 3 degrees, I'm cold!" Technologies like noise cancellation and even voice biometrics were tossed around as ways to improve far field voice recognition, as was using a trigger phrase like "Hello Dragon," which alerts a device to listen, and humans to quiet down! Far field voice could be used to tune the TV in the living room, or talk to a robotic nurse in your hospital room – as opposed to near field recognition, as with a smartwatch or other wearable devices that work near your head and therefore near the source of the voice they are trying to recognize.
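The trigger-phrase idea is easy to sketch once the audio has already been turned into text. The Python below is a toy illustration under that assumption: real devices spot the wake phrase in the audio itself, and the command grammar, the function name, and the thermostat setpoint here are hypothetical, not any vendor's API.

```python
import re

WAKE_PHRASE = "hello dragon"  # the trigger phrase quoted above

# Hypothetical command grammar: "increase/decrease the temperature by N degrees"
COMMAND = re.compile(r"\b(increase|decrease) the temperature by (\d+) degrees?\b")


def handle_utterance(transcript, setpoint):
    """Return the new thermostat setpoint after hearing one transcribed utterance.

    Speech that does not start with the wake phrase is ignored, which is how
    the device avoids acting on ordinary conversation in the room.
    """
    text = transcript.lower().strip()
    if not text.startswith(WAKE_PHRASE):
        return setpoint  # not addressed to the device
    match = COMMAND.search(text[len(WAKE_PHRASE):])
    if match is None:
        return setpoint  # addressed to the device, but not a known command
    delta = int(match.group(2))
    if match.group(1) == "increase":
        return setpoint + delta
    return setpoint - delta
```

A call such as `handle_utterance("Hello Dragon, increase the temperature by 3 degrees, I'm cold!", 68)` changes the setpoint, while the same sentence without the trigger phrase leaves it alone.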

There were a bunch of other interesting ideas – talking thermostats and TV remote controls, voice-activated door locks, and wearable devices that tell you when to exercise. I saw a few folks wearing smart watches (one fellow had on two of them!) and, of course, Google Glass. The Google Glass wearers seemed to know everyone was uncomfortable with their 'creepy' ability to film any of us at a moment's notice. They seemed much more aware than the first owners of Bluetooth headsets, who proudly paraded around speaking at the top of their lungs like crazy Aunt Edna who only spoke to herself. Each of these devices has voice recognition built in, raising the question: how many microphones are too many? Can't we just voice-enable our wearable technology (or connect to a cloud-resident voice recognition-as-a-service capability) and use Siri or Cortana to talk to all these other connected devices? Do my refrigerator, front door, car (add the radio in my car), office desk, and sunglasses all need a microphone, too?

At the end of the day, I saw a lot of great uses for voice recognition, but I also saw a lot of vendors thinking they could solve any and all problems with voice recognition. I do not need or want voice recognition between me and my pet's evening bowl of Kibbles 'n Bits, or robots reading bedtime stories to my kids – it would devalue the experience, not enhance it. Because of the overzealous pursuit of technology, this market is headed for some shakeout before too long, in my opinion.

About the Author

Henry Dewing

Henry Dewing has participated in every aspect of the telecommunications market. In addition to advising industry leaders as a management consultant and industry analyst, he has worked for service providers, equipment manufacturers, solutions developers, integrators and resellers in roles ranging across all aspects of the industry and has witnessed major technical and market changes, from stored program control and advanced intelligent networks to cloud and social business. Whether you call it Information & Communications Technology, Convergence, Collaboration Services, or Systems of Engagement, he has observed and taken advantage of the trends in the market.

Henry began his career in operations, engineering, and corporate finance at Verizon, and has worked in product management, marketing, and business development for Compaq, Intel, and Nokia. He has spent over a decade as an advisor to the industry as a Managing Consultant at A.T. Kearney and a Principal Analyst at Forrester Research, and is currently a Senior Evangelist with Avaya, highlighting the successes customers are having with cloud-based solutions built on Avaya platforms. He holds a BS in Physics-Engineering and Mathematics from Washington and Lee University and an MBA from the Darden School at the University of Virginia.