3 Themes Emerging in Speech Technology
Innovation versus disruption is just one of the themes emerging in the speech technology space.
July 21, 2020
We would all rather be convening in person for Enterprise Connect, but like other industry events, the 2020 edition will be virtual, running from Aug. 3-6. I’m returning with another state of the market update on speech technology in the enterprise, and as a preview, I’ve got a few highlights to share.
My presentation was initially developed for the in-person event back in March, and while COVID-19 took that option away, it also had an impact on the collaboration space, as we’ve all had to adapt to working from home. Zoom is the most visible beneficiary of this change, but in general, business has been good for collaboration vendors, and with that, we’re seeing more speech technology use cases.
The presentation I had in mind for March has been updated to reflect that, and if anything, the storyline around AI-driven speech tech in the enterprise has only gotten stronger. I’ll be presenting on day one, Aug. 3, at 4:15 pm, and all the details are here. I hope you can join me, and to persuade you to attend, here are three key themes I’ll be talking about then.
Theme 1: Innovation Versus Disruption
I’ve been using the term “New Voice” for a few years now to describe speech-based applications that have moved beyond telephony. Aside from talking in-person, telephony has long been the dominant channel for voice in the workplace. That pillar remains firmly in place, but its utility has been declining for years, and in the world of digital work, other voice applications have gained traction. Most of these are for person-to-person communication, but with advances in AI-driven speech technology, new layers are being added where voice has entirely new — and different — use cases.
To understand what’s driving these changes, it’s important to distinguish between innovation and disruption. It’s easy to use these terms interchangeably, much like “communication” and “collaboration,” but I see them as being on a spectrum. In some cases, innovation leads to disruption, and in others, disruption leads to innovation, and I see both happening with speech tech, especially in the current pandemic climate.
During my talk, I’ll be exploring this further, and providing examples of both. For decision-makers, my message is to understand how these forces are different and the causal relationship between them. Both forces are driving speech tech in the enterprise, and I’ll talk about an added layer that provides a common set of implications to watch for.
Whether the force is innovation or disruption, both share an over-arching driver that I find holds for most technologies, speech tech included: both forces generally manifest themselves in the consumer world first, and from there, enterprise adoption follows. This has major implications for who will emerge as leaders in this space, and there’s a strong case to make that it won’t be the familiar names we know from UCaaS.
Theme 2: Enterprise Use Cases
During last year’s talk, use cases were still emerging, but I did manage to showcase some examples. If you recall, the big buzz then was real-time transcription, and it’s a prime example of how AI has taken a mature space like speech recognition to an entirely new level. The buzz was strong enough that one of the companies I featured — Voicea — was soon after acquired by Cisco, and now, just a short year later, this capability has become de rigueur for all collaboration offerings.
I can’t say which 2020 applications may meet with similar success, but as the summary table below shows, my focus has moved on from real-time transcription to other use cases. All of these are quite promising, and together they reflect how rich the possibilities have become, largely thanks to advances in AI.
Some of these companies will be familiar from my 2019 talk, and I wanted to revisit Otter.ai in particular. Last year, I felt they had the coolest transcription offering, and if you get my newsletter, you’ll know that I use Otter.ai for my podcasts. I’ll be talking about them mainly to show how far they’ve taken things from basic real-time transcription to a deeper value proposition around what they call “dark data.”
I’ll be talking about each use case below in detail, emphasizing how the company mentioned is just one leading example in most cases. For most of these applications, there are several companies — both start-ups and majors — doing good work, and I could have just as easily cited any of them. I view this as further validation for how quickly the speech tech space is evolving.
As promising as these offerings may be, the broader B2B opportunity is far greater. It is led primarily by the contact center, where the needs are more pressing, and it will no doubt be explored in other sessions during Enterprise Connect Virtual.
Theme 3: Setting Objectives for Decision-Makers
This topic warrants a workshop of its own, but the main message here is that no matter how compelling these applications may be, IT needs to build a business case. As with anything driven by the shiny ball of AI, it’s easy to see these speech tech use cases as silver bullets for a litany of problems. Even with this small set of use cases cited above, the benefits need to be clearly articulated and tied to a specific set of objectives.
In my talk, I’ll distinguish between business-level objectives — automation and cost reduction — and those that are employee-based — namely productivity and engagement. All of these are valid, but speech tech applications will impact them in distinct ways. These distinctions matter because the benefits that come with each use case can give rise to unintended consequences.
To mitigate this, it will be important for decision-makers to assess both the benefits and risks. I’ll be addressing key considerations here, including privacy implications and the always-there AI “dark side.” Last year, I provided several examples that are still relevant today — as is most of that talk — but even then, I was only scratching the surface for where AI can do more harm than good.
We’re certainly seeing it all around us now, as cybercrime gets bolder and broader. With enterprises and contact centers being rich targets for these actors, the risks are very real. I hope that stokes your curiosity rather than scares you away, and that you'll join me to learn about the state of speech tech in the enterprise.
This post is written on behalf of BCStrategies, an industry resource for enterprises, vendors, system integrators, and anyone interested in the growing business communications arena. A supplier of objective information on business communications, BCStrategies is supported by an alliance of leading communication industry advisors, analysts, and consultants who have worked in the various segments of the dynamic business communications market.