Enterprise Connect Preview: The State of Enterprise Speech Tech

Some speech tech applications will be familiar, but newer, more transformative applications are on the way, and IT leaders need to consider the bigger picture here.

Jon Arnold

March 11, 2024

Image: Ramcreative - stock.adobe.com

Speech technology is a wide-ranging topic, and as with many other things now, it’s becoming driven and defined by artificial intelligence (AI). While AI has arguably become the most overused term in tech, it continues to take speech tech to new levels that go well beyond legacy technology, and in far less time. This is my sixth year doing an annual update on speech tech for the enterprise at Enterprise Connect, and the changes since 2023 are the most profound since I started tracking this space.

My updates have always focused on enterprise use cases of speech tech, not just because this aligns with the main themes on tap at Enterprise Connect, but also because these use cases are less familiar than those in the contact center, which is where most of the vendors are focused. The enterprise focus is certainly justified: enterprise use cases are organization-wide, and when powered by AI, they are taking on a bigger role than just speech transcription or virtual assistants.

Some speech tech applications will be familiar, but newer, more transformative applications are on the way, and IT leaders need to consider the bigger picture here. Speech tech applications that are now part of UCaaS and other productivity tools will remain widely used, but AI has also moved on to use cases for speech in the enterprise that go beyond what IT may currently be thinking. This is what I’ll be covering during my Enterprise Connect session, and the following is a preview of what you can expect when joining us.

 

Two Big Changes Since 2023

TLAs (three-letter acronyms) are everywhere with AI. What conversational AI (CAI) was to 2022, and generative AI (GAI) was to 2023, large language models (LLMs) are to 2024. All of these are integral to the AI story, and no doubt something new will emerge for 2025. I have reviewed the importance of CAI and GAI in earlier updates, and although it’s only March, it seems like we’ve already had multiple waves of evolution for LLMs in 2024.

Change with AI is happening faster than anyone can absorb it – for buyers, sellers, and especially end users. On a technology level, the first change from last year is that LLMs are the current driver of speech tech innovation, and LLM needs to be in the lexicon for IT leaders. (I’ll leave it at that for this article, as the LLM space quickly becomes a rabbit hole of AI geekspeak for those who really want to know more.)

The second big change for 2024 is how advanced language-based AI technologies have become. More specifically, this is about the core trio of NLP, NLU and NLG. This is the branch of AI that is focused on language, and it has the most direct impact on speech tech. Until recently, the first two, natural language processing (NLP) and natural language understanding (NLU), provided the foundation for the most commonly used speech tech applications.

These are the tools that enable humans and machines to communicate with each other, providing the bridge for AI to bring new forms of automation and efficiency to the workplace. The third form, however, may well become the most profound: natural language generation (NLG). Rather than simply transcribe or translate speech, NLG uses machine learning (ML) capabilities to generate content in ways that were entirely unimaginable a few short years ago. This is what gave rise to ChatGPT and the other GAI innovations that dominated 2023, but the current technology is just a taste of what’s coming.

 

State of Current Applications

These two changes alone could easily fill out our session at Enterprise Connect, but they only set the stage for how IT leaders need to be thinking about enterprise speech tech. The core use cases are now well-established and baked into all the major UCaaS platforms -- real-time translation and transcription, automated meeting summaries and various flavors of virtual assistants to help manage our schedules and workflows.

More recent use cases are based on GAI (with NLP being the underlying technology), which workers now use to automate short-form written communication like emails, along with generating longer-form content like blog posts or reports. Speech plays two roles in all of this, with the first being the mode used for humans to interact with machines to do their bidding. Without this, speech would remain entirely human-based, and AI would move on to other use cases that are less dependent on human input.

Secondly, with most workplace communication being speech-based, enterprises are rushing now to capture as much of this as possible in digital form to help feed their AI engines – LLMs to be more precise. The further along they get with this, the more human-like these GAI outputs will be, making these technologies indispensable for driving workplace productivity.

Despite the big leaps in AI, the current state of enterprise speech tech has changed little over the past few years. Workers are still learning to adopt the core applications: real-time translation and transcription, automated meeting summaries and various flavors of virtual assistants. For all the examples cited above, there will be continuous improvement rather than totally new applications, so the expectation for IT should be to support them to help make workers more comfortable with AI.

This will mean support for a broader range of languages and dialects. For GAI, this will mean more accurate understanding of language, context, intent, etc., which reflects the power of ML to keep improving as the data sets grow larger. These kinds of refinements are critical for establishing trust in AI and driving its adoption, as many challenges and blind spots remain that could undermine any degree of business value that speech tech can bring to the enterprise.

 

Implications for IT Leaders

There are bigger forces at play beyond these use cases for how AI is shaping speech tech in the enterprise, and we’ll address that bigger picture during our session.

More than anything else, IT needs to develop an AI strategy for the organization. The technologies are evolving too quickly, with too big a potential impact, to manage with a patchwork, reactive approach. Enterprise speech tech needs to be viewed in a more holistic context as one of several use cases where AI can provide new business value.

For example, UCaaS-based speech tech applications have undeniable value for workplace productivity, but there are also promising line-of-business use cases, such as HR for hiring efficiencies, Marketing for content generation, or Legal for compliance monitoring. On a more horizontal scale, knowledge management could well become the most valuable use case of speech tech, since this benefits the entire organization.

Moving from strategic implications to tactical considerations, IT must also account for the numerous challenges associated with AI when it comes to speech tech, not just for the workplace or contact center, but for the business overall. There are needs that IT has not had to consider before, but that are essential for deriving sustainable ROI. Prime examples would be transparency for speech-based inputs used to build a knowledge base, mitigating bias, protecting personal privacy, security safeguards against deepfakes, and managing against plagiarism, copyright infringements, misinformation, etc.

No technology is perfect or immune to risk, but perhaps none has shown as much potential as AI. Speech tech in particular is a key cog, as it straddles the divide between humans and machines, and there should be no doubt about their futures being further intertwined. The future is coming fast, and if you like this preview, I think you’ll love our session, and I hope to see you there. I’ll be joined by speakers from Cognigy and RingCentral, and you can check out the rest of the details here.

Jon Arnold will be in the session Enterprise Speech Technology Update on Thursday, March 28 at 9 am EST. See you there!

Enterprise Connect 2024 will be held at the Gaylord Palms in Orlando, FL, from March 25-28. Preview the conference schedule or register to attend. To keep on top of all Enterprise Connect developments, subscribe to the weekly newsletter.

This post is written on behalf of BCStrategies, an industry resource for enterprises, vendors, system integrators, and anyone interested in the growing business communications arena. A supplier of objective information on business communications, BCStrategies is supported by an alliance of leading communication industry advisors, analysts, and consultants who have worked in the various segments of the dynamic business communications market. 

About the Author

Jon Arnold

Jon Arnold is Principal of J Arnold & Associates, an independent analyst providing thought leadership and go-to-market counsel with a focus on the business-level impact of digital transformation in the workplace. Core areas of expertise include unified communications, cloud services, collaboration, Internet of Things, future of work, contact centers, customer experience, video, VoIP, and social media.

 

He has been consulting in many of these areas since 2001, and his independent practice was founded in 2005. JAA is based in Toronto, Ontario, and serves clients across North America as well as in Europe.

 

Jon’s thought leadership can be followed on the widely read JAA’s Analyst Blog, his monthly Communications and Collaboration Review, and ongoing commentary on Twitter and LinkedIn. His thought leadership is also regularly published across the communications industry, including here on No Jitter as well as on BCStrategies, Ziff Davis B2B/Toolbox.com, TechTarget and Internet Telephony Magazine.

 

In 2019, Jon was named a “Top 30 Contact Center Influencer,” and in 2018, Jon was included in listings of “Top 10 Telecoms Influencers” and “TOP VoIP Bloggers to Follow.” Previously, in both March 2017 and January 2016, Jon was cited among the Top Analysts Covering the Contact Center Industry. Also in 2017, Jon was cited as a Top 10 Telecom Expert and as one of Six Business Communications Thought Leaders to Follow. Before that, GetVoIP.com named Jon one of the Top 50 UC Experts to Follow in 2015, as well as a Top 100 Tech Podcaster in 2014. JAA’s blog was recognized as a Top Tech Blog in 2016 and 2015, and has had other similar accolades going back to 2008.

 

Additionally, Jon is a UC Expert with BCStrategies, a long-serving Council Member with the Gerson Lehrman Group, speaks regularly at industry events, and accepts public speaking invitations. He is frequently cited in both the trade press and mainstream business press, serves as an Advisor to emerging technology/telecom companies, and is a member of the U.S.-based SCTC.