Sponsored By

Let Your Bots Do the Talking

AudioCodes’ Voice.AI Gateway lets enterprises voice-enable bots and call them from any telephone, UC system, or WebRTC endpoint.

Andrew Prokop

August 21, 2019


Words mean more than what is set down on paper. It takes the human voice to infuse them with deeper meaning.

-- Maya Angelou

 

It should come as no surprise that we’ve entered the age of bots. These intelligent, virtual assistants help us order food, request rides, fall asleep, and stay on top of the latest news stories. I’ve created prototype bots for healthcare, transportation, and parks and recreation. With recent advancements in speech recognition, natural language processing, machine learning, and speech synthesis, it’s becoming nearly impossible to know whether you’re conversing with a person or a bot.

 

Reasons for the proliferation of bots are numerous, but high on the list is the desire for businesses to lower their costs by moving predictable, repetitive tasks away from live agents to lower-cost, always-on machines. While the average call center worker will quickly tire of answering the same questions over and over again, a bot’s cheerful demeanor never wilts or fades. Adding to that is the bot’s willingness to work nights, weekends, and holidays. A bot never calls in sick or comes to work with a bad attitude.

 

Regardless of how sophisticated bots get, there’ll always be a need for a living, breathing person to tackle complex problems. It’s easy to recite a hospital’s pharmacy hours, but it’s much harder to triage a potentially serious health problem.

 

Despite speculation about the death of voice calls, 39% of the 5,000 consumers Microsoft surveyed globally for its 2018 State of Global Customer Service report rank the telephone as their number one communications channel for customer service. And yet, the majority of bots implemented today are strictly text-based. This presents a disconnect. Either customers are left to use a communication method (text) that’s less comfortable for them than a phone call, or contact center agents get tied up answering calls about queries that are best suited for bots.

 

The AudioCodes Voice.AI Gateway

AudioCodes recognized this conundrum and concluded that there’s no good reason why old-fashioned telephone calls can’t reap the many benefits that bots provide while enabling an agent to step in when a bot’s capability has been exceeded.

 

With AudioCodes’ Voice.AI Gateway, an enterprise can apply the same technologies that it has deployed for text bots (SMS, Webchat, Facebook Messenger, etc.) on incoming and outgoing telephone calls. This means that a bot developed using Google, Amazon, or Microsoft tools can be voice-enabled and called from any telephone, unified communications system, or WebRTC endpoint in the world.

 

The architecture of a Voice.AI Gateway is fairly straightforward. It sits between the voice, bot, and cognitive services worlds, as shown below. It uses standard SIP to communicate with carriers, contact centers, and enterprise UC systems, and Web services to connect bots to Web platforms. As is the nature of any gateway, it forms the bridge between two disparate technologies that are otherwise incompatible.

 

Administrators can select among different text-to-speech, speech-to-text, and bot frameworks as they see fit for their use case. For instance, one text-to-speech platform may be better than another for particular languages or dialects. The Voice.AI Gateway allows for best-of-breed mixing and matching.
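To make the mix-and-match idea concrete, here is a minimal sketch of per-language service selection. It is not the Voice.AI Gateway's actual configuration format, which this article doesn't describe. The `VoiceBotProfile` class, the `select_profile` function, and the provider names (`stt_a`, `tts_b`, `bot_x`) are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VoiceBotProfile:
    """One mix-and-match choice of services (all names are hypothetical)."""
    speech_to_text: str
    bot: str
    text_to_speech: str


# Illustrative profiles keyed by language tag. The provider names are
# placeholders, not real product identifiers.
PROFILES: Dict[str, VoiceBotProfile] = {
    "en-US": VoiceBotProfile("stt_a", "bot_x", "tts_a"),
    # Assumption for the example: a different synthesis engine handles
    # this dialect better, so only the text-to-speech choice changes.
    "fr-CA": VoiceBotProfile("stt_a", "bot_x", "tts_b"),
}


def select_profile(language: str, default: str = "en-US") -> VoiceBotProfile:
    """Return the profile for a language, falling back to the default."""
    return PROFILES.get(language, PROFILES[default])
```

Calling `select_profile("fr-CA")` swaps in a different text-to-speech engine while the bot and speech recognition choices stay the same, which is the point of a best-of-breed approach.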

 

[Figure: Voice.AI Gateway architecture]

 

When I first looked at the Voice.AI platform, it reminded me of a session border controller (SBC). AudioCodes told me that’s because its SBC technology is at the core of the gateway. That shouldn’t come as a surprise since an SBC’s job is to connect IP calls with IP services. Having SIP calls “talk” to bot services in the cloud — Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP) — is simply the next logical step in unifying communications.

 

The benefits of this unified approach are many:

  • Bring the most intuitive form of human communications (voice) into a bot service

  • Maintain existing bot user flows and scripts

  • Easily migrate bots onto voice engagement channels using a single solution

  • Avoid complex integrations with voice networks by utilizing the voice communications capabilities embedded in the Voice.AI Gateway

  • Connect to any third-party bot, speech-to-text, or text-to-speech service (Azure, AWS, GCP, etc.)

  • Select best-of-breed bot frameworks and cognitive voice services

  • Use advanced call management (disconnect, transfer to agent, call recording, etc.)

A step-by-step voice-enabled bot flow looks like this:

  1. The customer poses a question.

  2. The Voice.AI Gateway streams the customer’s voice to a speech recognition service.

  3. The speech recognition service returns a text transcription of the customer’s question.

  4. The gateway sends the text to the bot platform (Google, Amazon, Microsoft, etc.).

  5. The bot replies with an answer.

  6. The gateway sends the textual reply to the text-to-speech engine for conversion to speech.

  7. The engine returns the speech to the gateway.

  8. The customer hears the bot’s answer.
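The eight steps above can be sketched as a single relay function. This is a simplified illustration, not AudioCodes' implementation. The real gateway streams audio over SIP and calls cloud services, whereas this sketch passes whole byte strings to fake, local stand-ins. The names `handle_turn`, `fake_speech_to_text`, `fake_bot`, and `fake_text_to_speech` are hypothetical, and the pharmacy hours are made up.

```python
from typing import Callable, List, Optional


def handle_turn(
    caller_audio: bytes,
    speech_to_text: Callable[[bytes], str],
    bot: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
    trace: Optional[List[str]] = None,
) -> bytes:
    """Relay one question from the caller to the bot and back."""
    log = trace if trace is not None else []
    log.append("caller audio received")        # step 1
    question = speech_to_text(caller_audio)    # steps 2 and 3
    log.append(f"transcribed: {question}")
    answer = bot(question)                     # steps 4 and 5
    log.append(f"bot answered: {answer}")
    reply_audio = text_to_speech(answer)       # steps 6 and 7
    log.append("reply audio ready")
    return reply_audio                         # step 8


# Fake services so the sketch runs without any cloud account.
def fake_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_bot(question: str) -> str:
    if "hours" in question.lower():
        return "The pharmacy is open from 9 to 5."
    return "Let me find someone who can help."


def fake_text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")


steps: List[str] = []
reply = handle_turn(
    b"What are the pharmacy hours?",
    fake_speech_to_text,
    fake_bot,
    fake_text_to_speech,
    steps,
)
```

Because the three services are passed in as plain callables, any of them can be swapped without touching the relay logic.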

[Figure: step-by-step voice-enabled bot flow]

From the bot’s standpoint, it has no idea if it’s communicating with a landline, a cell phone, a WebRTC-enabled browser, or Facebook Messenger. In AI terminology, the bot’s entities, dialogs, and intents are the same. This allows bot developers to “write once and deploy many.”
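A rough sketch of that idea, under the assumption that the bot exposes a simple text-in, text-out interface: the same `bot_reply` function serves a text channel and a voice channel, and only the adapters differ. All names here are hypothetical, and the keyword match is a toy stand-in for real intent recognition.

```python
from typing import Callable


def bot_reply(text: str) -> str:
    """Hypothetical bot logic, written once and unaware of the channel."""
    if "hours" in text.lower():
        return "We are open from 9 to 5."
    return "Sorry, I did not understand that."


def text_channel(message: str) -> str:
    """Text adapter (for example, Webchat): pass the message straight through."""
    return bot_reply(message)


def voice_channel(
    audio: bytes,
    speech_to_text: Callable[[bytes], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """Voice adapter: wrap the same bot with speech services on both sides."""
    return text_to_speech(bot_reply(speech_to_text(audio)))
```

Adding a new channel means adding a new adapter. The bot itself doesn't change.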

 

As previously stated, a bot has its sweet spot, and there are times when a call needs to be escalated to a live agent. Since at its core the Voice.AI Gateway is an SBC, transferring to an agent is as simple as redirecting the call to an enterprise’s contact center. This allows for a seamless transition from bot-assisted to human-assisted customer support. While not available in the first release of the Voice.AI Gateway, attaching a transcript of the customer-to-bot conversation to the call is on the roadmap.
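One way to picture the escalation decision is a bot response that optionally carries a transfer target. This is an assumption for illustration only; the article doesn't specify how a bot signals a transfer to the Voice.AI Gateway. `BotResponse`, `next_call_action`, and the `sip:agents@example.com` address are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder contact center address, not a real destination.
CONTACT_CENTER = "sip:agents@example.com"


@dataclass
class BotResponse:
    text: str
    transfer_to: Optional[str] = None  # set when the bot is out of its depth


def bot(text: str) -> BotResponse:
    """Toy bot: answers one easy question and escalates everything else."""
    if "hours" in text.lower():
        return BotResponse("The pharmacy is open from 9 to 5.")
    return BotResponse("Let me connect you with an agent.", transfer_to=CONTACT_CENTER)


def next_call_action(response: BotResponse) -> str:
    """Decide whether the call stays with the bot or is redirected."""
    if response.transfer_to:
        return f"TRANSFER {response.transfer_to}"
    return "CONTINUE"
```

In this sketch, a question about pharmacy hours keeps the call with the bot, while anything the bot can't handle produces a transfer instruction aimed at the contact center.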

 

Mischief Managed

As a proponent and evangelist of digital transformation, I’m all in on artificial intelligence and bots. As a pragmatic geek, I understand that telephones and the voice network aren’t going away any time soon. While there are times when my desired mode of communication is a keyboard (physical or virtual), there are just as many times when I need to speak my piece out loud. The AudioCodes Voice.AI Gateway allows both technologies to coexist in perfect harmony.

About the Author

Andrew Prokop

Andrew Prokop has been involved in the world of communications since the early 1980s. He holds six United States patents in SIP technologies and was on the teams that developed Nortel's carrier-grade SIP soft switch and SIP-based contact center.

 

Through customer engagements, user groups, podcasts, proof-of-concept software development, trade shows, and webinars, Andrew has been an evangelist for digital transformation technologies for enterprises and their customers. Andrew understands the needs of the enterprise and has the background and skills necessary to assist companies as they drive towards a world of dynamic and immersive communications.

 

Andrew is an active blogger, and his widely read blog, Tao, Zen, and Tomorrow (formerly SIP Adventures), discusses every imaginable topic in the world of unified communications. He is just as comfortable writing at the 50,000-foot level as he is discussing natural language processing or the subtle nuances of a particular SIP header.