Watson, Come Here -- I Want to See You

A look at how IBM Watson works at the API level

Andrew Prokop

November 13, 2017

If you were to ask me what I want to do - I don't want to be a celebrity, I want to make a difference.
-- Lady Gaga

If you are like most people, the first time you heard about IBM Watson was on TV's Jeopardy. On February 14, 2011, Watson took on Jeopardy's two most successful contestants -- Ken Jennings and Brad Rutter -- to see who was smarter, man or machine. After two exciting matches, Watson bested the two and netted one million dollars (which IBM subsequently donated to charity).

While some might look upon this as merely entertainment, the fact that a computer could beat a human at this kind of game was revolutionary. Unlike computer vs. human chess tournaments, playing Jeopardy isn't simply a matter of investigating probable outcomes. Beyond the memorization of millions of facts, Jeopardy requires a thorough understanding of language, idioms, and intent. Watson not only proved that it could master the complexities of the English language, but it showed the world that it had a pretty quick trigger finger.

While Watson's days as a television celebrity might be over, its importance in advancing artificial intelligence and natural language processing has only grown since that auspicious debut. It has gone from an academic plaything to a powerful business tool. A quick Internet search showed me that Watson has been adopted by companies as diverse as Macy's, H&R Block, and Chevrolet. It is being used to diagnose cancer and to improve social media reach. It thinks while we listen and act.

Natural Language Processing
You may recall my two previous No Jitter articles on natural language processing (NLP): Enhancing the Customer Experience with Natural Language Processing, Parts One and Two. In those pieces, I introduced you to the technology and many of its terms, and I also created a video that showed how NLP can be deployed using Facebook's wit.ai service.

In terms of NLP, Watson provides everything that wit.ai does, but it extends the kinds of data it returns many times over. You get the same "intents" and "entities" described in my two articles, along with sentiment, emotion, relations, keywords, and categories. Additionally, Watson will even pass back Web links appropriate to the analyzed text.

For example, a Watson analysis of the sentence "I like The Rolling Stones" informs me of the following:

  1. The language is English.

  2. The Rolling Stones are categorized as "arts, entertainment, and concerts."

  3. This is a positive statement with a modest amount of joy.

  4. There is no discernible anger or disgust.

  5. The action is "like," the object is "The Rolling Stones," and the subject is "I."

By simply substituting "hate" for "like," I see the tone switch from positive to negative, the joy completely disappear, and the anger significantly rise.
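To make that kind of analysis concrete, here is a minimal Python sketch of how an application might request those features and pull the five observations out of the JSON that comes back. The feature names and response layout are assumptions modeled on the Natural Language Understanding analyze call, and the sample response is hand-written for illustration (its scores are invented, not real Watson output), so check the current API reference before relying on any of it.

```python
import json


def build_analyze_payload(text):
    """Build a request body asking for the features discussed above.

    Assumption: these feature names match the service's analyze request format.
    """
    return {
        "text": text,
        "features": {
            "categories": {},
            "sentiment": {},
            "emotion": {},
            "semantic_roles": {},
        },
    }


def summarize(analysis):
    """Reduce a parsed analysis response to the observations listed above."""
    emotions = analysis["emotion"]["document"]["emotion"]
    role = analysis["semantic_roles"][0]
    return {
        "language": analysis["language"],
        "category": analysis["categories"][0]["label"],
        "sentiment": analysis["sentiment"]["document"]["label"],
        "strongest_emotion": max(emotions, key=emotions.get),
        "subject": role["subject"]["text"],
        "action": role["action"]["text"],
        "object": role["object"]["text"],
    }


# Hand-written stand-in for a response; every score here is invented.
sample_response = {
    "language": "en",
    "categories": [{"label": "arts, entertainment, and concerts", "score": 0.9}],
    "sentiment": {"document": {"label": "positive", "score": 0.7}},
    "emotion": {"document": {"emotion": {"joy": 0.4, "anger": 0.0, "disgust": 0.0}}},
    "semantic_roles": [
        {
            "subject": {"text": "I"},
            "action": {"text": "like"},
            "object": {"text": "The Rolling Stones"},
        }
    ],
}

print(json.dumps(build_analyze_payload("I like The Rolling Stones"), indent=2))
print(json.dumps(summarize(sample_response), indent=2))
```

Keeping the summary step separate from the request step means the same function can be pointed at a saved response while experimenting, without calling the service each time.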

Talk to Me
In addition to NLP, Watson introduces the concept of a conversation. Not only does a conversation parse text as above, but it allows the application developer to create predefined responses for detected words and phrases, so that a conversation or interaction can be carried out.

For instance, I created a product order application with intents for arrival inquiries and delivery requests. When Watson encounters a phrase such as "When will my order arrive?" it not only tells me that the intent is #OrderInquiry, but also that a proper response would be "Your order was scheduled to ship on..." My application can query a database for the scheduled ship time to reply with "Your order was scheduled to ship on November 20."

While this example is quite simplistic, creating a pre-defined list of responses for any intent an application might encounter is a very powerful concept. It separates the process of retrieving data from the presentation of that data. If you are thinking text and chat bots, you are absolutely right.
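The separation of data retrieval from presentation can be sketched in a few lines of Python. Everything here is hypothetical: the template table stands in for responses that would normally be defined in Watson's conversation tooling, the `orders` dictionary stands in for a real database, and the function names are invented for illustration.

```python
# Hypothetical response templates keyed by detected intent. In practice these
# would come back from the conversation service, not live in application code.
RESPONSES = {
    "OrderInquiry": "Your order was scheduled to ship on {ship_date}.",
}

# Hypothetical fallback for intents the application has no template for.
FALLBACK = "Sorry, I did not understand that."


def lookup_ship_date(order_id, orders):
    """Data retrieval: a dictionary stands in for the real order database."""
    return orders[order_id]


def reply(intent, order_id, orders):
    """Presentation: fill the template for the intent with the retrieved data."""
    template = RESPONSES.get(intent)
    if template is None:
        return FALLBACK
    return template.format(ship_date=lookup_ship_date(order_id, orders))


orders = {"A100": "November 20"}
print(reply("OrderInquiry", "A100", orders))
```

Because the wording lives in the template and the date comes from the lookup, either side can change without touching the other.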

Web Services
Like any modern cloud service worth its salt, Watson exposes a plethora of RESTful Web services. For my investigation, I dug into the following services:

  • Conversation

  • Language Translator

  • Natural Language Understanding

  • Tone Analyzer

  • Text to Speech

  • Speech to Text

Each service contains a number of APIs for configuration, processing, and data retrieval. Many APIs come in both a GET and a POST form, and from what I saw, they all accept and return JSON.

Since I learn best by getting my hands dirty, I whipped up a Python application that allows me to explore each API in terms of its input, output, and actionable items. I will spare you the geeky analysis and simply say that I was able to get most of the functionality up and running in just a few minutes. However, for those programmer types out there, the following code is all that it takes to translate text into speech and save the output in a .wav file.

[Code listing: Python text-to-speech example; image not reproduced]
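As a rough illustration of what such a call can look like (this is not the author's listing), here is a minimal Python sketch that uses only the standard library. The endpoint URL, the voice name, and the use of HTTP basic authentication with service credentials are assumptions based on how the Text to Speech service was commonly documented at the time; verify all three against the current API reference.

```python
import base64
import json
import urllib.request

# Assumption: synthesize endpoint as commonly documented in 2017.
TTS_URL = "https://stream.watsonplatform.net/text-to-speech/api/v1/synthesize"


def build_tts_request(text, username, password, voice="en-US_MichaelVoice"):
    """Build (but do not send) a POST request asking for WAV audio.

    Assumptions: the voice name and basic-auth credentials are placeholders.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{TTS_URL}?voice={voice}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "audio/wav",
            "Authorization": f"Basic {token}",
        },
    )


def synthesize_to_wav(text, username, password, path="output.wav"):
    """Send the request and write the returned audio bytes to a .wav file."""
    request = build_tts_request(text, username, password)
    with urllib.request.urlopen(request) as response, open(path, "wb") as wav_file:
        wav_file.write(response.read())
    return path
```

Calling `synthesize_to_wav("Watson, come here", "my-username", "my-password")` with real service credentials would attempt the network call; `build_tts_request` on its own can be inspected offline.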

I did the same for language detection, translation, NLP, and the previously mentioned conversation. With the exception of speech to text (which I am still struggling to get to work), it was pretty much cookie-cutter programming.

Watson is a Breeze
Once I understood how Watson works at the API level, I wanted to turn my curiosity into something more solution-based. So, moving from Python to Java, I created a series of Engagement Designer dynamic tasks that can be loaded into any Avaya Breeze workflow, as shown below. This took my efforts to an entirely new level. Not only could I send information into Watson for processing, I could use the analysis in conjunction with Breeze's existing communication, database, notification, and decision logic tasks. Simply put, I was able to create that text bot and begin working with Watson as more than just an exercise in programming.

[Figure: Avaya Breeze workflow containing the Watson dynamic tasks; image not reproduced]

Once my software was inside Breeze, it quickly became apparent that the possible solutions built on Watson are nearly endless. Since a lot of my focus is in the contact center arena, I thought about how Watson could be used to detect the tone of a customer during an automated text chat and escalate it to a human being when Watson detects a concerning level of anger or frustration. Or perhaps I could use the NLP features of Watson to provide contact center agents with product or customer information in real time. Imagine Watson combined with my Breeze integration, connected to IoT devices and ServiceNow. The point is that these building blocks could be mixed and matched with out-of-the-box Breeze functionality in an infinite number of ways.
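The escalation idea reduces to a small decision rule. The Python sketch below assumes the tone analysis has already been flattened into a dictionary of tone names and scores between 0 and 1; that shape, the tone names, and the 0.6 threshold are all hypothetical choices for illustration, not values taken from Watson or Breeze.

```python
# Hypothetical cutoff; a real deployment would tune this against chat history.
ESCALATION_THRESHOLD = 0.6

# Hypothetical tone names that should trigger a handoff to a human agent.
CONCERNING_TONES = ("anger", "frustration")


def should_escalate(tone_scores, threshold=ESCALATION_THRESHOLD):
    """Return True when any concerning tone meets or exceeds the threshold."""
    return any(tone_scores.get(tone, 0.0) >= threshold for tone in CONCERNING_TONES)


def route_chat(tone_scores):
    """Choose who handles the next message in the chat."""
    return "human_agent" if should_escalate(tone_scores) else "text_bot"


print(route_chat({"anger": 0.75, "joy": 0.05}))
print(route_chat({"anger": 0.10, "joy": 0.60}))
```

In a workflow tool, the same rule would sit in a decision task that branches on the tone analysis.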

Mischief Managed
Artificial intelligence is going to alter the way we live, shop, communicate, solve problems, gain access to information, etc., in ways that are impossible to even fathom today. Watson isn't the only game in town, of course, but it is clearly a heavyweight in this space. It combines cutting-edge technology with APIs and application development tools so easy that even I can use them to create innovative solutions.

In my next installment, I will put this technology into action with a video that walks you through a Watson solution from idea to workflow. Please stay tuned for more fun and games.

Follow Andrew Prokop on Twitter (@ajprokop) and LinkedIn.

About the Author

Andrew Prokop

Andrew Prokop has been involved in the world of communications since the early 1980s. He holds six United States patents in SIP technologies and was on the teams that developed Nortel's carrier-grade SIP soft switch and SIP-based contact center.

 

Through customer engagements, user groups, podcasts, proof-of-concept software development, trade shows, and webinars, Andrew has been an evangelist for digital transformation technologies for enterprises and their customers. Andrew understands the needs of the enterprise and has the background and skills necessary to assist companies as they drive towards a world of dynamic and immersive communications.

 

Andrew is an active blogger, and his widely read blog, Tao, Zen, and Tomorrow (formerly SIP Adventures), discusses every imaginable topic in the world of unified communications. He is just as comfortable writing at the 50,000-foot level as he is discussing natural language processing or the subtle nuances of a particular SIP header.