Two weeks ago I attended API Strat in Boston where I gave a talk on cognitive APIs and conversational interfaces and showed and explained an e-commerce chatbot that I built. My presentation is on slideshare. I have learned a lot about chatbots and now I feel an urge to write about it.

Intents

My bot is using the intent dialog from the Microsoft Bot Framework:

The intent dialog associates a user’s intent like Explore or Checkout with a specific dialog that knows how to respond.

It feels very much like routing in a web framework where given a specific URL pattern, the request will be routed to a controller that knows how to handle it.

Users don’t spell out their intents like that though. And so the first thing my bot needs to do is to learn to recognize them. The simplest way to trigger a dialog handler in response to a users utterance is by matching it with a regex. A more sophisticated logic requires an intent recognizer.

Intent Recognizers

An intent recognizer is basically a service that can understand users’ utterances. Given a text message it will return a list of intents that it inferred from it along with supporting entities. Here’s how it looks in LUIS (language understanding service from Microsoft):

The Explore intent was recognized along with two supporting entities that I trained it for. Here’s another way of looking at it:

And the response:

Microsoft Bot Framework comes with built-in support for LUIS in the form of LuisRecognizer

Custom Recognizers

Not every thing your users say has to be sent to a natural language service to extract the intent. Buttons and tappable images can post back bot-specific commands like /show:123456789, for example, that you can easily recognize with a regex. Also, if you want your bot to smile back at a smile sent to it, you don’t need to train a linguistic model either.

It turns out, building your own recognizer is not hard at all. I have built a few for my e-commerce bot and here’s how it works.

First, know that the Bot Framework supports sending a message through a number of recognizers at the same time. You can chain them or run them all in parallel:

The recognizer itself is a very simple interface with only one method - recognize. Here’s how you would detect a smile, for example:

And here’s another one that understands commands:

That’s it for now but there is more to come. Stay tuned!