Chatbot Engine behind Alibaba’s AliMe Customer Service Bot

This article is part of the Academic Alibaba series and is taken from the paper entitled “AliMe: A Sequence-to-Sequence and Rerank-based Chatbot Engine”, accepted by ACL 2017. The full paper can be read here.


Chatbots are certainly having their moment. Amazon’s Alexa and Apple’s Siri lead a pack of thousands of chatbot applications being operated and developed by companies of all sizes. Unlike older, more rudimentary versions that forced users into simple, rigidly structured language, today’s chatbots let users interact with natural text and speech (in some cases, even images). They may not have reached the heights of Samantha in the film Her, but they are making major progress toward that level of AI-driven interaction.

Introducing AliMe, Alibaba’s Chatbot

Most open-domain chatbots use either information retrieval (IR) or generation models, both of which come with drawbacks. IR models retrieve answers from question/answer (QA) knowledge bases, while generation models produce answers with pre-trained sequence-to-sequence (seq2seq) models. IR models struggle with long-tail queries that have no close match in the QA knowledge base, and generation models don’t always return comprehensible or consistent answers.
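
To make the contrast concrete, here is a toy sketch of the retrieval side: an IR model looks up the stored question most similar to the query and returns its paired answer, which is why queries with no close match in the knowledge base tend to get poor answers. The similarity function and knowledge base below are illustrative stand-ins, not the paper’s actual engine.

```python
from typing import List, Tuple

def jaccard_similarity(a: str, b: str) -> float:
    """Word-overlap similarity between two questions (a toy stand-in for a real IR scorer)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def ir_answer(query: str, qa_knowledge_base: List[Tuple[str, str]]) -> str:
    """Return the answer whose stored question best matches the query.
    For long-tail queries with little overlap, the best match is still returned,
    even though it may be irrelevant -- the weakness noted above."""
    _, best_answer = max(qa_knowledge_base, key=lambda qa: jaccard_similarity(query, qa[0]))
    return best_answer

# Hypothetical knowledge base for illustration only.
kb = [
    ("how do I track my order", "You can track it under My Orders."),
    ("how do I get a refund", "Refunds are issued within 7 days of approval."),
]
print(ir_answer("where can I track my package", kb))
```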

More about AliMe: https://medium.com/mlreview/behind-the-chat-how-e-commerce-bot-alime-works-1b352391172a

How AliMe Differs and Holds Up

When faced with a query, the chatbot first uses an IR model to retrieve a set of QA pairs as candidate answers, and then re-ranks these candidates using an attentive seq2seq model. If the top candidate’s score exceeds a set threshold, it is returned as the answer. Otherwise, the answer is produced by a generation-based model.
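
The flow described above can be sketched roughly as follows, assuming three pluggable components that stand in for the paper’s IR engine, attentive seq2seq reranker, and generation model. The function names and threshold value are illustrative, not the production implementation.

```python
from typing import Callable, List, Tuple

def answer(
    query: str,
    retrieve: Callable[[str], List[Tuple[str, str]]],   # IR engine: query -> candidate (question, answer) pairs
    rerank_score: Callable[[str, str], float],          # seq2seq scorer: (query, candidate answer) -> score
    generate: Callable[[str], str],                     # seq2seq generator: query -> free-form answer
    threshold: float = 0.5,                             # confidence threshold (value here is illustrative)
) -> str:
    candidates = retrieve(query)
    if candidates:
        # Re-rank the retrieved answers with the attentive seq2seq model.
        scored = sorted(
            ((rerank_score(query, ans), ans) for _, ans in candidates),
            key=lambda sa: sa[0],
            reverse=True,
        )
        top_score, top_answer = scored[0]
        if top_score >= threshold:
            return top_answer            # confident retrieval-based answer
    # Fall back to the generation-based model when retrieval is inconclusive.
    return generate(query)
```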

To handle questions and answers of varying lengths, the Alibaba tech team adopted the bucket mechanism provided by TensorFlow. To speed up training, they applied softmax over a sampled subset of the vocabulary rather than the entire vocabulary, a strategy inspired by importance sampling. In the decoding phase, they used beam search to keep the top-k (k = 10) output sequences at each decoding step rather than greedy search, making the generated answers more reasonable and consistent.
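
As a rough illustration of the decoding step, here is a minimal beam-search sketch that keeps the k best partial sequences at each step. The step_log_probs callable stands in for one decoder step of the seq2seq model; the details are an assumption for illustration, not the paper’s code.

```python
import heapq
from typing import Callable, Dict, List, Tuple

def beam_search(
    step_log_probs: Callable[[List[str]], Dict[str, float]],  # prefix -> log prob of each next token
    beam_width: int = 10,     # top-k sequences kept at each step (k = 10 in the paper)
    max_len: int = 20,
    eos: str = "</s>",
) -> List[str]:
    beams: List[Tuple[float, List[str]]] = [(0.0, [])]        # (cumulative log prob, tokens)
    for _ in range(max_len):
        expanded: List[Tuple[float, List[str]]] = []
        for score, tokens in beams:
            if tokens and tokens[-1] == eos:
                expanded.append((score, tokens))              # finished hypothesis is kept as-is
                continue
            for token, logp in step_log_probs(tokens).items():
                expanded.append((score + logp, tokens + [token]))
        # Keep only the k most probable partial sequences.
        beams = heapq.nlargest(beam_width, expanded, key=lambda b: b[0])
        if all(t and t[-1] == eos for _, t in beams):
            break
    return max(beams, key=lambda b: b[0])[1]
```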

Figure: The AliMe hybrid approach

To evaluate the approach, the team tested AliMe against an existing chatbot engine on a set of relevant test questions. The AliMe chatbot performed better on 37.64% of the questions and worse on 18.84%.

What Comes Next

Currently, Alibaba uses a straightforward strategy for incorporating context. When the IR model retrieves fewer than three answer candidates for a question, the current question is concatenated with the previous one and the combined query is sent to the IR engine again. The tech team also tried context-aware techniques such as context-sensitive and neural conversation models, but these tended to scale poorly in Alibaba’s scenarios. The team is continuing to explore scalable context-aware methods, as well as personalization, giving AliMe emotions and a personality to make it more engaging and relatable.
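
The candidate-count fallback described above could look roughly like the following sketch. The ir_engine callable and helper names are assumptions for illustration, not Alibaba’s production API.

```python
from typing import Callable, List, Optional, Tuple

def retrieve_with_context(
    question: str,
    previous_question: Optional[str],
    ir_engine: Callable[[str], List[Tuple[str, str]]],   # query -> candidate (question, answer) pairs
    min_candidates: int = 3,
) -> List[Tuple[str, str]]:
    """Retry retrieval with the previous question prepended when too few candidates come back."""
    candidates = ir_engine(question)
    if len(candidates) < min_candidates and previous_question:
        combined = previous_question + " " + question     # concatenate previous and current questions
        candidates = ir_engine(combined)
    return candidates
```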


The full paper can be read here.

Alibaba Tech

First-hand and in-depth information about Alibaba's tech innovation in Artificial Intelligence, Big Data & Computer Engineering.
