
chAI - The Living Chatbot

Grab a cup of chAI!

The problem chAI - The Living Chatbot solves

chAI is an open-domain generative chatbot. It is built on a seq2seq (encoder-decoder) model using LSTM layers, which generates a response to every comment sent to it, one word at a time. The inspiration for this project was XiaoIce, a commercially successful chatbot operating in China with upwards of 600k users. In the spirit of Aatmanirbhar Bharat, we decided to create an Indian variant specialised for the Indian demographic.

There are several uses for such a product. In the Healthcare sector, it can help in the treatment of mental illnesses or act as an aid in the diagnosis and treatment of patients. Chatbots are also poised to take over much of the Customer Service sector. It can even be used purely for entertainment, as a way to pass the time. The model can be marketed either as a standalone product or as an API, where new chatbots use our code for general conversation while implementing their own specialisation on top. Advertising on such a bot would also be far more effective than on most media, which could help fund the product.

We know that chatbots themselves aren't a revolutionary concept; in fact, several generative open-domain bots have already been released by companies like Microsoft, Pandorabots, etc. However, most of these failed due to the lack of a proper content filter (which is hard to implement on an open-domain bot). Our idea is unique in its implementation: it uses a dual-layered Retrieval + Generative architecture to identify intents and handle them accordingly.

The way it works is that when a comment comes in, a neural network classifies it into an intent (such as Happy, Sad, Political, Joke, etc.). Once we have the intent, we check whether the statement is on a sensitive topic; if it is, the bot is set to not learn from that particular conversation. If not, the comment is passed to the actual chatbot, which generates a reply through a complex neural network.
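
To make that gating step concrete, here is a minimal sketch of the learn/don't-learn check, assuming a toy keyword classifier in place of the real intent network; the intent labels, keywords, and function names are purely illustrative.

    SENSITIVE_INTENTS = {"Political", "Identity"}

    def classify_intent(comment: str) -> str:
        """Placeholder for the intent-classification neural network."""
        lowered = comment.lower()
        if "election" in lowered or "government" in lowered:
            return "Political"
        if "joke" in lowered:
            return "Joke"
        return "Happy"

    def should_learn(comment: str) -> bool:
        """Sensitive intents switch learning off for this exchange."""
        return classify_intent(comment) not in SENSITIVE_INTENTS

    print(should_learn("Tell me a joke"))         # True
    print(should_learn("Who won the election?"))  # False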

Challenges we ran into

We faced several challenges while developing chAI (especially given the short duration we had to build it) right from planning the system to actually implementing it.

One important problem we realised early on was the sheer lack of time. Unlike most deep learning problems, we actually had more than enough data to train our model; it was the time constraint that was the issue this time. A generative chatbot is a massive neural network, and it wouldn't be strange for it to take several days to train. So, to speed this up, we decided to filter the parent-reply pairs by score, keeping only the highest-scoring reply for any given parent comment.
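
As a rough illustration of that filtering step, the sketch below keeps only the highest-scoring reply for each parent comment; the field names ('parent_id', 'reply', 'score') are assumptions about how the training pairs might be stored, not the project's actual schema.

    def best_replies(pairs):
        """Keep only the highest-scoring reply for each parent comment.

        `pairs` is an iterable of dicts with 'parent_id', 'reply' and
        'score' keys; these field names are assumed for illustration.
        """
        best = {}
        for pair in pairs:
            current = best.get(pair["parent_id"])
            if current is None or pair["score"] > current["score"]:
                best[pair["parent_id"]] = pair
        return list(best.values())

    pairs = [
        {"parent_id": "a1", "reply": "Nice!", "score": 4},
        {"parent_id": "a1", "reply": "Great point, because...", "score": 57},
        {"parent_id": "b2", "reply": "lol", "score": 2},
    ]
    print(best_replies(pairs))  # keeps only the score-57 reply for parent a1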

Another problem was iterative learning. We'd configured the model not to learn from sensitive topics, but then how should it determine what to learn from? A simple solution would be to learn from every single conversational exchange with a user, but that way it would never improve and would just get stuck repeating a few phrases. So we decided to go for a two-pronged approach: users can 'like' a reply, which is subsequently used in retraining the model, and users are allowed to 'teach' the bot themselves with a preset phrase like 'say after me'.
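
The sketch below illustrates those two feedback channels under some assumptions: a 'like' queues the exchange for later retraining, and the preset trigger phrase stores a user-taught line. The buffer names and function signatures are illustrative, not the real implementation.

    TEACH_PREFIX = "say after me"

    retrain_buffer = []   # 'liked' (user message, bot reply) pairs for retraining
    taught_phrases = []   # phrases users explicitly taught the bot

    def on_like(user_message: str, bot_reply: str) -> None:
        """Queue a liked exchange so it can be used the next time we retrain."""
        retrain_buffer.append((user_message, bot_reply))

    def maybe_teach(user_message: str) -> bool:
        """Store the rest of the message if it starts with the teaching phrase."""
        if user_message.lower().startswith(TEACH_PREFIX):
            taught_phrases.append(user_message[len(TEACH_PREFIX):].strip())
            return True
        return False

    maybe_teach("Say after me grab a cup of chAI")
    on_like("How are you?", "Doing great, how about you?")
    print(taught_phrases, retrain_buffer)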

Then we had some issues configuring our integration of the two models, so we set it up with a basic decision tree. If the confidence of the Retrieval model is above 80%, it replies to the user; if the confidence is below that, the exchange is passed to the core Generative model. Further, after 2-3 exchanges with the Retrieval model, control is always eventually passed to the Generative bot. Finally, as an unconditional override, we set the Retrieval model to reply any time a sensitive topic (e.g. Politics, Identity issues) is detected, to prevent the bot from being corrupted by user pranks.
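
A minimal sketch of that decision tree is shown below, with simple stand-in objects in place of the real models; the 80% threshold and hand-over rule follow the description above, but the class and method names are hypothetical.

    CONFIDENCE_THRESHOLD = 0.80   # retrieval answers only above this confidence
    MAX_RETRIEVAL_TURNS = 3       # hand over to the generative bot after this
    SENSITIVE_TOPICS = {"Political", "Identity"}

    class StubRetrieval:
        """Stand-in for the retrieval model: returns (intent, confidence)."""
        def classify(self, comment):
            return ("Greeting", 0.9) if "hello" in comment.lower() else ("Other", 0.3)
        def reply(self, comment):
            return "Hello! Grab a cup of chAI."

    class StubGenerative:
        """Stand-in for the seq2seq generative model."""
        def reply(self, comment):
            return "That's interesting, tell me more."

    def route(comment, turn_count, retrieval, generative):
        intent, confidence = retrieval.classify(comment)
        if intent in SENSITIVE_TOPICS:          # unconditional override
            return retrieval.reply(comment)
        if turn_count >= MAX_RETRIEVAL_TURNS:   # always hand over eventually
            return generative.reply(comment)
        if confidence >= CONFIDENCE_THRESHOLD:  # confident retrieval match
            return retrieval.reply(comment)
        return generative.reply(comment)

    print(route("hello there", 0, StubRetrieval(), StubGenerative()))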

Discussion