Towards Multi-Platform Mutation Testing of Task-based Chatbots

Diego Clerissi, Elena Masserini, Daniela Micucci, Leonardo Mariani

Published: 2025/9/1

Abstract

Chatbots, also known as conversational agents, have become ubiquitous, offering services across a multitude of domains. Unlike general-purpose chatbots, task-based chatbots are software designed to prioritize the completion of tasks in the domain they handle (e.g., flight booking). Given the growing popularity of chatbots, testing techniques that can generate full conversations as test cases have emerged. Still, thoroughly testing all the possible conversational scenarios implemented by a task-based chatbot is challenging, so incorrect behaviors may remain unnoticed. To address this challenge, we proposed MUTABOT, a mutation testing approach that injects faults into conversations, producing faulty chatbots that emulate defects affecting their conversational aspects. In this paper, we present our extension of MUTABOT to multiple platforms (Dialogflow and Rasa), and report experiments showing how mutation testing can reveal weaknesses in test suites generated by Botium, a state-of-the-art test generator.
