Testing intelligent virtual assistants (IVAs) so that you can iterate on and improve the NLU model is a critical step in the virtual assistant life cycle. Before you launch your virtual assistant to production, and before people are actually using it, you want to make sure that it works. It sounds simple, but testing is important.
There are a variety of ways to go about your testing. Ultimately, when you do get to production, you want to deliver an experience that people love and really enjoy using.
Step One – Utterance Testing
The first step is utterance testing. Utterance testing involves finding out if the virtual assistant understands and identifies the core intent of an utterance. An utterance is what a customer says to the AI bot through writing or speaking.
Is the bot capturing the correct entities? Does it do what you want for that single utterance? That is the bare minimum.
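As a rough illustration, a single-utterance check could look like the sketch below. The `toy_nlu` function is a hypothetical keyword-based stand-in for a real NLU call, and the intent and entity names (`transfer_money`, `check_balance`, `amount`) are assumptions made up for the example. In practice you would swap in whatever your platform exposes.

```python
import re
from dataclasses import dataclass, field


@dataclass
class NluResult:
    intent: str
    entities: dict = field(default_factory=dict)


def toy_nlu(utterance: str) -> NluResult:
    """Hypothetical stand-in for a real NLU call (keyword matching only)."""
    text = utterance.lower()
    entities = {}
    # Capture the first number in the utterance as an "amount" entity.
    amount = re.search(r"\$?(\d+(?:\.\d+)?)", text)
    if amount:
        entities["amount"] = float(amount.group(1))
    if "transfer" in text or "send" in text:
        return NluResult("transfer_money", entities)
    if "balance" in text:
        return NluResult("check_balance", entities)
    return NluResult("fallback", entities)


def check_utterance(utterance, expected_intent, expected_entities=None):
    """Return True when the intent and every expected entity match."""
    result = toy_nlu(utterance)
    if result.intent != expected_intent:
        return False
    for name, value in (expected_entities or {}).items():
        if result.entities.get(name) != value:
            return False
    return True


if __name__ == "__main__":
    print(check_utterance("Please transfer $50 to savings",
                          "transfer_money", {"amount": 50.0}))
    print(check_utterance("What's my balance?", "check_balance"))
```

The check answers both questions for one utterance: did the bot pick the right intent, and did it capture the entities you expected?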
Step Two – Batch Testing
The second step in testing your virtual assistant is batch testing, which means collecting a significant number of test cases and running them together. Machine learning (ML) models are typically trained on hundreds to thousands of different utterances. Over time, you want to build out your testing suite so that you can perform batch tests covering tens, hundreds, or thousands of different ways that somebody could ask the virtual assistant a question. Batch testing checks whether there are any gaps in the virtual assistant’s understanding.
Once you’ve identified the gaps, you can add training data and retrain your virtual assistant so that it gets smarter over time. This will be explored in another blog post.
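To make the idea of finding gaps concrete, here is a minimal sketch of a batch test. It assumes a hypothetical `toy_classify` function in place of your real model and a hand-written list of labeled utterances. The names and test cases are illustrative only.

```python
from collections import defaultdict


def toy_classify(utterance: str) -> str:
    """Hypothetical stand-in for the real intent classifier."""
    text = utterance.lower()
    if "balance" in text:
        return "check_balance"
    if "transfer" in text or "send" in text:
        return "transfer_money"
    return "fallback"


def run_batch(test_cases, classify=toy_classify):
    """Run every labeled utterance; return per-intent accuracy and the failures."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    failures = []
    for utterance, expected in test_cases:
        predicted = classify(utterance)
        totals[expected] += 1
        if predicted == expected:
            passed[expected] += 1
        else:
            failures.append((utterance, expected, predicted))
    accuracy = {intent: passed[intent] / totals[intent] for intent in totals}
    return accuracy, failures


# Illustrative test suite: (utterance, expected intent).
TEST_CASES = [
    ("What's my balance?", "check_balance"),
    ("How much money do I have?", "check_balance"),
    ("Send 20 dollars to Sam", "transfer_money"),
    ("Move 20 dollars to savings", "transfer_money"),
]

if __name__ == "__main__":
    accuracy, failures = run_batch(TEST_CASES)
    print(accuracy)
    for utterance, expected, predicted in failures:
        print(f"GAP: {utterance!r} expected {expected}, got {predicted}")
```

The failures list is the useful part: each entry is a phrasing the assistant did not understand, which makes it a candidate for new training data.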
The next step is conversation testing.
Step Three – Conversation Testing
Conversation testing involves picking a channel on which you’re going to interact with the virtual assistant and testing the entire conversational experience. It could be chat, a voice channel, a mobile app, or a home speaker experience. Whichever channel it is, you walk through an entire conversational flow with that virtual assistant.
Conversation testing is focused on the experience as a whole. How does it look and feel as you interact with the virtual assistant? Is it a good experience, or does it need improvement?
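Part of a conversation test can be scripted. The sketch below plays a multi-turn dialogue against a hypothetical `ToyBot` and reports any turn whose reply does not contain the expected text. The bot, its replies, and the script are all assumptions for illustration. A scripted check like this covers the flow, not how the experience feels, so you would still want to try the real channel by hand.

```python
class ToyBot:
    """Hypothetical multi-turn bot that keeps state between turns."""

    def __init__(self):
        self.pending_transfer = False

    def reply(self, message: str) -> str:
        text = message.lower()
        if self.pending_transfer:
            # The bot is waiting for an amount; accept any digits in the message.
            digits = "".join(ch for ch in text if ch.isdigit())
            if digits:
                self.pending_transfer = False
                return f"Done. I transferred ${digits}."
            return "Sorry, how much would you like to transfer?"
        if "transfer" in text:
            self.pending_transfer = True
            return "How much would you like to transfer?"
        if "balance" in text:
            return "Your balance is $100."
        return "Sorry, I didn't understand that."


def run_conversation(bot, script):
    """Play a scripted dialogue; return turns whose reply lacked the expected text."""
    mismatches = []
    for turn, (user_message, expected_fragment) in enumerate(script, start=1):
        actual = bot.reply(user_message)
        if expected_fragment.lower() not in actual.lower():
            mismatches.append((turn, user_message, actual))
    return mismatches


# Illustrative script: (user message, text the reply should contain).
SCRIPT = [
    ("I want to transfer money", "how much"),
    ("umm", "how much"),
    ("25", "transferred $25"),
    ("What's my balance?", "balance is"),
]

if __name__ == "__main__":
    problems = run_conversation(ToyBot(), SCRIPT)
    print("Conversation passed" if not problems else problems)
```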
Step Four – Testing With Pilot Programs
Finally, one of the best ways to test your virtual assistant is through pilot programs. An example of a pilot program would be launching your virtual assistant by giving access to 50 or 100 people. Call it a focus group. Perhaps they are friends and family, or simply people you know. The testing cohort could also be drawn from the employees at your company. You select a small number of users, give them access, and then let them run with it.
Once people have been using the virtual assistant, you gather data so that you can understand where it’s working well and where it’s not, and what the overall experience looks like. You can then take this data and use it to improve your virtual assistant over time.
By using this combination of tests on your intelligent virtual assistant, you make sure that it delivers a phenomenal experience once you get to production.
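What you gather will depend on your platform, but as a sketch, suppose the pilot produces a log with one record per interaction. The record fields (`user`, `intent`, `resolved`) and the sample data below are invented for illustration and are not a real export format. A short summary like this can show where the assistant is falling back or leaving requests unresolved.

```python
from collections import Counter

# Hypothetical pilot log: one record per interaction.
PILOT_LOG = [
    {"user": "u1", "intent": "check_balance", "resolved": True},
    {"user": "u1", "intent": "fallback", "resolved": False},
    {"user": "u2", "intent": "transfer_money", "resolved": True},
    {"user": "u3", "intent": "transfer_money", "resolved": False},
    {"user": "u3", "intent": "fallback", "resolved": False},
]


def summarize_pilot(log):
    """Summarize pilot usage: distinct users, fallback rate, unresolved intents."""
    total = len(log)
    by_intent = Counter(entry["intent"] for entry in log)
    unresolved = Counter(
        entry["intent"] for entry in log if not entry["resolved"]
    )
    return {
        "users": len({entry["user"] for entry in log}),
        "fallback_rate": by_intent["fallback"] / total if total else 0.0,
        "unresolved_by_intent": dict(unresolved),
    }


if __name__ == "__main__":
    print(summarize_pilot(PILOT_LOG))
```

A high fallback rate points to utterances the model does not recognize, while unresolved interactions on a known intent suggest a problem further along in the conversation flow.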
Ready to test your own bots and want to reference a detailed step-by-step guide? Check out our bot documentation on testing and debugging for the full, in-depth testing process.
Want to Learn More?
We’re here to support your learning journey. Ready to take on bot building but not sure where to start? Learn conversational AI skills and get certified on the Kore.ai Experience Optimization (XO) Platform.
As a leader in conversational AI platforms and solutions, Kore.ai helps enterprises automate front and back-office business interactions to deliver extraordinary experiences for their customers, agents, and employees.
Why not try one of our pre-trained virtual assistants? Get started with self-service automation across voice and digital channels today for your customers and employees alike.