Test and Evaluate Agent Performance

After configuring your AI agent, the final step is to verify how it behaves in real conversations and evaluate its performance.


Test in Playground

Playground is a testing environment where you can simulate conversations with your AI agent before deploying it to production channels. It allows you to safely verify agent behavior, responses, and data collection without affecting real customers or live sessions.

To access Playground, open AI Agents > Playground from the left sidebar menu.

Use Playground testing to validate:

  • How the agent understands customer requests
  • Whether Knowledge Base and FAQ responses are triggered
  • How Conversation Results are collected
  • Whether workflow transitions occur correctly

Try interacting with the agent using different types of requests to understand how it behaves in realistic situations:

  • Product inquiries — ask about features, pricing, or availability
  • Knowledge Base questions — check how the agent answers using documentation
  • FAQ validation — confirm that common questions return quick and accurate responses
  • Data collection — provide contact details and verify that information is captured correctly
  • Edge cases — send unclear, incomplete, or unusual requests

Testing different scenarios helps reveal configuration issues before the agent interacts with real customers.
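
One way to keep Playground testing repeatable is to write the scenarios down as a small test plan before you start. The sketch below is only an illustration: the PlaygroundCase class, the sample messages, and the expected checks are hypothetical placeholders and not part of the product. It prints a checklist that a tester can work through by hand in Playground.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaygroundCase:
    category: str  # scenario type from the list above
    message: str   # text the tester sends in Playground (placeholder)
    verify: str    # what the tester checks by hand in the reply


# Hypothetical sample messages and checks; replace them with your own.
TEST_PLAN = [
    PlaygroundCase("Product inquiry", "Is the premium plan available in my region?",
                   "Reply covers features, pricing, or availability"),
    PlaygroundCase("Knowledge Base question", "How do I reset my password?",
                   "Answer is drawn from documentation"),
    PlaygroundCase("FAQ validation", "What are your support hours?",
                   "Response is quick and accurate"),
    PlaygroundCase("Data collection", "My email is test@example.com",
                   "Contact details appear in Conversation Results"),
    PlaygroundCase("Edge case", "it doesn't work",
                   "Agent asks a clarifying question"),
]


def checklist(plan):
    """Render the plan as numbered lines for manual testing."""
    return [
        f"{i}. [{case.category}] Send: {case.message!r} | Verify: {case.verify}"
        for i, case in enumerate(plan, start=1)
    ]


if __name__ == "__main__":
    print("\n".join(checklist(TEST_PLAN)))
```

Keeping the plan in one place makes it easy to rerun the same messages after each configuration change and compare the results.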

Test in Realistic Scenarios

Evaluate how the agent behaves under different interaction conditions. The goal is to observe how consistently the agent applies defined rules and tone.

Recommended sequence:

  1. Happy path — Straightforward, ideal input
  2. Ambiguous input — Unclear or partial requests
  3. Hostile input — Aggressive or frustrated tone
  4. Failure handling — Unavailable data or external errors
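
To make "how consistently" measurable, you can record for each test conversation whether the agent followed its rules and kept its tone, then compute a consistency share per stage. The following is a minimal sketch under that assumption; the stage names and the (stage, rules_ok, tone_ok) record shape are hypothetical and are filled in by the tester, not read from the product.

```python
from collections import defaultdict

# Stage identifiers mirror the recommended sequence above (names are assumptions).
STAGES = ["happy_path", "ambiguous", "hostile", "failure"]


def consistency_by_stage(observations):
    """Return the share of conversations per stage where both rules and tone held.

    observations: iterable of (stage, rules_ok, tone_ok) tuples noted by hand.
    Stages with no observations are left out of the result.
    """
    totals = defaultdict(lambda: [0, 0])  # stage -> [consistent, total]
    for stage, rules_ok, tone_ok in observations:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        totals[stage][1] += 1
        if rules_ok and tone_ok:
            totals[stage][0] += 1
    return {
        stage: totals[stage][0] / totals[stage][1]
        for stage in STAGES
        if totals[stage][1] > 0
    }


if __name__ == "__main__":
    notes = [
        ("happy_path", True, True),
        ("hostile", True, False),
        ("hostile", True, True),
    ]
    print(consistency_by_stage(notes))
```

A stage with a noticeably lower share than the happy path points to the instructions that need tightening first.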

Continuous Improvement

AI agent configuration is an iterative process. After deployment, regularly review real conversations and refine agent settings based on observed behavior.

Improvement Cycle

Typical improvement cycle:

  1. Test agent behavior in Playground
  2. Review real conversations in Sessions, evaluating response consistency, handling time, escalation rate, and customer satisfaction
  3. Adjust Identity, Task, Knowledge, or Workflow settings
  4. Validate changes by retesting in Playground

Regular reviews help maintain stable and predictable agent performance over time.
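
The review metrics named in step 2 can be tracked with a short script if you keep your own notes on reviewed sessions. This is a sketch only: the record keys handling_seconds, escalated, and csat are hypothetical, and the data is assumed to be collected by you rather than pulled from any product API.

```python
from statistics import mean


def summarize(sessions):
    """Summarize reviewed sessions.

    sessions: list of dicts with hypothetical keys
      'handling_seconds' (number), 'escalated' (bool),
      'csat' (satisfaction score, or None if the customer gave no rating).
    """
    if not sessions:
        raise ValueError("no sessions to summarize")
    rated = [s["csat"] for s in sessions if s["csat"] is not None]
    return {
        "avg_handling_seconds": mean(s["handling_seconds"] for s in sessions),
        "escalation_rate": sum(1 for s in sessions if s["escalated"]) / len(sessions),
        "avg_csat": mean(rated) if rated else None,
    }


if __name__ == "__main__":
    reviewed = [
        {"handling_seconds": 120, "escalated": False, "csat": 5},
        {"handling_seconds": 240, "escalated": True, "csat": None},
        {"handling_seconds": 60, "escalated": False, "csat": 4},
    ]
    print(summarize(reviewed))
```

Comparing these numbers before and after a settings change in step 3 shows whether the adjustment helped.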

Review Regularly

  • Weekly — review conversations for tone and accuracy
  • Monthly — analyze performance trends
  • Quarterly — remove outdated knowledge and workflows

Monitor Agent Behavior

After deployment, monitor real conversations to evaluate how your agent performs in production. Use the monitoring tools to review conversations, verify collected results, and investigate unexpected behavior.

Learn more about monitoring in the Monitoring & Analytics section.

Import and Export AI Agents

Export a configured AI agent

Steps:

  1. On the left sidebar menu, go to AI Agents > Agents section
  2. Click the three-dot menu > Export Agent
  3. Configure export options:
     Option                         Description
     Remove sensitive data          Excludes API keys and passwords
     Force update Skill Groups      Overwrites existing groups during import
     Force update Knowledge Bases   Overwrites existing KBs during import
     Force update integrations      Overwrites integration settings during import
  4. Click Start Export
  5. The download starts automatically
  6. After the file is downloaded, click Copy to clipboard to save your secret key, then click Close

Export contents:

  • Agent configuration (Identity, Task, Workflow)
  • Data collection instructions and forms
  • Links to Knowledge Bases and Skill Groups
  • YAML/JSON restoration files
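
If you want to check what an export contains before importing it elsewhere, you can list the YAML and JSON files inside the archive. The sketch below assumes the export is a standard ZIP archive whose file listing can be read without the secret key; that is an assumption, not documented behavior, and the function name is hypothetical. It reads file names only and does not open or modify any file.

```python
import zipfile
from pathlib import PurePosixPath

RESTORATION_SUFFIXES = {".yaml", ".yml", ".json"}


def list_restoration_files(archive):
    """Return the sorted names of YAML/JSON entries in a ZIP archive.

    archive: a file path or a binary file-like object.
    Assumes a standard ZIP with a readable file listing (not confirmed for exports).
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name
            for name in zf.namelist()
            if PurePosixPath(name).suffix.lower() in RESTORATION_SUFFIXES
        )
```

For example, calling list_restoration_files with the path to the downloaded archive returns the names of the restoration files it holds, which you can compare against the export contents listed above.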

💡 Note: The secret key is required for importing the agent.

Import an exported AI agent

Steps:

  1. On the left sidebar menu, go to AI Agents > Agents
  2. Click Import Agent in the upper panel
  3. In the upload dialog:
    • Enter Secret Key
    • Upload the agent's ZIP archive
  4. Click Start Import

⚠️ Warning: If sensitive data was removed during export, manually re-enter API keys, passwords, and tokens after the import. Existing Skill Groups, Knowledge Bases, and integration settings are not overwritten unless the corresponding Force update option was enabled during export.