Test and validate
Test your agent's execution and access controls, run automated evaluations, and review Guardian logs before approving the agent for production deployment.
Testing validates both that your agent performs its intended task correctly and that your security configuration works as designed. Both dimensions must pass before you deploy to production.
Test agent execution
Use the testing playground in AI Agent Studio to run manual test executions against your agent using sample utterances. Verify that the agent completes its intended task, uses the correct tools, and handles edge cases and failure scenarios appropriately.
- To test an AI agent execution, see Manually test the execution of an AI agent.
- To test an agentic workflow execution, see Manually test the execution of an agentic workflow.
Test access controls
Verify that your ACL configuration works correctly by running access tests as different users. Confirm that users who should have access can invoke the agent, and users who should not have access cannot.
- To test user access to an AI agent, see Test user access to an AI agent.
- To test user access to an agentic workflow, see Test user access to an agentic workflow.
If access test results are unexpected, review your ACL configuration. See Implement access control in Now Assist AI agents for details on how ACLs interact across the agent, workflow, and tool layers.
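One way to keep access tests systematic is to write down the expected outcome for each test user before you run the tests, then compare it with what you observe. The following is a minimal sketch of that comparison in Python. The user names, the `AccessCase` type, and the `observed_results` dictionary are hypothetical placeholders for results you record by hand; nothing here calls a product API.

```python
from typing import Callable, NamedTuple


class AccessCase(NamedTuple):
    user: str                  # hypothetical test user name
    should_have_access: bool   # expectation written down before testing


def find_access_mismatches(
    cases: list[AccessCase],
    can_invoke: Callable[[str], bool],
) -> list[str]:
    """Return one message per user whose observed access differs from the expectation."""
    mismatches = []
    for case in cases:
        observed = can_invoke(case.user)
        if observed != case.should_have_access:
            mismatches.append(
                f"{case.user}: expected access={case.should_have_access}, observed={observed}"
            )
    return mismatches


# Hypothetical results recorded by hand after running the access tests.
observed_results = {"agent.user": True, "itil.user": True, "guest.user": False}

cases = [
    AccessCase("agent.user", True),
    AccessCase("itil.user", False),   # this user should be blocked
    AccessCase("guest.user", False),
]

mismatches = find_access_mismatches(cases, lambda user: observed_results[user])
for line in mismatches:
    print(line)
```

Any line printed by this sketch marks a user whose access differs from the plan, which is the cue to review the ACL configuration.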
Run automated evaluations
Automated evaluations test your agent against a dataset of expected inputs and outputs, providing consistent, repeatable quality measurements. Run evaluations after manual testing is complete to establish a performance baseline before go-live. Make sure the user who runs the evaluations has the required roles, so that access failures are not mistaken for quality failures. For details on this process, see Execute an agentic evaluation run.
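To illustrate what comparing a dataset of expected and actual outputs against a success threshold can look like, here is a simplified sketch in Python. It assumes exact-match scoring after trimming whitespace and lowercasing, which is an illustrative simplification and not necessarily how the product scores an evaluation run. The `EvalRecord` type, the sample records, and the `THRESHOLD` value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    utterance: str   # input sent to the agent
    expected: str    # output the dataset says is correct
    actual: str      # output the agent produced


def pass_rate(records: list[EvalRecord]) -> float:
    """Fraction of records whose actual output matches the expected output,
    ignoring case and surrounding whitespace."""
    if not records:
        raise ValueError("dataset is empty")
    passed = sum(
        1 for r in records
        if r.actual.strip().lower() == r.expected.strip().lower()
    )
    return passed / len(records)


# Hypothetical dataset, invented for illustration only.
records = [
    EvalRecord("Close my incident", "Incident resolved", "Incident resolved"),
    EvalRecord("Reset my password", "Password reset link sent", "password reset link sent "),
    EvalRecord("Escalate my ticket", "Ticket escalated", "Ticket closed"),
    EvalRecord("Find a VPN article", "Knowledge article shared", "Knowledge article shared"),
]

THRESHOLD = 0.9  # hypothetical success criterion; choose your own
rate = pass_rate(records)
meets_threshold = rate >= THRESHOLD
print(f"pass rate {rate:.2f}, meets threshold: {meets_threshold}")
```

Fixing the threshold before you run the evaluation keeps the baseline honest, because the result cannot influence the bar it is measured against.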
Review Guardian logs from testing
Export and review Now Assist Guardian logs from your test runs before going live. The logs show you what content Guardian detected during testing, which helps you decide whether your current blocking configuration is appropriate for production use. See .
If you see unexpected detections in the logs, adjust your Guardian configuration before proceeding. Common causes include overly broad sensitive topic filters or test utterances that trigger offensiveness detection.
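If your exported logs can be read as CSV, a short script can tally detections by type so that unexpected ones stand out. The sketch below is in Python and makes several assumptions: the export format, the column names `utterance` and `detection_type`, and the value `none` for rows without a detection are hypothetical, and the sample rows are invented for illustration. Adapt the names to whatever your actual export contains.

```python
import csv
import io
from collections import Counter

# Invented sample data standing in for an exported log file.
SAMPLE_EXPORT = """utterance,detection_type
Reset my password,none
How do I escalate this stupid ticket,offensiveness
Show payroll policy,sensitive_topic
Show HR policy,sensitive_topic
"""


def count_detections(csv_text: str) -> Counter:
    """Count rows per detection type, skipping rows with no detection."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row["detection_type"]
        for row in reader
        if row["detection_type"] != "none"
    )


counts = count_detections(SAMPLE_EXPORT)
for detection_type, n in counts.most_common():
    print(detection_type, n)
```

A detection type that appears far more often than your test utterances justify can point to an overly broad filter.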
Go-live validation gate
Do not proceed to Go live and monitor until all of the following are true:
- Agent execution tests pass for your defined use case scenarios.
- Access control tests confirm that only intended users can invoke the agent.
- Automated evaluations meet your defined success criteria.
- Guardian logs from testing have been reviewed, and the Guardian configuration is confirmed appropriate for production.
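The gate above is an all-or-nothing check, which can be recorded as a simple checklist. The following Python sketch shows the idea. The criterion labels and the `True`/`False` values are hypothetical entries you would fill in from your own test results.

```python
def unmet_criteria(gate: dict[str, bool]) -> list[str]:
    """Return the names of all gate criteria that are not yet satisfied."""
    return [name for name, ok in gate.items() if not ok]


# Hypothetical status values recorded after testing.
gate = {
    "Agent execution tests pass": True,
    "Access control tests pass": True,
    "Automated evaluations meet success criteria": False,
    "Guardian logs reviewed and configuration confirmed": True,
}

blockers = unmet_criteria(gate)
ready = not blockers  # ready only when every criterion is met

if ready:
    print("All criteria met.")
else:
    print("Not ready. Unmet criteria:")
    for name in blockers:
        print(f"- {name}")
```

Listing the unmet criteria by name, instead of reporting only a single yes or no, shows exactly which test area to revisit.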
Next step
When all validation gate criteria are met, proceed to Go live and monitor.