Test and validate

Release version: Australia

Updated March 26, 2026

Test your agent's execution and access controls, run automated evaluations, and review Guardian logs before approving the agent for production deployment.

Testing validates both that your agent performs its intended task correctly and that your security configuration works as designed. Both dimensions must pass before you deploy to production.

Test agent execution

Use the testing playground in AI Agent Studio to run manual test executions against your agent using sample utterances. Verify that the agent completes its intended task, uses the correct tools, and handles edge cases and failure scenarios appropriately.

To test an AI agent execution, see Manually test the execution of an AI agent.
To test an agentic workflow execution, see Manually test the execution of an agentic workflow.

Test access controls

Verify that your ACL configuration works correctly by running access tests as different users. Confirm that users who should have access can invoke the agent, and users who should not have access cannot.

To test user access to an AI agent, see Test user access to an AI agent.
To test user access to an agentic workflow, see Test user access to an agentic workflow.

If access test results are unexpected, review your ACL configuration. See Implement access control in Now Assist AI agents for details on how ACLs interact across the agent, workflow, and tool layers.
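One way to keep track of access test results is to record, for each test user, the outcome you expected and the outcome you observed, then list the mismatches. The sketch below is illustrative only: the `AccessCase` type, the user names, and the recorded outcomes are hypothetical, and it does not call any ServiceNow API. It assumes you ran the access tests yourself and entered the results by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessCase:
    user: str            # hypothetical test user name
    should_invoke: bool  # outcome expected from your access design
    did_invoke: bool     # outcome observed when you ran the access test


def unexpected_results(cases):
    """Return the cases whose observed outcome differs from the expected one."""
    return [c for c in cases if c.should_invoke != c.did_invoke]


# Hand-recorded results from access tests (illustrative values).
cases = [
    AccessCase("agent.user", should_invoke=True, did_invoke=True),
    AccessCase("no.role.user", should_invoke=False, did_invoke=False),
    AccessCase("contractor", should_invoke=False, did_invoke=True),
]

for case in unexpected_results(cases):
    # A user who got in unexpectedly points to an ACL that is too permissive;
    # a user who was blocked unexpectedly points to one that is too restrictive.
    kind = "too permissive" if case.did_invoke else "too restrictive"
    print(f"{case.user}: ACL result is {kind}; review agent, workflow, and tool layers")
```

Any user reported here is a reason to revisit the ACL configuration before continuing.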

Run automated evaluations

Automated evaluations test your agent against a dataset of expected inputs and outputs, providing consistent, repeatable quality measurements. Run evaluations after manual testing is complete to establish a performance baseline before go-live. For details on this process, see Execute an agentic evaluation run.

Important:

The user running an automated evaluation must pass the ACLs of the agent and all agents in the agentic workflow. If the user does not have the required roles, the evaluation will report an access failure rather than an agent execution failure.
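Because an access failure says nothing about agent quality, it can help to separate access failures from execution failures before comparing a run against your success threshold. The following is a minimal sketch under stated assumptions: the record shape, the `status` key, and its three values are hypothetical and do not reflect the actual evaluation output format. Excluding access failures from the pass rate is a design choice made for this example.

```python
from collections import Counter


def summarize_run(results, threshold):
    """Summarize evaluation records.

    Each record is a dict with a hypothetical "status" key whose value is
    "pass", "execution_failure", or "access_failure".
    """
    counts = Counter(r["status"] for r in results)
    # Only records where the agent actually ran count toward the pass rate.
    scored = counts["pass"] + counts["execution_failure"]
    pass_rate = counts["pass"] / scored if scored else 0.0
    return {
        "pass_rate": pass_rate,
        "access_failures": counts["access_failure"],
        # Any access failure means the run should be repeated with the right roles.
        "meets_threshold": counts["access_failure"] == 0 and pass_rate >= threshold,
    }


# Illustrative run: 8 passes, 1 execution failure, 1 access failure.
results = (
    [{"status": "pass"}] * 8
    + [{"status": "execution_failure"}]
    + [{"status": "access_failure"}]
)
summary = summarize_run(results, threshold=0.8)
print(summary)
```

In this illustrative run the pass rate clears the threshold, but the single access failure still marks the run as not meeting it, which signals that the evaluating user's roles need attention first.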

Review Guardian logs from testing

Export and review Now Assist Guardian logs from your test runs before going live. The logs show you what content Guardian detected during testing, which helps you decide whether your current blocking configuration is appropriate for production use.

If you see unexpected detections in the logs, adjust your Guardian configuration before proceeding. Common causes include overly broad sensitive topic filters or test utterances that trigger offensiveness detection.
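When reviewing a large export, a simple tally of detections by type can make unexpected ones easier to spot. The sketch below assumes the exported log has already been loaded into a list of dictionaries. The `detection_type` and `utterance` field names and the sample values are hypothetical placeholders, not the real Guardian log schema, so substitute the column names from your own export.

```python
from collections import Counter


def tally_detections(records, expected_types):
    """Count detections by type and pick out types you did not expect.

    "detection_type" is a hypothetical field name used for illustration.
    """
    counts = Counter(r["detection_type"] for r in records)
    unexpected = {t: n for t, n in counts.items() if t not in expected_types}
    return counts, unexpected


# Made-up sample records standing in for rows of an exported log.
records = [
    {"detection_type": "sensitive_topic", "utterance": "test input A"},
    {"detection_type": "sensitive_topic", "utterance": "test input B"},
    {"detection_type": "offensiveness", "utterance": "test input C"},
]

counts, unexpected = tally_detections(records, expected_types={"sensitive_topic"})
for detection_type, n in unexpected.items():
    print(f"Unexpected detection type {detection_type!r}: {n} occurrence(s)")
```

Each unexpected type is a prompt to look at the triggering test utterances and decide whether the filter or the test data needs to change.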

Go-live validation gate

Do not proceed to Go live and monitor until all of the following are true:

Agent execution tests pass for your defined use case scenarios.
Access control tests confirm that only intended users can invoke the agent.
Automated evaluations meet your defined success threshold.
Guardian logs from testing have been reviewed and the Guardian configuration is confirmed appropriate for production.
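The gate is an all-or-nothing check: a single unmet criterion blocks go-live. As a rough illustration, the four criteria above can be tracked as booleans and the open items reported. The `GateStatus` type and its field names are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, fields


@dataclass
class GateStatus:
    execution_tests_pass: bool
    access_tests_pass: bool
    evaluations_meet_threshold: bool
    guardian_logs_reviewed: bool


def open_items(status):
    """Return the names of the criteria that are not yet satisfied."""
    return [f.name for f in fields(status) if not getattr(status, f.name)]


# Illustrative status: evaluations have not yet met the threshold.
status = GateStatus(
    execution_tests_pass=True,
    access_tests_pass=True,
    evaluations_meet_threshold=False,
    guardian_logs_reviewed=True,
)

blocking = open_items(status)
print("Ready for go-live" if not blocking else f"Blocked by: {', '.join(blocking)}")
```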

Next step

When all validation gate criteria are met, proceed to Go live and monitor.