Test and validate

  • Release version: Australia
  • Updated March 26, 2026

    Test your agent's execution and access controls, run automated evaluations, and review Guardian logs before approving the agent for production deployment.

    Testing validates both that your agent performs its intended task correctly and that your security configuration works as designed. Both dimensions must pass before you deploy to production.

    Test agent execution

    Use the testing playground in AI Agent Studio to run manual test executions against your agent using sample utterances. Verify that the agent completes its intended task, uses the correct tools, and handles edge cases and failure scenarios appropriately.

    Test access controls

    Verify that your ACL configuration works correctly by running access tests as different users. Confirm that users who should have access can invoke the agent, and users who should not have access cannot.

    If access test results are unexpected, review your ACL configuration. See Implement access control in Now Assist AI agents for details on how ACLs interact across the agent, workflow, and tool layers.
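
    One way to keep track of access test results is a small expected-versus-observed matrix. The following is a minimal sketch only: the `AccessCase` type, the user names, and the helper function are hypothetical and are not part of AI Agent Studio. You fill in `could_invoke` by hand after running each test as that user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessCase:
    user: str                 # hypothetical test user name
    should_have_access: bool  # what your ACL design intends
    could_invoke: bool        # what you observed when testing as this user


def unexpected_access_results(cases):
    """Return the cases where observed access differs from intended access."""
    return [c for c in cases if c.should_have_access != c.could_invoke]


# Illustrative results only; replace with your own observations.
cases = [
    AccessCase("itil.agent", should_have_access=True, could_invoke=True),
    AccessCase("hr.viewer", should_have_access=False, could_invoke=False),
    AccessCase("contractor", should_have_access=False, could_invoke=True),
]

for case in unexpected_access_results(cases):
    kind = "unintended access" if case.could_invoke else "blocked but should have access"
    print(f"{case.user}: {kind}")
```

    Any row this reports is a reason to revisit the ACL configuration before continuing.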

    Run automated evaluations

    Automated evaluations test your agent against a dataset of expected inputs and outputs, providing consistent, repeatable quality measurements. Run evaluations after manual testing is complete to establish a performance baseline before go-live. For details on this process, see Execute an agentic evaluation run.

    Important:
    The user running an automated evaluation must pass the ACLs of the agent and all agents in the agentic workflow. If the user does not have the required roles, the evaluation will report an access failure rather than an agent execution failure.
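
    Because an access failure says nothing about agent quality, it can help to keep access failures out of the pass rate when you compare a run with your success criteria. The sketch below assumes you have labeled each dataset row by hand with one of three hypothetical outcome strings; neither the labels nor `summarize_run` come from the product.

```python
from collections import Counter

# Hypothetical outcome labels, assigned per dataset row after a run.
PASSED = "passed"
EXECUTION_FAILURE = "execution_failure"
ACCESS_FAILURE = "access_failure"


def summarize_run(outcomes, threshold):
    """Compute a pass rate over rows that actually executed.

    Rows that failed on access are counted separately, and the run is not
    treated as meeting the threshold while any access failure remains.
    """
    counts = Counter(outcomes)
    scored = counts[PASSED] + counts[EXECUTION_FAILURE]
    pass_rate = counts[PASSED] / scored if scored else 0.0
    return {
        "pass_rate": pass_rate,
        "access_failures": counts[ACCESS_FAILURE],
        "meets_threshold": (
            counts[ACCESS_FAILURE] == 0 and scored > 0 and pass_rate >= threshold
        ),
    }


# Illustrative data only: 19 passes and 1 execution failure.
clean_run = summarize_run([PASSED] * 19 + [EXECUTION_FAILURE], threshold=0.9)

# The same run with two rows blocked by ACLs: fix the roles and rerun.
blocked_run = summarize_run(
    [PASSED] * 19 + [EXECUTION_FAILURE] + [ACCESS_FAILURE] * 2, threshold=0.9
)
```

    In this sketch a run with any access failure never meets the threshold, which prompts you to fix the evaluating user's roles and rerun rather than accept a skewed baseline.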

    Review Guardian logs from testing

    Export and review Now Assist Guardian logs from your test runs before going live. The logs show you what content Guardian detected during testing, which helps you decide whether your current blocking configuration is appropriate for production use.

    If you see unexpected detections in the logs, adjust your Guardian configuration before proceeding. Common causes include overly broad sensitive topic filters or test utterances that trigger offensiveness detection.

    Go-live validation gate

    Do not proceed to Go live and monitor until all of the following are true:

    • Agent execution tests pass for your defined use case scenarios.
    • Access control tests confirm that only intended users can invoke the agent.
    • Automated evaluations meet your defined success criteria.
    • Guardian logs from testing have been reviewed, and the Guardian configuration is confirmed to be appropriate for production.
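
    The gate is a logical AND over the four criteria above. As an illustration, here is a minimal sketch of a sign-off checklist; the `ValidationGate` class and its field names are hypothetical and only mirror the list, and the values are set by whoever performs each review.

```python
from dataclasses import dataclass, fields


@dataclass
class ValidationGate:
    # One flag per criterion in the list; names are illustrative.
    execution_tests_pass: bool
    access_tests_pass: bool
    evaluations_meet_criteria: bool
    guardian_logs_reviewed: bool

    def open_items(self):
        """Return the names of criteria that are not yet satisfied."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready_for_go_live(self):
        """The gate opens only when every criterion is satisfied."""
        return not self.open_items()


# Illustrative state: evaluations have not yet met the success criteria.
gate = ValidationGate(
    execution_tests_pass=True,
    access_tests_pass=True,
    evaluations_meet_criteria=False,
    guardian_logs_reviewed=True,
)

if not gate.ready_for_go_live():
    print("Not ready for go-live. Open items:", ", ".join(gate.open_items()))
```

    Listing the open items, rather than returning a bare yes or no, makes it clear which review still blocks deployment.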

    Next step

    When all validation gate criteria are met, proceed to Go live and monitor.