Activate offensiveness protection for generative AI

  • Release version: Yokohama
  • Updated July 31, 2025
  • Activate offensiveness detection to log or block offensive content generated by Now Assist skills and workflows.

    Before you begin

    Role required: sn_generative_ai.nsa_admin

    About this task

    Generative AI output is probabilistic, which means that the same input can produce different outputs. Some AI-generated content may be offensive, including toxic, sexist, or otherwise harmful language. Now Assist Guardian detects offensive content in both inputs and outputs and logs an event each time it is detected. You can also configure it to block offensive material so that users see a standard error message instead of the generated response.
    Note:
    Offensiveness detection applies only to specific Now Assist skills and workflows. It is not available for all Now Assist applications. For the list of skills that support offensiveness detection, see Now Assist Guardian.

    You can export logs for review. For more information, see Export Now Assist Guardian logs.

    Procedure

    1. Navigate to All > Now Assist Admin > Settings.
    2. In the side panel, select the Now Assist Guardian > Offensiveness tab.
    3. Go to the Available for you tab to see which workflows you can choose from.

      Offensiveness guardrails that are already activated appear in the Active tab.

    4. Select Activate for the workflow for which you want to enable offensiveness detection.
    5. Select the detection impact.
      • Select Log only to record an event when offensive content is detected. The content is still shown to the user.
      • Select Block and log to record the event and prevent the content from being shown to the user. The user sees a standard error message instead.

    6. Select Save.

    Result

    The offensiveness detection guardrail is enabled on your instance for the selected workflow. Events are logged when offensive content is detected in an input or in a generated output.
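
    The difference between the two detection impacts can be illustrated with a minimal Python sketch. This is a conceptual model only, not the Now Assist Guardian implementation or any ServiceNow API: the names DetectionImpact, Guardrail, STANDARD_ERROR_MESSAGE, and the keyword-based detector are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class DetectionImpact(Enum):
    """The two detection impacts named in the procedure (hypothetical model)."""
    LOG_ONLY = "Log only"
    BLOCK_AND_LOG = "Block and log"


# Hypothetical placeholder; the real error message text is not given in this topic.
STANDARD_ERROR_MESSAGE = "The response could not be displayed."


@dataclass
class Guardrail:
    impact: DetectionImpact
    # Hypothetical detector: returns True when the text is considered offensive.
    is_offensive: Callable[[str], bool]
    events: List[str] = field(default_factory=list)

    def apply(self, generated: str) -> str:
        """Return what the user would see for a generated response."""
        if not self.is_offensive(generated):
            return generated
        # Both impacts record the event.
        self.events.append(generated)
        if self.impact is DetectionImpact.BLOCK_AND_LOG:
            # Block and log: the user sees a standard error message instead.
            return STANDARD_ERROR_MESSAGE
        # Log only: the content is still shown to the user.
        return generated


if __name__ == "__main__":
    # Toy detector for demonstration; real detection is far more sophisticated.
    detector = lambda text: "badword" in text.lower()
    log_only = Guardrail(DetectionImpact.LOG_ONLY, detector)
    blocking = Guardrail(DetectionImpact.BLOCK_AND_LOG, detector)
    print(log_only.apply("contains badword"), len(log_only.events))
    print(blocking.apply("contains badword"), len(blocking.events))
```

    In this model, both impacts record the same event; they differ only in whether the user sees the original content or the standard error message.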

    What to do next

    You can enable offensiveness detection separately for each supported Now Assist application and workflow. Repeat this task for each workflow on which you want offensiveness protection enabled.

    To change the detection impact for an active workflow, select the More options icon in the list of active workflows and then select Edit.

    To deactivate offensiveness protection for a workflow, select the More options icon in the list of active workflows and then select Deactivate.