Activate offensiveness protection for generative AI

Release version: Australia

Updated March 12, 2026

1 minute read

Turn on offensiveness protection to log, and optionally block, offensive content in AI-generated text and conversations.

Before you begin

Role required: sn_generative_ai.nsa_admin

About this task

Generative AI is probabilistic: outputs are based on probabilities, so using the same input twice does not guarantee the same output. Some AI-generated material could be undesirable because of toxicity, sexism, or other offensive sentiment. Now Assist Guardian enables you to log any material that is detected as offensive. You can also choose to block offensive material so that users don't see the generated content. Instead, they see a message stating that offensive material has been detected and blocked.

See Now Assist Guardian for more information.

Logs can be exported for review. For instructions on how to do so, see Export Now Assist Guardian logs.
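
The difference between logging only and logging with blocking can be pictured with a small sketch. This is an illustration of the behavior described above, not Now Assist Guardian's implementation: the names `DetectionImpact`, `Guardrail`, `toy_detector`, and the wording of `BLOCKED_MESSAGE` are all hypothetical, and the keyword check stands in for whatever detection the product actually performs.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class DetectionImpact(Enum):
    """Hypothetical names for the two behaviors this page describes."""
    LOG_ONLY = "log_only"
    LOG_AND_BLOCK = "log_and_block"


# Placeholder wording; the actual message shown to users is not quoted on this page.
BLOCKED_MESSAGE = "Offensive material has been detected and blocked."


@dataclass
class Guardrail:
    impact: DetectionImpact
    is_offensive: Callable[[str], bool]  # stand-in for the real detector
    log: List[str] = field(default_factory=list)

    def apply(self, generated_text: str) -> str:
        """Return what the user would see for one generated response."""
        if not self.is_offensive(generated_text):
            return generated_text
        # A detection is always logged while the guardrail is active.
        self.log.append(generated_text)
        if self.impact is DetectionImpact.LOG_AND_BLOCK:
            # The user sees a standard message instead of the generated text.
            return BLOCKED_MESSAGE
        return generated_text


def toy_detector(text: str) -> bool:
    # Toy rule for illustration only (assumption): flag text containing a keyword.
    return "offensive" in text.lower()
```

With `DetectionImpact.LOG_ONLY`, a flagged response is recorded in `log` and still returned unchanged; with `DetectionImpact.LOG_AND_BLOCK`, it is recorded and replaced by `BLOCKED_MESSAGE`.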

Procedure

1. Navigate to All > Now Assist Admin > Settings.
2. In the side panel, select the Now Assist Guardian > Offensiveness tab.
3. Go to the Available for you tab to see which workflows you can choose from.
   If you have any offensiveness guardrails already activated, they appear in the Active tab.
4. Select Activate for the workflow that you want to enable offensiveness protection on.
5. Select your detection impact.
   When offensiveness protection is activated, Now Assist Guardian logs any offensive content that is detected or generated. You can also choose whether to block the content from the user. If you block the content, the user sees a standardized message explaining that offensive material has been blocked, instead of the generated content.
6. Select Save.

Result

The Now Assist Guardian offensiveness guardrail is enabled on your instance for the workflow that you selected.

What to do next

You can enable offensiveness protection for all Now Assist applications that you have enabled on your instance. To change your detection impact, select the more options icon in the list of active workflows and choose Edit.

You can deactivate offensiveness protection for a workflow at any time by selecting the more options icon and choosing Deactivate.