Configuring Moderation Operations
Configure the Toxicity Detection by Text operation.
Configure the Toxicity Detection by Text Operation
The Toxicity Detection by Text operation classifies and scores potentially harmful content in text produced by either the user or the LLM.
Apply the Toxicity Detection by Text operation in various scenarios, such as:
- Toxic Inputs Detection: Detect and block toxic user input before it is sent to the LLM.
- Harmful Responses Detection: Filter out LLM responses that users could consider toxic or offensive.
To configure the Toxicity Detection by Text operation:
1. Select the operation on the Anypoint Code Builder or Studio canvas.
2. In the General properties tab for the operation, enter these values:
   - Text: Text to check for harmful content.
3. Optionally, in the Additional Request Attributes field, pass additional request attributes in the request payload to generate more relevant and precise LLM outputs.
This is the XML for this operation:
<ms-inference:toxicity-detection-text
    doc:name="Toxicity detection text"
    doc:id="b5770a5b-d3f9-47ba-acec-ab0bd41e4188"
    config-ref="OpenAIConfig">
    <ms-inference:text>
        <![CDATA[You are fat]]>
    </ms-inference:text>
</ms-inference:toxicity-detection-text>
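To cover the toxic input detection scenario described earlier, the operation can run before the prompt reaches the LLM, with a Choice router stopping the flow when the input is flagged. The following is a minimal sketch, not part of the connector documentation: the flow name, the vars.userPrompt variable, the APP:TOXIC_INPUT error type, and the assumption that the operation's output payload exposes the flagged field shown in the example response later in this section are all illustrative, and namespace declarations are omitted.
<flow name="block-toxic-input-flow">
    <!-- Check the incoming prompt (assumed to be stored in vars.userPrompt) -->
    <ms-inference:toxicity-detection-text
        doc:name="Toxicity detection text"
        config-ref="OpenAIConfig">
        <ms-inference:text><![CDATA[#[vars.userPrompt]]]></ms-inference:text>
    </ms-inference:toxicity-detection-text>
    <choice>
        <!-- Assumes the operation's payload contains the flagged field from the example response -->
        <when expression="#[payload.flagged]">
            <!-- Stop the flow so the flagged prompt is never sent to the LLM -->
            <raise-error type="APP:TOXIC_INPUT" description="User input was flagged as toxic"/>
        </when>
        <otherwise>
            <!-- Safe to continue: call the LLM with vars.userPrompt here -->
            <logger level="INFO" message="Input passed the toxicity check"/>
        </otherwise>
    </choice>
</flow>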
Output Configuration
This operation responds with a JSON payload containing the overall toxicity flag and a score for each toxicity category. This is an example response:
{
  "payload": {
    "flagged": true,
    "categories": [
      {
        "illicit/violent": 0.0000025466403947055455,
        "self-harm/instructions": 0.00023480495744356635,
        "harassment": 0.9798945372458964,
        "violence/graphic": 0.000005920916517463734,
        "illicit": 0.000013552078562406772,
        "self-harm/intent": 0.0002233150331012493,
        "hate/threatening": 0.0000012029639084557005,
        "sexual/minors": 0.0000024300240743279605,
        "harassment/threatening": 0.0007499928075102617,
        "hate": 0.00720390551996062,
        "self-harm": 0.0004822186797755494,
        "sexual": 0.00012644219446392274,
        "violence": 0.0004960569708019355
      }
    ]
  }
}
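To act on this response in a flow, you can reduce it to the flag plus the highest-scoring category. The following Transform Message sketch is illustrative and not part of the connector documentation; it assumes the Mule message payload is the inner payload object from the example above, so payload.flagged and payload.categories resolve as shown.
<ee:transform doc:name="Summarize toxicity result">
    <ee:message>
        <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
{
    // Overall flag returned by the operation
    flagged: payload.flagged,
    // Highest-scoring category from the first entry of the categories array
    topCategory: (payload.categories[0]
        pluck ((score, category) -> { category: category as String, score: score })
        orderBy -$.score)[0]
}]]></ee:set-payload>
    </ee:message>
</ee:transform>
With the example response above, topCategory resolves to harassment with a score of about 0.98.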