Azure Content Safety Endpoint
Azure Content Safety Policy
Policy Name |
Azure Content Safety Policy |
Summary |
Evaluates LLM prompts and responses against Azure AI Content Safety for harmful content, jailbreak attempts, hallucinations, and copyrighted material |
Category |
LLM |
First Omni Gateway version available |
v1.13.0 |
Returned Status Codes |
403 - Forbidden: Content violates Azure Content Safety policies |
Summary
The Azure Content Safety policy provides comprehensive content moderation for LLM-based APIs by evaluating prompts and responses against the Azure AI Content Safety service. The policy integrates with Azure AI Content Safety to enforce content safety policies including:
-
Content filters: Detects and blocks harmful content across four categories (Hate, SelfHarm, Sexual, and Violence) with configurable severity thresholds (
0,2,4,6). -
Prompt Shield: Detects jailbreak attempts and indirect prompt-injection attacks using Azure’s advanced detection capabilities.
-
Blocklists: Blocks content matching Azure-managed custom blocklists you define in Azure Content Safety Studio.
-
Groundedness Detection: Evaluates whether LLM responses are grounded in provided reference text (hallucination detection for RAG applications).
-
Protected Material Detection: Detects known copyrighted text in LLM responses.
The policy operates in two independent phases:
-
Request phase: Moderates user prompts before they reach the upstream LLM, preventing harmful or inappropriate prompts from being processed.
-
Response phase: Moderates LLM responses before they reach the client, ensuring outputs comply with safety policies. Streaming responses (
text/event-stream) aren’t moderated.
When content violates safety policies, the request is rejected with a 403 error code and never reaches the LLM (request phase) or the client receives a 403 instead of the LLM response (response phase).
Before You Begin
Before configuring this policy, you need:
-
Azure Account with access to Azure AI Content Safety
-
Azure Content Safety Resource created in the Azure Portal
-
Subscription Key from the Azure Portal (Keys and Endpoint section)
-
Resource Endpoint URL from the Azure Portal
Configuring Policy Parameters
Managed Omni Gateway and Omni Gateway Connected Mode
When you apply the policy from the UI, the following parameters are displayed:
Basic Configuration
| Element | Required | Description |
|---|---|---|
Yes |
Azure Content Safety endpoint base URL, for example, |
|
Azure API Key |
Yes |
Azure Content Safety subscription key. The key is passed in the |
Moderate Request |
No |
When enabled, evaluates the user prompt against Azure Content Safety before forwarding to the upstream LLM. Rejected prompts never reach the LLM. |
Moderate Response |
No |
When enabled, evaluates the LLM response against Azure Content Safety before returning to the client. Rejected responses return |
Default Severity Threshold |
No |
Severity threshold applied to all four harm categories (Hate, SelfHarm, Sexual, Violence). Content with severity at or above this value is rejected. Supported values are For details, see Severity Thresholds. |
Advanced Configuration
| Element | Required | Description |
|---|---|---|
API Version |
No |
Azure Content Safety API version used for all endpoint calls. Default: To enable Groundedness Detection, override this to a supported preview version, for example, |
Enable Prompt Shield |
No |
When enabled, calls |
Hate Severity Threshold |
No |
Per-category override for the Hate harm category. Leave as |
SelfHarm Severity Threshold |
No |
Per-category override for the SelfHarm harm category. Leave as |
Blocklist Names |
No |
Names of Azure-managed blocklists (created in Azure Content Safety Studio) to evaluate against the input text. |
Enable Groundedness Detection |
No |
When enabled, calls |
Grounding Source Selector |
No |
DataWeave expression resolving the grounding source text from the request body. Required when Groundedness Detection is enabled. For example: |
Enable Protected Material Detection |
No |
When enabled, calls |
API Timeout (ms) |
No |
Per-request timeout for each Azure Content Safety API call in milliseconds. Must be between 1000 and 30000. Default: |
Fail Open |
No |
Determines behavior when the Azure API call fails or times out:
|
How This Policy Works
The Azure Content Safety policy integrates with Azure AI Content Safety to evaluate LLM prompts and responses against configurable safety policies.
Request and Response Moderation
The policy supports independent evaluation for request and response:
-
Request Phase (when
moderateRequestis enabled):-
The policy extracts the user prompt from the request body.
-
The policy sends the prompt to Azure Content Safety APIs in parallel:
-
text:analyze— Evaluates harm categories (Hate, SelfHarm, Sexual, Violence) and checks blocklists -
text:shieldPrompt— Detects jailbreak attempts and indirect injections (whenenablePromptShieldis enabled)
-
-
If the prompt violates any policies, the policy blocks the request and returns a
403error code to the client. -
If the prompt passes, the policy forwards the original request to the upstream LLM.
If grounding source and query selectors are configured, the policy extracts them during request processing and stores the values for use in response processing.
-
-
Response Phase (when
moderateResponseis enabled):-
The policy intercepts the LLM response.
-
The policy sends the completion to Azure Content Safety APIs in parallel:
-
text:analyze— Evaluates harm categories and checks blocklists -
text:detectGroundedness— Scores the response against the grounding source (when enabled and grounding source is available) -
text:detectProtectedMaterial— Detects known copyrighted text (when enabled)
-
-
If the response violates any policies, the policy returns a
403error code to the client. -
If the response passes, the policy forwards the original response to the client.
Streaming responses ( text/event-stream) are skipped and pass through without moderation.
-
Groundedness Detection
Groundedness detection helps detect hallucinations by evaluating whether LLM responses are grounded in the provided reference text. This is particularly useful for RAG (Retrieval-Augmented Generation) applications.
To enable groundedness detection:
-
Enable groundedness detection in Advanced Configuration (
enableGroundednessDetection: true). -
Configure the Grounding Source Selector to extract the reference text from the request body.
-
Set the API Version to a preview version that supports the groundedness endpoint (for example,
2024-09-15-preview).
The grounding source selector is a DataWeave expression that extracts the reference text the LLM response should be based on (typically documents or context provided in the request).
advancedConfiguration:
apiVersion: "2024-09-15-preview"
enableGroundednessDetection: true
groundingSourceSelector: "#[payload.context.documents[0].content]"
Severity Thresholds
Azure AI Content Safety evaluates content across four harm categories and assigns a severity level to each:
-
0 — Safe (no harmful content detected)
-
2 — Low severity
-
4 — Medium severity
-
6 — High severity
The policy rejects content when the severity level is at or above the configured threshold:
| Threshold Value | Content Blocked |
|---|---|
0 |
Any flagged content (severity > 0) is blocked. This is the strictest setting. |
2 |
Content with low, medium, or high severity is blocked. This is the recommended default. |
4 |
Content with medium or high severity is blocked. This allows low-severity content. |
6 |
Only high-severity content is blocked. This is the most permissive setting. |
You can configure a default threshold that applies to all categories, and optionally override the threshold for specific categories.
Response Headers
Every moderated response includes observability headers:
| Header | Values | Description |
|---|---|---|
|
|
Final moderation decision. |
|
|
Which phase performed the moderation. Useful for understanding whether the prompt or response was blocked. |
|
|
Why the content was rejected. Multiple reasons are comma-separated if the content violated multiple policies. |
Response on Reject
When the policy blocks content, it returns a 403 response with this structure:
{
"error": "Content blocked by Azure Content Safety",
"categories": ["severity_hate", "blocklist"]
}
Error Handling
Fail Closed (Default)
When Fail Open is disabled (the default), any Azure API error or timeout results in rejection:
-
Request phase: HTTP
503with body{"error":"Azure Content Safety service unavailable"} -
Response phase: The response is rewritten to HTTP
503with the same body -
The
x-llm-proxy-azure-content-safety-reasonheader is set toservice_unavailable
Example Configurations
Minimal Configuration — Request and Response Moderation
- policyRef:
name: azure-content-safety-policy-v1-0-impl
namespace: default
config:
azureEndpoint: https://llmproxy-azure-cs.cognitiveservices.azure.com
azureApiKey: "${AZURE_CONTENT_SAFETY_KEY}"
Strict Threshold with Blocklists
This example blocks any flagged content and applies custom blocklists:
- policyRef:
name: azure-content-safety-policy-v1-0-impl
namespace: default
config:
azureEndpoint: https://llmproxy-azure-cs.cognitiveservices.azure.com
azureApiKey: "${AZURE_CONTENT_SAFETY_KEY}"
defaultSeverityThreshold: 0
advancedConfiguration:
blocklistNames:
- competitor-names
- internal-codenames
Per-Category Thresholds
This example applies different thresholds to different harm categories:
- policyRef:
name: azure-content-safety-policy-v1-0-impl
namespace: default
config:
azureEndpoint: https://llmproxy-azure-cs.cognitiveservices.azure.com
azureApiKey: "${AZURE_CONTENT_SAFETY_KEY}"
defaultSeverityThreshold: 2
advancedConfiguration:
selfHarmSeverityThreshold: "4" # Allow low-severity SelfHarm discussion
violenceSeverityThreshold: "0" # Block any flagged Violence
With Groundedness Detection (Hallucination Detection)
- policyRef:
name: azure-content-safety-policy-v1-0-impl
namespace: default
config:
azureEndpoint: https://llmproxy-azure-cs-eastus.cognitiveservices.azure.com
azureApiKey: "${AZURE_CONTENT_SAFETY_KEY}"
defaultSeverityThreshold: 4
advancedConfiguration:
apiVersion: "2024-09-15-preview"
enableGroundednessDetection: true
groundingSourceSelector: "#[payload.context.documents[0].content]"
enableProtectedMaterialDetection: true
blocklistNames:
- my-blocklist
apiTimeoutMs: 10000
failOpen: false



