LiteLLM Guardrails
Overview
Fiddler implements the LiteLLM Generic Guardrail API spec, allowing you to plug Fiddler’s guardrails directly into a LiteLLM proxy gateway. Once configured, every LLM request routed through the proxy is checked by Fiddler before it reaches the model.
Fiddler checks for:
- Secrets: API keys, tokens, and credentials in prompts
- PII: Personally identifiable information (names, emails, phone numbers, SSNs, etc.)
Each check can independently block or redact — see Check Behavior for details.
How It Works
LiteLLM calls the Fiddler endpoint at the pre_call stage with the extracted text from the request. Fiddler returns one of three actions:
| Action | Meaning | LiteLLM behavior |
|---|---|---|
| NONE | No issues detected | Forwards the request unchanged |
| GUARDRAIL_INTERVENED | Sensitive content redacted | Forwards the request with the redacted texts |
| BLOCKED | Request must not proceed | Returns an error to the client |
Checks run at the pre_call stage, on the request before it reaches the model. Response-side (post_call) scanning is not yet supported — configure mode: [pre_call].
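The three actions can be read as a small dispatch rule. The following is a minimal sketch of that rule, not LiteLLM's actual implementation; `apply_guardrail_action` and `GuardrailBlocked` are hypothetical names introduced only for illustration.

```python
from typing import Any


class GuardrailBlocked(Exception):
    """Hypothetical error standing in for whatever the gateway raises on BLOCKED."""


def apply_guardrail_action(original_texts: list[str], response: dict[str, Any]) -> list[str]:
    """Return the texts that should be forwarded to the model, or raise if blocked."""
    action = response.get("action")
    if action == "NONE":
        # Nothing detected: forward the request unchanged.
        return original_texts
    if action == "GUARDRAIL_INTERVENED":
        # Sensitive content was redacted: forward the redacted texts instead.
        return response["texts"]
    if action == "BLOCKED":
        # The request must not proceed: surface the reason to the caller.
        raise GuardrailBlocked(response.get("blocked_reason", "blocked by guardrail"))
    raise ValueError(f"unexpected action: {action!r}")
```

In this sketch an unknown action raises rather than passing through, which is a design assumption and not documented behavior.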
Quick Start
Step 1: Configure the guardrail
Add the Fiddler guardrail to your LiteLLM config.yaml:
guardrails:
  - guardrail_name: "fiddler"
    litellm_params:
      guardrail: generic_guardrail_api
      mode: [pre_call]
      api_base: https://<your-fiddler-instance>/v3/guardrails/litellm
      default_on: true
      headers:
        Authorization: "Bearer <your-fiddler-api-key>"
      additional_provider_specific_params:
        pii:
          enabled: true
          config:
            threshold: 0.8
        secrets:
          enabled: true
          config: {}
LiteLLM appends /beta/litellm_basic_guardrail_api to api_base automatically. The full endpoint called will be https://<your-fiddler-instance>/v3/guardrails/litellm/beta/litellm_basic_guardrail_api.
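To sanity-check the api_base value before starting the proxy, the URL composition described above can be sketched as follows. `full_guardrail_url` is a hypothetical helper, and stripping a trailing slash is an assumption of this sketch rather than documented LiteLLM behavior.

```python
# Suffix quoted in the text above.
GUARDRAIL_PATH = "/beta/litellm_basic_guardrail_api"


def full_guardrail_url(api_base: str) -> str:
    """Join api_base and the suffix; a trailing slash on api_base is tolerated (assumption)."""
    return api_base.rstrip("/") + GUARDRAIL_PATH


if __name__ == "__main__":
    # fiddler.example.com is a placeholder host, not a real instance.
    print(full_guardrail_url("https://fiddler.example.com/v3/guardrails/litellm"))
```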
Step 2: Start the proxy
litellm --config config.yaml --port 4000
Step 3: Verify
Send a request with a known secret:
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-master-key>" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"My API key is sk-ant-api03-abcdefghijklmnopqrstu"}]}'
The secret will be redacted to [REDACTED ANTHROPIC_API_KEY] before the request reaches the model.
Check Behavior
Each check has two configurable aspects:
- Mode (block / redact / off): controls what happens when a detection is found. Set server-side via environment variables on the Fiddler deployment.
- Threshold and entities: control detection sensitivity. Set as proxy-wide defaults via additional_provider_specific_params in the LiteLLM config.yaml.
If either pii or secrets is specified in additional_provider_specific_params, only the checks explicitly listed with enabled: true will run, regardless of server-side environment variable settings. To tune one check without silently disabling the other, list both keys with an explicit enabled: true or enabled: false.
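The selection rule above is easy to misread, so here is a minimal sketch of it under a literal reading of the text. `checks_that_run` and the `server_enabled` mapping are hypothetical; the latter stands in for the server-side environment variable settings (for example, a check whose mode is not off).

```python
from typing import Any, Optional

CHECKS = ("pii", "secrets")


def checks_that_run(
    params: Optional[dict[str, Any]],
    server_enabled: dict[str, bool],
) -> set[str]:
    """Return the names of the checks that would run for one request.

    params: the additional_provider_specific_params mapping, or None.
    server_enabled: hypothetical stand-in for the server-side settings.
    """
    params = params or {}
    if not any(name in params for name in CHECKS):
        # Neither key is present: the server-side settings decide.
        return {name for name in CHECKS if server_enabled.get(name, False)}
    # At least one key is present: only checks listed with enabled: true run.
    return {name for name in CHECKS if (params.get(name) or {}).get("enabled") is True}
```

For example, passing only `{"pii": {"enabled": True}}` yields `{"pii"}` in this sketch, which is the silent disabling of secrets that the text warns about.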
Secrets
Detects credentials, API keys, and tokens.
Default mode: redact (set via GUARDRAILS_SECRETS_MODE on the Fiddler server)
| Mode | Behavior in free-text messages | Behavior in tool call arguments |
|---|---|---|
| redact (default) | Value replaced with [REDACTED <type>], request forwarded | Always blocked (structured JSON cannot be safely redacted) |
| block | Request blocked entirely | Request blocked |
| off | Check skipped | Check skipped |
PII
Detects personally identifiable information.
Default mode: redact (set via GUARDRAILS_PII_MODE on the Fiddler server)
| Mode | Behavior in free-text messages | Behavior in tool call arguments |
|---|---|---|
| redact (default) | Value replaced with [REDACTED <type>], request forwarded | Always blocked (structured JSON cannot be safely redacted) |
| block | Request blocked entirely | Request blocked |
| off | Check skipped | Check skipped |
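The Secrets and PII mode tables follow the same decision rule, which can be summarized in a few lines. This is a sketch that assumes the modes map onto the NONE, GUARDRAIL_INTERVENED, and BLOCKED actions exactly as the tables imply; `decide_action` is a hypothetical name, not part of the Fiddler API.

```python
def decide_action(mode: str, detected: bool, in_tool_call: bool) -> str:
    """Map a check's mode and a detection result to a guardrail action.

    mode: "redact", "block", or "off".
    detected: whether the check found sensitive content.
    in_tool_call: True if the content is in tool call arguments,
        False if it is in a free-text message.
    """
    if mode not in ("redact", "block", "off"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "off" or not detected:
        # Check skipped, or nothing found.
        return "NONE"
    if mode == "block" or in_tool_call:
        # Tool call arguments are never redacted, only blocked.
        return "BLOCKED"
    # mode == "redact" on free text: the value is replaced and the request forwarded.
    return "GUARDRAIL_INTERVENED"
```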
Entities checked by default include: person, email, phone number, social security number, credit card number, bank account number, passport number, driver's license number, date of birth, address, ip address, iban, cvv, cvc, tax identification number, digital signature, license plate number, postal code, and more. For the complete list see the PII Detection tutorial.
Custom entities and threshold can be configured as proxy-wide defaults via additional_provider_specific_params in your LiteLLM config.yaml:
additional_provider_specific_params:
  pii:
    enabled: true
    config:
      threshold: 0.8  # detection confidence threshold (default: 0.8)
      entities:       # optional: override default entity list
        - person
        - email
        - phone number
When an LLM’s tool_calls contain PII or secrets, Fiddler blocks the request instead of redacting it, whether the check's mode is redact or block.
Why: Tool call arguments are structured JSON that the downstream application will parse and execute. Replacing a value like an email address with [REDACTED EMAIL_ADDRESS] would cause send_email to attempt delivery to a nonsensical address — producing unpredictable behavior that is worse than blocking outright.
Example — blocked tool call:
{
  "role": "assistant",
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "send_email",
      "arguments": "{\"to\": \"robin@example.com\", \"body\": \"Phone: 9916308047\"}"
    }
  }]
}
Fiddler returns:
{"action": "BLOCKED", "blocked_reason": "PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request."}
LiteLLM then translates this into an error for the client (HTTP 400 on LiteLLM ≥ 1.87.0, HTTP 500 on earlier versions).
API Reference
Endpoint
POST /v3/guardrails/litellm/beta/litellm_basic_guardrail_api
Authentication: Authorization: Bearer <your-fiddler-api-key>
Request
{
  "texts": ["string"],
  "tool_calls": [{"id": "...", "type": "function", "function": {"name": "...", "arguments": "..."}}],
  "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Get current weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}}],
  "structured_messages": [{"role": "user", "content": "..."}],
  "images": ["base64-encoded-string"],
  "request_data": {"user_api_key_alias": "my-key", "user_api_key_team_id": "team-123"},
  "request_headers": {"content-type": "application/json"},
  "input_type": "request",
  "additional_provider_specific_params": {
    "pii": {"enabled": true, "config": {"threshold": 0.8}},
    "secrets": {"enabled": true, "config": {}}
  },
  "litellm_call_id": "uuid",
  "litellm_trace_id": "uuid",
  "litellm_version": "1.87.0"
}
Only texts and tool_calls[].function.arguments are scanned by guardrail checks. The remaining fields are accepted for LiteLLM protocol compatibility.
| Field | Type | Required | Description |
|---|---|---|---|
| texts | string[] | Yes | Extracted text strings from the request messages (scanned) |
| request_data | object | Yes | LiteLLM virtual key metadata (user, team, org, key hash) |
| input_type | "request" \| "response" | Yes | Whether this is a pre-call or post-call check |
| tool_calls | object[] \| null | No | Tool invocations in OpenAI format (arguments are scanned) |
| tools | object[] \| null | No | Tool definitions in OpenAI format (not scanned) |
| structured_messages | object[] \| null | No | Full messages array in OpenAI format (not scanned) |
| images | string[] \| null | No | Base64-encoded images (not scanned) |
| request_headers | object \| null | No | Inbound request headers (not scanned) |
| additional_provider_specific_params | object \| null | No | Per-check configuration (see Quick Start) |
| litellm_call_id | string \| null | No | LiteLLM call ID for tracing |
| litellm_trace_id | string \| null | No | LiteLLM trace ID for tracing |
| litellm_version | string \| null | No | LiteLLM library version |
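For testing the endpoint by hand, a request body containing only the required fields plus optional tool calls can be assembled as sketched below. `build_guardrail_request` and `scanned_strings` are hypothetical helpers; the sketch assumes that optional fields may be omitted rather than sent as null, which the table suggests but does not state outright.

```python
from typing import Any, Optional


def build_guardrail_request(
    texts: list[str],
    request_data: dict[str, Any],
    input_type: str = "request",
    tool_calls: Optional[list[dict[str, Any]]] = None,
    additional_provider_specific_params: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a request body with the three required fields and any optional ones given."""
    if input_type not in ("request", "response"):
        raise ValueError(f"input_type must be 'request' or 'response', got {input_type!r}")
    body: dict[str, Any] = {
        "texts": texts,
        "request_data": request_data,
        "input_type": input_type,
    }
    optional = {
        "tool_calls": tool_calls,
        "additional_provider_specific_params": additional_provider_specific_params,
    }
    # Assumption: optional fields left as None are simply omitted from the body.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


def scanned_strings(body: dict[str, Any]) -> list[str]:
    """Collect the strings described as scanned: texts and tool_calls[].function.arguments."""
    strings = list(body["texts"])
    for call in body.get("tool_calls") or []:
        strings.append(call["function"]["arguments"])
    return strings
```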
Response
Fields with null values are omitted from the wire (exclude_none=True). The response shape varies by action:
// action: NONE
{"action": "NONE"}
// action: BLOCKED
{"action": "BLOCKED", "blocked_reason": "PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request."}
// action: GUARDRAIL_INTERVENED
{"action": "GUARDRAIL_INTERVENED", "texts": ["My key is [REDACTED ANTHROPIC_API_KEY]"]}
| Field | Type | Description |
|---|---|---|
| action | string | One of NONE, BLOCKED, or GUARDRAIL_INTERVENED |
| blocked_reason | string | Human-readable reason; present only when action is BLOCKED |
| texts | string[] | Redacted texts; present only when action is GUARDRAIL_INTERVENED |
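The effect of omitting null fields can be illustrated with a plain-dictionary sketch. This is not Fiddler's server code, and `to_wire` is a hypothetical helper that mimics the exclude_none behavior described above without depending on any modeling library.

```python
import json
from typing import Any, Optional


def to_wire(
    action: str,
    blocked_reason: Optional[str] = None,
    texts: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build a response body, dropping every field whose value is None."""
    fields = {"action": action, "blocked_reason": blocked_reason, "texts": texts}
    return {key: value for key, value in fields.items() if value is not None}


if __name__ == "__main__":
    print(json.dumps(to_wire("NONE")))
    print(json.dumps(to_wire("GUARDRAIL_INTERVENED", texts=["My key is [REDACTED ANTHROPIC_API_KEY]"])))
```

A client should therefore test for key presence (for example with `dict.get`) instead of expecting explicit null values.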
Observed Behavior
Minimum required version: LiteLLM ≥ 1.87.0. Earlier versions return HTTP 500 for all guardrail blocks due to a bug in GuardrailRaisedException (missing status_code attribute). The Fiddler team identified and fixed this upstream in BerriAI/litellm#27617. On 1.87.0+, blocked requests correctly return HTTP 400.
| Scenario | Action | HTTP status | Notes |
|---|---|---|---|
| Secret in user message | GUARDRAIL_INTERVENED | 200 | Redacted in-place; LLM receives sanitized text |
| PII in tool call arguments | BLOCKED | 400 (500 pre-1.87) | Cannot redact structured JSON |
| Guardrail service unreachable | Pass-through | 200 | LiteLLM fails open by default |
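Client test suites that must run against more than one LiteLLM version may want to compute the expected status for a blocked request. The sketch below encodes only the 1.87.0 cutoff stated above; `expected_block_status` is a hypothetical helper, and the parser assumes plain dotted numeric versions without pre-release suffixes.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a dotted numeric version such as "1.87.0", padded to three parts."""
    parts = tuple(int(part) for part in version.split("."))
    return parts + (0,) * (3 - len(parts))


def expected_block_status(litellm_version: str) -> int:
    """Return the HTTP status a blocked request is expected to produce."""
    # Versions are compared numerically, so "1.100.0" sorts after "1.87.0".
    return 400 if parse_version(litellm_version) >= (1, 87, 0) else 500
```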