LiteLLM Guardrails

Overview

Fiddler implements the LiteLLM Generic Guardrail API spec, allowing you to plug Fiddler’s guardrails directly into a LiteLLM proxy gateway. Once configured, every LLM request routed through the proxy is checked by Fiddler before it reaches the model. Fiddler checks for:
  • Secrets — API keys, tokens, and credentials in prompts
  • PII — Personally identifiable information (names, emails, phone numbers, SSNs, etc.)
Each check can independently block or redact — see Check Behavior for details.

How It Works

LiteLLM calls the Fiddler endpoint at the pre_call stage with the extracted text from the request. Fiddler returns one of three actions:
Action                Meaning                     LiteLLM behavior
NONE                  No issues detected          Forwards the request unchanged
GUARDRAIL_INTERVENED  Sensitive content redacted  Forwards the request with redacted texts
BLOCKED               Request must not proceed    Returns an error to the client
Checks run only at the pre_call stage, on the request before it reaches the model. Response-side (post_call) scanning is not yet supported, so set mode: [pre_call].
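The three actions can be read as a small dispatch over the response body. The sketch below is only an illustration of the documented behavior, not LiteLLM's actual implementation; the function name apply_guardrail_result and the exception GuardrailBlocked are hypothetical names chosen for this example.

```python
class GuardrailBlocked(Exception):
    """Hypothetical error type; LiteLLM's own exception class may differ."""


def apply_guardrail_result(texts, result):
    """Return the texts to forward to the model, or raise if blocked."""
    action = result["action"]
    if action == "NONE":
        # No issues detected: forward the request unchanged.
        return texts
    if action == "GUARDRAIL_INTERVENED":
        # Sensitive content was redacted: forward the redacted texts instead.
        return result["texts"]
    if action == "BLOCKED":
        # The request must not proceed: surface the reason to the caller.
        raise GuardrailBlocked(result.get("blocked_reason", "blocked by guardrail"))
    raise ValueError(f"unexpected action: {action!r}")
```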

Quick Start

Step 1: Configure LiteLLM

Add the Fiddler guardrail to your LiteLLM config.yaml:
guardrails:
  - guardrail_name: "fiddler"
    litellm_params:
      guardrail: generic_guardrail_api
      mode: [pre_call]
      api_base: https://<your-fiddler-instance>/v3/guardrails/litellm
      default_on: true
      headers:
        Authorization: "Bearer <your-fiddler-api-key>"
      additional_provider_specific_params:
        pii:
          enabled: true
          config:
            threshold: 0.8
        secrets:
          enabled: true
          config: {}
LiteLLM appends /beta/litellm_basic_guardrail_api to api_base automatically. The full endpoint called will be https://<your-fiddler-instance>/v3/guardrails/litellm/beta/litellm_basic_guardrail_api.
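To double-check which URL your Fiddler instance will receive traffic on, the suffix rule above can be reproduced in a few lines. This is a sketch: the helper name is hypothetical, and stripping a trailing slash is this example's own choice rather than documented LiteLLM behavior.

```python
GUARDRAIL_SUFFIX = "/beta/litellm_basic_guardrail_api"


def full_guardrail_endpoint(api_base: str) -> str:
    """Join api_base with the suffix that LiteLLM appends."""
    # Trailing-slash handling is an assumption made to avoid a double slash.
    return api_base.rstrip("/") + GUARDRAIL_SUFFIX
```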

Step 2: Start the proxy

litellm --config config.yaml --port 4000

Step 3: Verify

Send a request with a known secret:
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-master-key>" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"My API key is sk-ant-api03-abcdefghijklmnopqrstu"}]}'
The secret will be redacted to [REDACTED ANTHROPIC_API_KEY] before the request reaches the model.
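If you would rather run the same check from a script, the curl call can be approximated with the Python standard library. This is a minimal sketch that only builds the request; the function name build_verify_request is hypothetical, and the proxy URL and key are the placeholders from the curl example.

```python
import json
import urllib.request


def build_verify_request(proxy_url: str, master_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "user", "content": "My API key is sk-ant-api03-abcdefghijklmnopqrstu"}
        ],
    }
    return urllib.request.Request(
        url=proxy_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {master_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends it to a running proxy.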

Check Behavior

Each check has two configurable aspects:
  • Mode (block / redact / off) — controls what happens when a detection is found. Set server-side via environment variables on the Fiddler deployment.
  • Threshold and entities — controls detection sensitivity. Set as proxy-wide defaults via additional_provider_specific_params in the LiteLLM config.yaml.
If either pii or secrets is specified in additional_provider_specific_params, only the checks explicitly listed with enabled: true will run, regardless of server-side environment variable settings. To tune one check without silently disabling the other, list both keys with an explicit enabled: true or enabled: false.
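The override rule is easy to misread, so here is one way to model it. This is a sketch of the rule as described above, not Fiddler's server code; checks_to_run and server_modes are hypothetical names, and the fallback (a check runs unless its server-side mode is off) is an assumption drawn from the mode descriptions.

```python
KNOWN_CHECKS = ("pii", "secrets")


def checks_to_run(params, server_modes):
    """Return the set of check names that would run for one request.

    params: the additional_provider_specific_params mapping, or None.
    server_modes: check name -> "block" | "redact" | "off" (server-side env vars).
    """
    params = params or {}
    if any(name in params for name in KNOWN_CHECKS):
        # Once any check is listed, only those with enabled: true run.
        return {
            name for name in KNOWN_CHECKS
            if params.get(name, {}).get("enabled") is True
        }
    # Assumption: with no per-check params, server-side modes decide.
    return {name for name in KNOWN_CHECKS if server_modes.get(name, "redact") != "off"}
```

Note how listing only pii drops secrets from the result, which is the silent-disable case the paragraph warns about.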

Secrets

Detects credentials, API keys, and tokens. Default mode: redact (set via GUARDRAILS_SECRETS_MODE on the Fiddler server).
Mode              Behavior in free-text messages                            Behavior in tool call arguments
redact (default)  Value replaced with [REDACTED <type>], request forwarded  Always blocked: cannot safely redact structured JSON
block             Request blocked entirely                                  Request blocked
off               Check skipped                                             Check skipped
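The mode table reduces to a short decision function. The sketch below illustrates the table only and is not Fiddler's implementation; resolve_action and its parameters are hypothetical names.

```python
VALID_MODES = ("redact", "block", "off")


def resolve_action(mode: str, in_tool_call: bool, detected: bool) -> str:
    """Map a check's mode and the detection location to a guardrail action."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "off" or not detected:
        # Check skipped, or nothing found.
        return "NONE"
    if mode == "block" or in_tool_call:
        # Tool call arguments are blocked even in redact mode.
        return "BLOCKED"
    # redact mode in free text: the value is replaced and the request forwarded.
    return "GUARDRAIL_INTERVENED"
```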

PII

Detects personally identifiable information. Default mode: redact (set via GUARDRAILS_PII_MODE on the Fiddler server).
Mode              Behavior in free-text messages                            Behavior in tool call arguments
redact (default)  Value replaced with [REDACTED <type>], request forwarded  Always blocked: cannot safely redact structured JSON
block             Request blocked entirely                                  Request blocked
off               Check skipped                                             Check skipped
Entities checked by default include: person, email, phone number, social security number, credit card number, bank account number, passport number, driver's license number, date of birth, address, ip address, iban, cvv, cvc, tax identification number, digital signature, license plate number, postal code, and more. For the complete list, see the PII Detection tutorial.
Custom entities and a custom threshold can be configured as proxy-wide defaults via additional_provider_specific_params in your LiteLLM config.yaml:
additional_provider_specific_params:
  pii:
    enabled: true
    config:
      threshold: 0.8        # detection confidence threshold (default: 0.8)
      entities:             # optional: override default entity list
        - person
        - email
        - phone number
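To build intuition for how threshold and entities interact, the sketch below filters a list of candidate detections. It rests on assumptions not stated in this page: that each detection carries a confidence score between 0 and 1, and that a score equal to the threshold counts as a match. The Detection class and filter_detections function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    entity: str   # e.g. "email"
    score: float  # detector confidence, assumed to lie in [0, 1]
    text: str     # the matched span


def filter_detections(detections, threshold=0.8, entities=None):
    """Keep detections at or above the threshold and, if given, in the entity list."""
    allowed = None if entities is None else set(entities)
    return [
        d for d in detections
        if d.score >= threshold and (allowed is None or d.entity in allowed)
    ]
```

Raising the threshold trades recall for fewer false positives; narrowing entities removes whole categories from consideration.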

Tool Call Handling

When an LLM’s tool_calls contain PII or secrets, Fiddler always blocks rather than redacts.
Why: Tool call arguments are structured JSON that the downstream application will parse and execute. Replacing a value such as an email address with [REDACTED EMAIL_ADDRESS] would cause send_email to attempt delivery to a nonsensical address, producing unpredictable behavior that is worse than blocking outright.
Example of a blocked tool call:
{
  "role": "assistant",
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "send_email",
      "arguments": "{\"to\": \"robin@example.com\", \"body\": \"Phone: 9916308047\"}"
    }
  }]
}
Fiddler returns:
{"action": "BLOCKED", "blocked_reason": "PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request."}
LiteLLM then translates this into an error for the client (HTTP 400 on LiteLLM ≥ 1.87.0, HTTP 500 on earlier versions).
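The reasoning behind blocking can be demonstrated by performing the redaction that Fiddler deliberately avoids. The sketch below is illustrative only; naive_redact_argument is a hypothetical helper, not part of Fiddler or LiteLLM.

```python
import json


def naive_redact_argument(tool_call: dict, key: str, placeholder: str) -> dict:
    """Return a copy of the tool call with one argument replaced by a placeholder."""
    # In the OpenAI format, function.arguments is itself a JSON-encoded string.
    args = json.loads(tool_call["function"]["arguments"])
    args[key] = placeholder
    function = dict(tool_call["function"], arguments=json.dumps(args))
    return dict(tool_call, function=function)
```

Applied to the send_email call above with key "to", the result is still valid JSON, so the downstream application would parse it without error and then try to deliver mail to the literal string [REDACTED EMAIL_ADDRESS].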

API Reference

Endpoint

POST /v3/guardrails/litellm/beta/litellm_basic_guardrail_api
Authentication: Authorization: Bearer <your-fiddler-api-key>

Request

{
  "texts": ["string"],
  "tool_calls": [{"id": "...", "type": "function", "function": {"name": "...", "arguments": "..."}}],
  "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Get current weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}}],
  "structured_messages": [{"role": "user", "content": "..."}],
  "images": ["base64-encoded-string"],
  "request_data": {"user_api_key_alias": "my-key", "user_api_key_team_id": "team-123"},
  "request_headers": {"content-type": "application/json"},
  "input_type": "request",
  "additional_provider_specific_params": {
    "pii": {"enabled": true, "config": {"threshold": 0.8}},
    "secrets": {"enabled": true, "config": {}}
  },
  "litellm_call_id": "uuid",
  "litellm_trace_id": "uuid",
  "litellm_version": "1.87.0"
}
Only texts and tool_calls[].function.arguments are scanned by guardrail checks. The remaining fields are accepted for LiteLLM protocol compatibility.
Field                                Type                    Required  Description
texts                                string[]                Yes       Extracted text strings from the request messages (scanned)
request_data                         object                  Yes       LiteLLM virtual key metadata (user, team, org, key hash)
input_type                           "request" | "response"  Yes       Whether this is a pre-call or post-call check
tool_calls                           object[] | null         No        Tool invocations in OpenAI format (arguments are scanned)
tools                                object[] | null         No        Tool definitions in OpenAI format (not scanned)
structured_messages                  object[] | null         No        Full messages array in OpenAI format (not scanned)
images                               string[] | null         No        Base64-encoded images (not scanned)
request_headers                      object | null           No        Inbound request headers (not scanned)
additional_provider_specific_params  object | null           No        Per-check configuration (see Quick Start)
litellm_call_id                      string | null           No        LiteLLM call ID for tracing
litellm_trace_id                     string | null           No        LiteLLM trace ID for tracing
litellm_version                      string | null           No        LiteLLM library version
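When testing the endpoint by hand, it can help to check a payload against the required fields before sending it. The sketch below encodes only the three required fields from the table; validate_guardrail_request is a hypothetical helper, and the real server may validate more strictly.

```python
from typing import Any

# Required fields and their expected Python types, taken from the table above.
REQUIRED_FIELDS = {"texts": list, "request_data": dict, "input_type": str}
INPUT_TYPES = ("request", "response")


def validate_guardrail_request(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the required fields look fine."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected_type):
            problems.append(f"{name} must be of type {expected_type.__name__}")
    input_type = payload.get("input_type")
    if isinstance(input_type, str) and input_type not in INPUT_TYPES:
        problems.append('input_type must be "request" or "response"')
    return problems
```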

Response

Fields with null values are omitted from the wire (exclude_none=True). The response shape varies by action:
// action: NONE
{"action": "NONE"}

// action: BLOCKED
{"action": "BLOCKED", "blocked_reason": "PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request."}

// action: GUARDRAIL_INTERVENED
{"action": "GUARDRAIL_INTERVENED", "texts": ["My key is [REDACTED ANTHROPIC_API_KEY]"]}
Field           Type      Description
action          string    One of NONE, BLOCKED, or GUARDRAIL_INTERVENED
blocked_reason  string    Human-readable reason; present only when action is BLOCKED
texts           string[]  Redacted texts; present only when action is GUARDRAIL_INTERVENED
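For a mock server or test fixture, the effect of omitting null fields can be reproduced in plain Python. This sketch imitates the documented wire shapes and is not Fiddler's serializer; build_response is a hypothetical helper.

```python
import json


def build_response(action, blocked_reason=None, texts=None):
    """Build a response body, dropping fields whose value is None."""
    body = {"action": action, "blocked_reason": blocked_reason, "texts": texts}
    # Same effect as exclude_none=True: null fields never reach the wire.
    return {key: value for key, value in body.items() if value is not None}


def to_wire(response: dict) -> str:
    """Serialize the response body as JSON."""
    return json.dumps(response)
```

A client should therefore test for the presence of blocked_reason and texts rather than expect them to be null.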

Observed Behavior

Minimum required version: LiteLLM ≥ 1.87.0. Earlier versions return HTTP 500 for all guardrail blocks due to a bug in GuardrailRaisedException (missing status_code attribute). The Fiddler team identified and fixed this upstream in BerriAI/litellm#27617. On 1.87.0+, blocked requests correctly return HTTP 400.
Scenario                       Action                HTTP status         Notes
Secret in user message         GUARDRAIL_INTERVENED  200                 Redacted in place; LLM receives sanitized text
PII in tool call arguments     BLOCKED               400 (500 pre-1.87)  Cannot redact structured JSON
Guardrail service unreachable  Pass-through          200                 LiteLLM fails open by default
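Client code or integration tests that need to account for the version-dependent status code could use a check like the one below. It is a sketch that assumes version strings begin with three dot-separated numbers (for example 1.87.0); expected_block_status is a hypothetical helper.

```python
import re


def expected_block_status(litellm_version: str) -> int:
    """Return the HTTP status expected for a blocked request on this LiteLLM version."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", litellm_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {litellm_version!r}")
    version = tuple(int(part) for part in match.groups())
    # Blocks return HTTP 400 from 1.87.0 onward and HTTP 500 before that.
    return 400 if version >= (1, 87, 0) else 500
```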