LiteLLM Guardrails

Overview

Fiddler implements the LiteLLM Generic Guardrail API spec, allowing you to plug Fiddler’s guardrails directly into a LiteLLM proxy gateway. Once configured, every LLM request routed through the proxy is checked by Fiddler before it reaches the model. Fiddler checks for:

Secrets — API keys, tokens, credentials, and connection strings in prompts
PII — Personally identifiable information (names, emails, phone numbers, SSNs, credit cards, etc.)

Each check can independently block or redact — see Check Behavior for details.

How It Works

LiteLLM calls the Fiddler endpoint with the extracted text from the request (or response). Fiddler returns one of three actions:

Action	Meaning	LiteLLM behavior
`NONE`	No issues detected	Forwards the request unchanged
`GUARDRAIL_INTERVENED`	Sensitive content redacted	Forwards the request with redacted `texts`
`BLOCKED`	Request must not proceed	Returns an error to the client
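The three actions in the table can be read as a small dispatch. The following is a minimal sketch of that logic, not LiteLLM's actual implementation: the function name `apply_guardrail_response` and the `GuardrailBlocked` exception are hypothetical names introduced for illustration.

```python
class GuardrailBlocked(Exception):
    """Hypothetical exception: the guardrail said the request must not proceed."""


def apply_guardrail_response(response: dict, original_texts: list) -> list:
    """Return the texts to forward, or raise if the request is blocked."""
    action = response.get("action")
    if action == "NONE":
        # No issues detected: forward the request unchanged.
        return original_texts
    if action == "GUARDRAIL_INTERVENED":
        # Sensitive content was redacted: forward the redacted texts instead.
        return response.get("texts", original_texts)
    if action == "BLOCKED":
        # The request must not proceed: surface the reason as an error.
        raise GuardrailBlocked(response.get("blocked_reason", "blocked by guardrail"))
    raise ValueError(f"unexpected action: {action!r}")
```

In this sketch an unknown action raises rather than passing the text through, which is the conservative choice; how LiteLLM itself treats an unrecognized action is not covered here.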

Supported Modes

Fiddler supports all three LiteLLM guardrail modes. Set mode in your proxy config to one or more of:

Mode	When it runs	What gets checked	Can modify input?
`pre_call`	Before the LLM call	PII + Secrets	Yes (redact in place)
`post_call`	After the LLM call	PII + Secrets	Yes (redact in place)
`during_call`	In parallel with the LLM call	PII + Secrets	No (accept or reject only)

during_call runs the guardrail concurrently with the LLM call — the response is held until the check completes, but latency is hidden behind the LLM round-trip. Use it when you want lower end-to-end latency and don’t need input redaction (block-only is sufficient).

You can combine modes. For example, mode: [pre_call, post_call] scans both input and output:

guardrails:
  - guardrail_name: "fiddler"
    litellm_params:
      guardrail: generic_guardrail_api
      mode: [pre_call, post_call]

Supported Endpoints

The guardrail runs on all LiteLLM proxy endpoints that carry text content, including /v1/chat/completions (OpenAI format) and /v1/messages (Anthropic format). LiteLLM extracts text from the request and forwards it to the Fiddler endpoint regardless of the upstream provider format.

What Gets Scanned

Check	`pre_call`	`post_call`	`during_call`	Free-text messages	Tool-call args
PII	✓	✓	✓	Redact	Block
Secrets	✓	✓	✓	Redact	Block

Free-text messages — user, assistant, and system messages. PII/secrets are redacted in place (e.g. [REDACTED EMAIL_ADDRESS]).
Tool-call arguments — structured JSON in tool_calls[].function.arguments. Detections are always blocked (cannot safely redact inside structured JSON — see Tool Call Handling).
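The redact-versus-block rule for the two content types can be summarized as a short decision function. This is an illustrative sketch only: the function name and the `"free_text"` / `"tool_call_args"` labels are hypothetical, and the `mode` argument refers to the per-check `mode` setting described under Check Behavior.

```python
def resolve_outcome(location: str, mode: str = "redact") -> str:
    """Return "redact" or "block" for a detection found at the given location."""
    if location not in ("free_text", "tool_call_args"):
        raise ValueError(f"unknown location: {location!r}")
    if mode not in ("redact", "block"):
        raise ValueError(f"unknown mode: {mode!r}")
    if location == "tool_call_args":
        # Structured JSON cannot be safely redacted, so detections always block.
        return "block"
    # Free-text messages follow the configured mode of the check.
    return mode
```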

Quick Start

Step 1: Configure LiteLLM

Add the Fiddler guardrail to your LiteLLM config.yaml:

guardrails:
  - guardrail_name: "fiddler"
    litellm_params:
      guardrail: generic_guardrail_api
      mode: [pre_call]
      api_base: https://<your-fiddler-instance>/v3/guardrails/litellm
      default_on: true
      # Block if the Fiddler endpoint is unreachable (network error / 5xx).
      # Covers transport failures. Default: fail_closed.
      unreachable_fallback: fail_closed
      headers:
        Authorization: "Bearer <your-fiddler-api-key>"
      additional_provider_specific_params:
        # Block if an internal check fails to complete (inference timeout/error).
        # Covers detector failures. Default when omitted: open.
        failure_mode: closed
        # Wall-clock timeout (seconds) for the guardrail check. How long the
        # endpoint blocks before returning. Default: 12. Max: 60.
        timeout: 12
        pii:
          enabled: true
          config:
            threshold: 0.8
            mode: redact          # "redact" (default) or "block"
        secrets:
          enabled: true
          config:
            mode: redact          # "redact" (default) or "block"

LiteLLM appends /beta/litellm_basic_guardrail_api to api_base automatically. The full endpoint called will be https://<your-fiddler-instance>/v3/guardrails/litellm/beta/litellm_basic_guardrail_api.
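To double-check which URL the proxy will call for a given api_base, the concatenation can be reproduced in a few lines. This is a sketch for sanity-checking configuration: the helper name is hypothetical, the hostname in the usage line is a placeholder, and stripping a trailing slash is an assumption rather than confirmed LiteLLM behavior.

```python
GUARDRAIL_PATH = "/beta/litellm_basic_guardrail_api"


def full_guardrail_url(api_base: str) -> str:
    """Join api_base and the suffix that LiteLLM appends."""
    # Assumption: a trailing slash on api_base is dropped to avoid "//".
    return api_base.rstrip("/") + GUARDRAIL_PATH


print(full_guardrail_url("https://fiddler.example.com/v3/guardrails/litellm"))
```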

Step 2: Start the proxy

litellm --config config.yaml --port 4000

Step 3: Verify

Send a request with a known secret:

curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-master-key>" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"My API key is sk-ant-api03-abcdefghijklmnopqrstu"}]}'

The secret will be redacted to [REDACTED ANTHROPIC_API_KEY] before the request reaches the model.

Per-Request Control

With default_on: true (the recommended config above), the guardrail runs on every request automatically. You can also control it per request.

Selective activation (`default_on: false`)

Set default_on: false in the proxy config, then activate the guardrail on individual requests by passing guardrails in the request body:

{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "..."}],
  "guardrails": ["fiddler"]
}

Requests without "guardrails": ["fiddler"] will bypass the guardrail entirely.
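On the client side, opting a request in amounts to adding one key to the request body. The helper below is a hypothetical convenience, not part of any SDK; it copies the body so the caller's dict is left untouched and avoids listing the same guardrail twice.

```python
def with_guardrails(body: dict, names: list) -> dict:
    """Return a copy of a chat-completions body with the given guardrails enabled."""
    merged = dict(body)
    active = list(merged.get("guardrails", []))
    for name in names:
        if name not in active:
            active.append(name)
    merged["guardrails"] = active
    return merged


request_body = with_guardrails(
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "..."}]},
    ["fiddler"],
)
```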

Check Behavior

Each check is configured entirely in the LiteLLM proxy’s additional_provider_specific_params:

enabled (true / false) — whether the check runs at all.
mode (redact / block) — what happens when a detection is found. Default is redact for PII and secrets.
threshold and entities — detection sensitivity.

If either pii or secrets is specified in additional_provider_specific_params, only the checks explicitly listed with enabled: true will run; a check whose key is omitted is silently disabled. To tune one check without disabling the other, list both keys with an explicit enabled: true or enabled: false.
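The listing rule is easy to get wrong, so here is a minimal sketch of how it could be evaluated. The function name is hypothetical, and the behavior when neither key is present (both checks run with defaults) is an assumption inferred from the paragraph above, not something the endpoint is confirmed to do.

```python
from typing import Optional

KNOWN_CHECKS = ("pii", "secrets")


def checks_that_run(params: Optional[dict]) -> list:
    """Return the names of the checks that would run for the given params."""
    params = params or {}
    listed = [name for name in KNOWN_CHECKS if name in params]
    if not listed:
        # Assumption: with neither key present, both checks run with defaults.
        return list(KNOWN_CHECKS)
    # Once any check is listed, only those with enabled: true run.
    return [name for name in listed if (params[name] or {}).get("enabled") is True]
```

For example, passing only a pii block with enabled: true yields `["pii"]`: the secrets check is dropped even though it was never explicitly turned off.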

Secrets

Detects credentials, API keys, and tokens.

Mode	Behavior in free-text messages	Behavior in tool call arguments
`redact` (default)	Value replaced with `[REDACTED <type>]`, request forwarded	Always blocked — cannot safely redact structured JSON
`block`	Request blocked entirely	Request blocked

secrets:
  enabled: true
  config:
    mode: redact            # "redact" (default) or "block"

PII

Detects personally identifiable information.

Mode	Behavior in free-text messages	Behavior in tool call arguments
`redact` (default)	Value replaced with `[REDACTED <type>]`, request forwarded	Always blocked — cannot safely redact structured JSON
`block`	Request blocked entirely	Request blocked

Entities checked by default include: person, email, phone number, social security number, credit card number, bank account number, passport number, driver's license number, date of birth, address, ip address, iban, cvv, cvc, tax identification number, digital signature, license plate number, postal code, and more. For the complete list see the PII Detection tutorial.

pii:
  enabled: true
  config:
    mode: redact            # "redact" (default) or "block"
    threshold: 0.8          # detection confidence threshold (default: 0.8)
    entities:               # optional: override default entity list
      - person
      - email
      - phone number

Tool Call Handling

When an LLM’s tool_calls contain PII or secrets, Fiddler always blocks rather than redacts.

Why: Tool call arguments are structured JSON that the downstream application will parse and execute. Replacing a value like an email address with [REDACTED EMAIL_ADDRESS] would cause send_email to attempt delivery to a nonsensical address, producing unpredictable behavior that is worse than blocking outright.

Example of a blocked tool call:

{
  "role": "assistant",
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "send_email",
      "arguments": "{\"to\": \"jane.doe@example.com\", \"body\": \"Phone: 555-867-5309\"}"
    }
  }]
}

Fiddler returns:

{"action": "BLOCKED", "blocked_reason": "PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request."}

LiteLLM then translates this into an error for the client (HTTP 400 on LiteLLM ≥ 1.88.0, HTTP 500 on earlier versions).

Tool Results

Fiddler does not claim redaction coverage for tool results.

Whether tool result content reaches the scanner depends on the LiteLLM gateway: LiteLLM must include the tool result in the texts[] it forwards to Fiddler. This is not guaranteed for all gateway configurations or versions, and is known not to work on customer-managed or self-hosted gateways that do not extract role:tool / tool_result messages into texts[].

Additionally, tool results are typed as any in the GenAI semantic conventions: they can be plain strings, JSON objects, or arrays. Even when the text does reach the scanner, in-place character-span redaction on a serialized JSON payload is not safe if the downstream application re-parses the result as structured data.

For reliable protection against secrets in tool output, use mode: block on secrets and treat tool result content as untrusted.

Failure Mode

Two settings control what happens when the guardrail cannot complete a check — one at the LiteLLM proxy level, one at the Fiddler endpoint level. Both should be set for end-to-end fail-closed.

`unreachable_fallback` (LiteLLM proxy)

Controls what happens when the Fiddler endpoint is unreachable (network error, HTTP 502/503/504).

litellm_params:
  unreachable_fallback: fail_closed  # default: fail_closed

Value	Behavior
`fail_closed` (default)	Block the request — the LLM never sees it
`fail_open`	Allow the request through without scanning

This is a standard LiteLLM Generic Guardrail API setting.

`failure_mode` (Fiddler endpoint)

Controls what happens when an internal check fails to complete — an inference timeout, an inference server error, or an unexpected exception in the detection pipeline. The LiteLLM proxy cannot see these failures because the Fiddler endpoint still returns HTTP 200.

additional_provider_specific_params:
  failure_mode: closed  # default when omitted: open

Value	Behavior
`open` (default)	Allow the request through — the failed check is logged but does not block
`closed`	Block the request — unscanned content never reaches the LLM

The default is open for backward compatibility. For security-sensitive deployments, set failure_mode: closed. When both unreachable_fallback: fail_closed and failure_mode: closed are set, no request can bypass scanning — whether the failure is at the transport or detector level.
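The interaction of the two settings can be sketched as a single predicate. This is an illustration of the rules described in this section, not code from LiteLLM or Fiddler; the function name and the `"transport"` / `"detector"` labels are hypothetical.

```python
def request_proceeds(
    failure: str,
    unreachable_fallback: str = "fail_closed",
    failure_mode: str = "open",
) -> bool:
    """Return True if an unscanned request would still reach the LLM."""
    if failure == "transport":
        # Fiddler endpoint unreachable: decided by the LiteLLM proxy setting.
        return unreachable_fallback == "fail_open"
    if failure == "detector":
        # Internal check failed behind an HTTP 200: decided by Fiddler's setting.
        return failure_mode == "open"
    raise ValueError(f"unknown failure kind: {failure!r}")
```

With the defaults in this sketch, a transport failure blocks but a detector failure lets the request through, which is why both settings need to be set for end-to-end fail-closed behavior.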

Limits & Timeouts

Text Length

The maximum total text length scanned per request is controlled by the GUARDRAILS_MAX_TEXT_LENGTH environment variable on the Fiddler server (default: 50,000 characters). Messages are concatenated until this cap is reached; text beyond the cap is handled as follows:

Content type	Fail-open (`failure_mode: open`)	Fail-closed (`failure_mode: closed`)
Free-text messages	Skipped (unscanned, request proceeds)	Blocked
Tool-call arguments	Always blocked (regardless of failure mode)	Blocked

Timeouts

The guardrail check has a single customer-facing timeout: the wall-clock deadline for the whole check pipeline — how long the endpoint blocks before returning. Set it per request via timeout (seconds) in additional_provider_specific_params:

additional_provider_specific_params:
  timeout: 12               # default: 12, max: 60

What you set is what you get — there is no hidden padding. Omitted or invalid values fall back to the default (12s); values above the maximum are clamped to 60s. When the deadline is reached before the checks complete, the outcome is governed by failure_mode — under open the request proceeds unscanned, under closed it is blocked.
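The fallback and clamping rules can be written out as follows. This is a sketch under stated assumptions: the function and constant names are hypothetical, and treating non-numeric, NaN, and non-positive values as "invalid" is a guess at what the server considers invalid.

```python
DEFAULT_TIMEOUT_S = 12.0
MAX_TIMEOUT_S = 60.0


def resolve_timeout(value) -> float:
    """Return the effective wall-clock timeout in seconds."""
    # Assumption: anything that is not a real number counts as invalid.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return DEFAULT_TIMEOUT_S
    # Assumption: NaN and non-positive values also count as invalid.
    if value != value or value <= 0:
        return DEFAULT_TIMEOUT_S
    # Values above the maximum are clamped rather than rejected.
    return min(float(value), MAX_TIMEOUT_S)
```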

The internal inference-call timeouts (GUARDRAILS_GATEWAY_READ_TIMEOUT, default 10s, and GUARDRAILS_GATEWAY_CONN_TIMEOUT, default 3s) are server-side plumbing for the connection to the detection models. They are lower than the wall-clock timeout and are not part of the customer-facing config.

Observed Behavior

Minimum required version: LiteLLM ≥ 1.88.0. Earlier versions return HTTP 500 for all guardrail blocks due to a bug in GuardrailRaisedException (missing status_code attribute). The Fiddler team identified and fixed this upstream in BerriAI/litellm#27617. The fix shipped in LiteLLM 1.88.0. On 1.88.0+, blocked requests correctly return HTTP 400.

Scenario	Action	HTTP status	Notes
Secret in user message	`GUARDRAIL_INTERVENED`	200	Redacted in-place; LLM receives sanitized text
PII in user message	`GUARDRAIL_INTERVENED`	200	Redacted in-place
PII in tool call arguments	`BLOCKED`	400 (500 pre-1.88.0)	Cannot redact structured JSON
Detector timeout (`failure_mode: open`)	`NONE`	200	Request proceeds unscanned
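Detector timeout (`failure_mode: closed`)	`BLOCKED`	400 (500 pre-1.88.0)	Request blocked; unscanned content never reaches LLM. Fiddler itself responds with HTTP 200 and `BLOCKED`; the status shown is what the client receives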
Guardrail service unreachable (`unreachable_fallback: fail_closed`)	(LiteLLM-side block; Fiddler never reached)	400	LiteLLM rejects the request without forwarding it to the LLM

Client-Facing Error Body

When a request is blocked, LiteLLM returns an error to the client. Your application should handle this shape:

{
  "error": {
    "message": "Guardrail fiddler rejected the request. PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request.",
    "type": "None",
    "param": "None",
    "code": "400"
  }
}

The message field contains the blocked_reason from Fiddler’s response, prefixed by LiteLLM with the guardrail name. HTTP status is 400 on LiteLLM ≥ 1.88.0.
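A client can recognize this shape with a small parser. The sketch below is an assumption-laden example: the function name is hypothetical, and matching on the `"Guardrail "` prefix is inferred from the sample body above rather than from a documented contract, so treat it as a heuristic.

```python
import json
from typing import Optional


def parse_guardrail_error(status_code: int, body: str) -> Optional[str]:
    """Return the error message if the response looks like a guardrail block."""
    # 400 on LiteLLM >= 1.88.0, 500 on earlier versions.
    if status_code not in (400, 500):
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    message = error.get("message")
    # Heuristic (assumption): LiteLLM prefixes the reason with "Guardrail <name> ...".
    if isinstance(message, str) and message.startswith("Guardrail "):
        return message
    return None
```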

API Reference

Endpoint

POST /v3/guardrails/litellm/beta/litellm_basic_guardrail_api

Authentication: Authorization: Bearer <your-fiddler-api-key>

Request

{
  "texts": ["string"],
  "tool_calls": [{"id": "...", "type": "function", "function": {"name": "...", "arguments": "..."}}],
  "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Get current weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}}],
  "structured_messages": [{"role": "user", "content": "..."}],
  "images": ["base64-encoded-string"],
  "request_data": {"user_api_key_alias": "my-key", "user_api_key_team_id": "team-123"},
  "request_headers": {"content-type": "application/json"},
  "input_type": "request",
  "additional_provider_specific_params": {
    "failure_mode": "closed",
    "timeout": 12,
    "pii": {"enabled": true, "config": {"threshold": 0.8, "mode": "redact"}},
    "secrets": {"enabled": true, "config": {"mode": "redact"}}
  },
  "litellm_call_id": "uuid",
  "litellm_trace_id": "uuid",
  "litellm_version": "1.88.0"
}

Only texts and tool_calls[].function.arguments are scanned by guardrail checks. The remaining fields are accepted for LiteLLM protocol compatibility.

Field	Type	Required	Description
`texts`	`string[]`	Yes	Extracted text strings from the request messages (scanned)
`request_data`	`object`	Yes	LiteLLM virtual key metadata (user, team, org, key hash)
`input_type`	`"request"` \| `"response"`	Yes	Whether this is a pre-call or post-call check
`tool_calls`	`object[]` \| `null`	No	Tool invocations in OpenAI format (arguments are scanned)
`tools`	`object[]` \| `null`	No	Tool definitions in OpenAI format (not scanned)
`structured_messages`	`object[]` \| `null`	No	Full messages array in OpenAI format (not scanned)
`images`	`string[]` \| `null`	No	Base64-encoded images (not scanned)
`request_headers`	`object` \| `null`	No	Inbound request headers (not scanned)
`additional_provider_specific_params`	`object` \| `null`	No	Per-check configuration plus `failure_mode` and `timeout` (see Quick Start, Failure Mode, and Timeouts)
`litellm_call_id`	`string` \| `null`	No	LiteLLM call ID for tracing
`litellm_trace_id`	`string` \| `null`	No	LiteLLM trace ID for tracing
`litellm_version`	`string` \| `null`	No	LiteLLM library version
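For testing the endpoint directly, a payload containing only the three required fields can be assembled as shown below. This is a sketch: the builder name is hypothetical, and whether the endpoint accepts an empty request_data object is an assumption that should be verified against a real instance.

```python
from typing import Optional


def build_guardrail_request(
    texts: list,
    input_type: str = "request",
    request_data: Optional[dict] = None,
    params: Optional[dict] = None,
) -> dict:
    """Build a minimal request body using the required fields from the table."""
    if input_type not in ("request", "response"):
        raise ValueError(f"input_type must be 'request' or 'response', got {input_type!r}")
    payload = {
        "texts": list(texts),
        # Assumption: an empty object is acceptable when no key metadata is available.
        "request_data": dict(request_data or {}),
        "input_type": input_type,
    }
    if params is not None:
        # Optional per-check configuration plus failure_mode and timeout.
        payload["additional_provider_specific_params"] = params
    return payload
```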

Response

Fields with null values are omitted from the wire (exclude_none=True). The response shape varies by action:

// action: NONE
{"action": "NONE"}

// action: BLOCKED
{"action": "BLOCKED", "blocked_reason": "PII or secrets detected in tool call arguments. Cannot redact tool call arguments — blocking request."}

// action: GUARDRAIL_INTERVENED
{"action": "GUARDRAIL_INTERVENED", "texts": ["My key is [REDACTED ANTHROPIC_API_KEY]"]}

Field	Type	Description
`action`	`string`	One of `NONE`, `BLOCKED`, or `GUARDRAIL_INTERVENED`
`blocked_reason`	`string`	Human-readable reason; present only when `action` is `BLOCKED`
`texts`	`string[]`	Redacted texts; present only when `action` is `GUARDRAIL_INTERVENED`
