Overview
AgentGateway (v1.1.0+, Apache 2.0) is an open-source Rust proxy that sits between your application and its LLM provider. Fiddler integrates with AgentGateway at the proxy layer, giving you full LLM observability — prompts, responses, token usage, latency — without adding any SDK to your application code.
| Capability | Notes |
|---|---|
| Zero instrumentation | Point your app at AgentGateway instead of api.openai.com — no code changes needed |
| LLM span tracing | Token counts, model name, latency, and content (with CEL config) |
| Session grouping | All LLM calls sharing an X-Fiddler-Conversation-Id header are grouped into one Fiddler Session |
| Direct OTLP export | AgentGateway exports traces to Fiddler over HTTPS with auth headers injected via requestHeaderModifier |
Architecture
AgentGateway exposes an OpenAI-compatible API (/v1/chat/completions). Your application requires no SDK — it just calls the proxy instead of the provider directly.
Prerequisites
- Fiddler account with a GenAI application already created
- AgentGateway v1.1.0 or later
- A valid LLM provider API key (e.g. OPENAI_API_KEY)
- Your Fiddler API key (found under organizational settings) and application UUID (found under application settings)
Quick Start
Step 1 — Install AgentGateway
# macOS / Linux (Homebrew)
brew install agentgateway/tap/agentgateway
# Or download from https://github.com/agentgateway/agentgateway/releases
agentgateway --version # must be 1.1.0+
Step 2 — Create the configuration file
Create agentgateway_config.yaml:
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
frontendPolicies:
  tracing:
    # Fiddler instance host and port (e.g., your-instance.fiddler.ai:443).
    # $FIDDLER_HOST is expanded from the environment.
    host: "$FIDDLER_HOST"
    protocol: http
    randomSampling: true
    policies:
      # Inject auth headers on every OTLP export request to Fiddler.
      # $FIDDLER_API_KEY and $FIDDLER_APP_ID are expanded from the environment.
      requestHeaderModifier:
        add:
          Authorization: "Bearer $FIDDLER_API_KEY"
          fiddler-application-id: "$FIDDLER_APP_ID"
      # Enable TLS for the HTTPS connection to Fiddler.
      backendTLS: {}
    resources:
      service.name: '"agentgateway"'
      application.id: '"$FIDDLER_APP_ID"'
    attributes:
      gen_ai.llm.input.user: |
        llm.prompt.filter(m, m.role == "user").map(m, m.content).join("\n")
      gen_ai.llm.input.system: |
        llm.prompt.filter(m, m.role == "system").map(m, m.content).join("\n")
      gen_ai.llm.output: |
        llm.completion.join("\n")
      gen_ai.input.messages: toJson(llm.prompt)
      gen_ai.output.messages: toJson(llm.completion)
      gen_ai.system: llm.provider
      gen_ai.usage.total_tokens: llm.totalTokens
      span.name: |
        "chat " + llm.requestModel
      fiddler.span.type: '"llm"'
      gen_ai.tool.definitions: 'toJson(json(request.body).tools)'
      gen_ai.conversation.id: |
        request.headers["x-fiddler-conversation-id"] != "" ? request.headers["x-fiddler-conversation-id"] : ""
binds:
- port: 4000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-4o-mini
      policies:
        backendAuth:
          passthrough: {}
The frontendPolicies.tracing block captures prompt and response content via CEL expressions and exports spans directly to Fiddler over HTTPS. $FIDDLER_API_KEY and $FIDDLER_APP_ID are expanded from environment variables at runtime — no credentials are hardcoded in the config file.
Step 3 — Start AgentGateway
export OPENAI_API_KEY="sk-..."
export FIDDLER_API_KEY="your-fiddler-api-key"
export FIDDLER_APP_ID="your-application-uuid"
export FIDDLER_HOST="your-instance.fiddler.ai:443"
agentgateway -f agentgateway_config.yaml
Step 4 — Point your application at AgentGateway
import os
import uuid
from openai import OpenAI
# OPENAI_API_KEY is read from the environment automatically.
# AgentGateway's backendAuth passthrough forwards it unchanged to OpenAI,
# so the key must be set in both the AgentGateway environment and here.
client = OpenAI(base_url=os.getenv("AGENTGATEWAY_URL", "http://localhost:4000/v1"))
# Generate one conversation UUID per session and send it on every LLM call
# so Fiddler groups them into a single Session.
session_id = str(uuid.uuid4())
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    extra_headers={"X-Fiddler-Conversation-Id": session_id},
)
print(response.choices[0].message.content)
Traces appear in Fiddler automatically — no SDK import, no callback registration, no changes to your application logic.
To override the default proxy URL, set AGENTGATEWAY_URL (defaults to http://localhost:4000/v1):
export AGENTGATEWAY_URL="http://my-gateway-host:4000/v1"
Step 5 — Verify traces are arriving
Open the Fiddler UI and navigate to your application’s Trace Explorer. You should see the trace within a few seconds of making your first completion call.
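If you want to generate a first trace without installing the OpenAI SDK, a standard-library request is enough. The following is a minimal sketch, assuming the gateway listens on the default http://localhost:4000/v1 from Step 4; the helper name build_smoke_request and the "ping" prompt are illustrative and not part of AgentGateway or Fiddler.

```python
import json
import urllib.request


def build_smoke_request(base_url, api_key, session_id, model="gpt-4o-mini"):
    """Build (but do not send) a chat completion request aimed at the proxy.

    Hypothetical helper for illustration only.
    """
    body = {"model": model, "messages": [{"role": "user", "content": "ping"}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Forwarded unchanged to the provider by backendAuth passthrough.
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
            # Same header the Step 4 example sends for Session grouping.
            "X-Fiddler-Conversation-Id": session_id,
        },
        method="POST",
    )
```

Sending it with urllib.request.urlopen(req, timeout=30) and getting a successful response suggests the proxy path works; the corresponding trace should then show up in Trace Explorer, subject to the sampling setting.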
Span Type Mapping
Fiddler classifies AgentGateway spans based on gen_ai.operation.name, which AgentGateway sets automatically on every LLM proxy call:
| gen_ai.operation.name | Fiddler span type |
|---|---|
| chat | llm |
| completion | llm |
| anything else | (skipped) |
Fiddler’s AgentGateway mapper checks this attribute and sets fiddler.span.type = "llm" internally — no CEL config is required for classification. The fiddler.span.type: '"llm"' line in the CEL config above is a safety net for Fiddler deployments that process spans without the dedicated mapper.
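The classification rule can be read as a small lookup. The sketch below only restates the mapping described in this section; it is not Fiddler's mapper code, and the function name classify_span is an assumption made for illustration.

```python
from typing import Mapping, Optional

# Operation names that this section maps to the "llm" span type.
LLM_OPERATIONS = {"chat", "completion"}


def classify_span(attributes: Mapping[str, object]) -> Optional[str]:
    """Return "llm" for LLM proxy calls, or None when the span is skipped."""
    operation = attributes.get("gen_ai.operation.name")
    if operation in LLM_OPERATIONS:
        return "llm"
    # Any other value, or a missing attribute, means the span is skipped.
    return None
```

For example, classify_span({"gen_ai.operation.name": "chat"}) returns "llm", while a span without the attribute returns None.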
Attribute Mapping
AgentGateway uses slightly different attribute names from the OpenTelemetry GenAI semantic conventions. Fiddler’s mapper normalises these automatically:
| AgentGateway attribute | Fiddler treatment |
|---|---|
| gen_ai.provider.name | Copied to gen_ai.system when absent (CEL sets it directly) |
| gen_ai.usage.input_tokens | Mapped to fiddler.span.system.gen_ai.usage.input_tokens |
| gen_ai.usage.output_tokens | Mapped to fiddler.span.system.gen_ai.usage.output_tokens |
| gen_ai.usage.total_tokens | Computed as input + output when absent |
| gen_ai.agent.name | Defaulted to <UNKNOWN_AGENT> when absent |
| gen_ai.conversation.id | Defaulted to trace_id.hex() when absent. Set by CEL from the X-Fiddler-Conversation-Id header to enable Session grouping |
| gateway, listener, route, endpoint, http.method, http.path, http.status, duration, src.addr | Passed through unchanged (AgentGateway routing metadata) |
Content attributes (gen_ai.llm.input.user, gen_ai.llm.input.system, gen_ai.llm.output) are set by the CEL config in AgentGateway — no JSON parsing is required by the mapper.
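To make the defaulting rules concrete, here is a rough sketch of the normalisation applied to a plain dictionary of span attributes. It illustrates the rules listed in this section and is not Fiddler's implementation; the function name normalise_attributes is hypothetical, and keeping the original token keys alongside the mapped ones is an assumption.

```python
from typing import Any, Dict

UNKNOWN_AGENT = "<UNKNOWN_AGENT>"
TOKEN_KEYS = ("gen_ai.usage.input_tokens", "gen_ai.usage.output_tokens")


def normalise_attributes(attrs: Dict[str, Any], trace_id: bytes) -> Dict[str, Any]:
    """Apply the documented defaults to a copy of the span attributes."""
    out = dict(attrs)

    # gen_ai.provider.name is copied to gen_ai.system only when absent.
    if "gen_ai.system" not in out and "gen_ai.provider.name" in out:
        out["gen_ai.system"] = out["gen_ai.provider.name"]

    # Token counts are mapped under the fiddler.span.system. prefix.
    for key in TOKEN_KEYS:
        if key in out:
            out["fiddler.span.system." + key] = out[key]

    # total_tokens is computed as input + output when absent.
    if "gen_ai.usage.total_tokens" not in out and all(k in out for k in TOKEN_KEYS):
        out["gen_ai.usage.total_tokens"] = sum(out[k] for k in TOKEN_KEYS)

    # Remaining defaults; routing metadata passes through untouched.
    out.setdefault("gen_ai.agent.name", UNKNOWN_AGENT)
    out.setdefault("gen_ai.conversation.id", trace_id.hex())
    return out
```

Note the last default: a call made without the X-Fiddler-Conversation-Id header ends up with its own trace ID as the conversation ID, so it is not grouped with other calls.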
Session Grouping
Fiddler groups all LLM calls that share the same gen_ai.conversation.id into a single Session. The recommended pattern is to generate one UUID per logical conversation in your application and pass it on every LLM call as the X-Fiddler-Conversation-Id HTTP header. The CEL expression in the AgentGateway config (see Step 2) extracts the header and stamps it as the span attribute.
The header transport is preferred over OpenAI’s metadata request body field because:
- OpenAI’s metadata parameter requires store=true, which persists conversation data on OpenAI’s side, a privacy concern for many customers.
- AgentGateway is a passthrough proxy: anything in the request body must be a valid OpenAI parameter or the request fails.
- Headers are visible to AgentGateway and silently stripped by OpenAI.
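One way to apply the recommended pattern is to keep the conversation UUID next to the message history, so every call in a conversation carries the same header. This is a minimal sketch, assuming the OpenAI Python client from Step 4; the Conversation class and its method names are hypothetical.

```python
import uuid
from typing import Dict, List, Optional

CONVERSATION_HEADER = "X-Fiddler-Conversation-Id"


class Conversation:
    """Holds one session UUID and the running message history."""

    def __init__(self, model: str, session_id: Optional[str] = None) -> None:
        self.model = model
        # One UUID per logical conversation, reused on every call.
        self.session_id = session_id or str(uuid.uuid4())
        self.messages: List[Dict[str, str]] = []

    def request_kwargs(self, user_text: str) -> Dict[str, object]:
        """Append a user turn and return kwargs for chat.completions.create."""
        self.messages.append({"role": "user", "content": user_text})
        return {
            "model": self.model,
            "messages": list(self.messages),
            "extra_headers": {CONVERSATION_HEADER: self.session_id},
        }

    def record_reply(self, assistant_text: str) -> None:
        """Store the assistant turn so the next call includes it."""
        self.messages.append({"role": "assistant", "content": assistant_text})
```

A call then looks like client.chat.completions.create(**conv.request_kwargs("...")), followed by conv.record_reply(response.choices[0].message.content). Starting a new Conversation yields a new UUID and therefore a new Fiddler Session.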
Troubleshooting
Traces not appearing in Fiddler
Verify all four environment variables are set before starting AgentGateway:
echo $FIDDLER_API_KEY
echo $FIDDLER_APP_ID
echo $FIDDLER_HOST
echo $OPENAI_API_KEY
Both application.id (OTel resource attribute) and fiddler-application-id (HTTP header on the export request) are required. If either is missing or does not match a valid Fiddler application UUID, spans are silently dropped.
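The same check can be scripted so that secret values are never printed to the terminal. The sketch below is a suggestion rather than part of either product; the helper names missing_env_vars and looks_like_uuid are hypothetical, and the UUID check only confirms the format, not that the application exists in Fiddler.

```python
import os
import uuid
from typing import List, Mapping

# Variables exported in Step 3.
REQUIRED_VARS = ("FIDDLER_API_KEY", "FIDDLER_APP_ID", "FIDDLER_HOST", "OPENAI_API_KEY")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def looks_like_uuid(value: str) -> bool:
    """Return True when the value parses as a UUID string."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True
```

Running missing_env_vars() in the shell that starts AgentGateway should return an empty list, and looks_like_uuid(os.environ["FIDDLER_APP_ID"]) should be True before you start the gateway.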
Prompt and response content not showing
The frontendPolicies.tracing.attributes CEL block is required. Verify that it is present in agentgateway_config.yaml and that AgentGateway is v1.1.0 or later.
Span type showing as Unknown
Fiddler classifies spans using gen_ai.operation.name (set automatically by AgentGateway). Verify that AgentGateway is v1.1.0+ and that the frontendPolicies.tracing.attributes block is present in your config. As a fallback, ensure fiddler.span.type: '"llm"' is included in the attributes block — this covers Fiddler deployments that process spans without the dedicated AgentGateway mapper.
Not all LLM calls are producing traces
randomSampling: true is active. Set it to false to capture every span:
frontendPolicies:
  tracing:
    randomSampling: false
Known Limitations
| Limitation | Details |
|---|---|
| Content requires CEL config | Prompt and response text are only captured when frontendPolicies.tracing.attributes is configured in AgentGateway |
| LLM-only scope | This integration captures LLM proxy traffic only. MCP tool calls and A2A agent-to-agent calls proxied by AgentGateway are not currently forwarded to Fiddler. |
| Sampling | randomSampling: true means not every LLM call produces a trace; set to false for complete capture |