LangGraph SDK Advanced

What You'll Learn

This interactive notebook demonstrates advanced monitoring patterns for production LangGraph applications through a realistic travel planning system with multiple specialized agents.

Key Topics Covered:

  • Multi-agent workflow monitoring and orchestration

  • Conversation tracking across complex interactions

  • Production configuration for high-volume scenarios

  • Advanced error handling and recovery patterns

  • Business intelligence integration and analytics

Interactive Tutorial

The notebook walks through building a comprehensive travel planning application featuring hotel search, weather analysis, itinerary planning, and supervisor agents working together.

Open the Advanced Observability Notebook in Google Colab →

Or download the notebook directly from GitHub →

Production Configuration Best Practices

Before deploying LangGraph applications to production, configure the SDK for your specific workload characteristics.

High-Volume Applications

Optimize for applications processing thousands of traces per minute:

import os
from opentelemetry.sdk.trace import SpanLimits, sampling
from opentelemetry.exporter.otlp.proto.http.trace_exporter import Compression
from fiddler_langgraph import FiddlerClient

# Configure batch processing BEFORE initializing FiddlerClient
os.environ['OTEL_BSP_MAX_QUEUE_SIZE'] = '500'        # Increased from default 100
os.environ['OTEL_BSP_SCHEDULE_DELAY'] = '500'        # Faster export than default 1000ms (value in ms)
os.environ['OTEL_BSP_MAX_EXPORT_BATCH_SIZE'] = '50'  # Larger batches than default 10
os.environ['OTEL_BSP_EXPORT_TIMEOUT'] = '10000'      # Longer timeout than default 5000ms

# Increase span limits to capture more data
production_limits = SpanLimits(
    max_events=128,                   # Default: 32
    max_links=64,                     # Default: 32
    max_span_attributes=128,          # Default: 32
    max_event_attributes=64,          # Default: 32
    max_link_attributes=32,           # Default: 32
    max_span_attribute_length=8192,   # Default: 2048
)

# Sample 5-10% of traces to manage data volume
production_sampler = sampling.TraceIdRatioBased(0.05)

client = FiddlerClient(
    api_key=os.getenv("FIDDLER_ACCESS_TOKEN"),
    application_id=os.getenv("FIDDLER_APPLICATION_ID"),
    url=os.getenv("FIDDLER_URL"),
    console_tracer=False,
    span_limits=production_limits,
    sampler=production_sampler,
    compression=Compression.Gzip,
)

Low-Latency Requirements

Optimize for applications requiring sub-second trace export:

# Reduce batch delay for faster exports
os.environ['OTEL_BSP_SCHEDULE_DELAY'] = '100'        # Export every 100ms
os.environ['OTEL_BSP_MAX_EXPORT_BATCH_SIZE'] = '5'    # Smaller batches

client = FiddlerClient(
    api_key=os.getenv("FIDDLER_ACCESS_TOKEN"),
    application_id=os.getenv("FIDDLER_APPLICATION_ID"),
    url=os.getenv("FIDDLER_URL"),
    compression=Compression.Gzip,  # Still use compression
)

Memory-Constrained Environments

Configure conservative limits for edge deployments or containerized environments:

memory_constrained_limits = SpanLimits(
    max_events=16,                  # Minimal event capture
    max_links=16,                   # Minimal linking
    max_span_attributes=32,         # Reduced attributes
    max_event_attributes=16,        # Reduced event attributes
    max_link_attributes=16,         # Reduced link attributes
    max_span_attribute_length=1024, # Shorter attribute values
)

os.environ['OTEL_BSP_MAX_QUEUE_SIZE'] = '50'        # Smaller queue
os.environ['OTEL_BSP_MAX_EXPORT_BATCH_SIZE'] = '5'  # Smaller batches

client = FiddlerClient(
    api_key=os.getenv("FIDDLER_ACCESS_TOKEN"),
    application_id=os.getenv("FIDDLER_APPLICATION_ID"),
    url=os.getenv("FIDDLER_URL"),
    span_limits=memory_constrained_limits,
    sampler=sampling.TraceIdRatioBased(0.1),  # Sample 10%
    compression=Compression.Gzip,
)

Development vs Production Configurations

Development Configuration:

# Capture everything with verbose debugging
dev_client = FiddlerClient(
    api_key=os.getenv("FIDDLER_API_KEY"),
    application_id=os.getenv("FIDDLER_APPLICATION_ID"),
    url=os.getenv("FIDDLER_URL"),
    console_tracer=True,    # Print spans to console
    sampler=None,           # Capture 100% of traces
)

Production Configuration:

# Optimized for performance and cost
prod_client = FiddlerClient(
    api_key=os.getenv("FIDDLER_API_KEY"),
    application_id=os.getenv("FIDDLER_APPLICATION_ID"),
    url=os.getenv("FIDDLER_URL"),
    console_tracer=False,                      # No console output
    sampler=sampling.TraceIdRatioBased(0.05),  # Sample 5%
    compression=Compression.Gzip,              # Reduce bandwidth
    span_limits=production_limits,             # Controlled limits
)

Best Practices for Context and Conversation IDs

Structure your identifiers for maximum analytical value:

from fiddler_langgraph.tracing.instrumentation import set_llm_context, set_conversation_id
import uuid

# Set meaningful, searchable context labels
set_llm_context(model, 'Customer Support - Tier 1 - Billing Inquiries')
set_llm_context(model, 'Content Generation - Marketing Copy - Blog Posts')
set_llm_context(model, 'Data Analysis - Financial Reports - Q4 2025')

# Use structured conversation IDs with metadata
user_id = 'user-12345'
session_type = 'support'
timestamp = '2025-10-17'
conversation_id = f'{user_id}_{session_type}_{timestamp}_{uuid.uuid4()}'
set_conversation_id(conversation_id)

# Example: user-12345_support_2025-10-17_550e8400-e29b-41d4-a716-446655440000
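
Because the ID is assembled with a fixed '_' delimiter, downstream analytics can split it back into its parts. The helper below is an illustrative sketch, not part of the SDK:

def parse_conversation_id(conversation_id: str) -> dict:
    # Split on '_' from the right: the trailing UUID and the date contain
    # only '-', so the three rightmost '_' separators are unambiguous even
    # if a user ID itself contains underscores.
    user_id, session_type, timestamp, unique_suffix = conversation_id.rsplit('_', 3)
    return {
        'user_id': user_id,
        'session_type': session_type,
        'timestamp': timestamp,
        'uuid': unique_suffix,
    }

parse_conversation_id(conversation_id)
# {'user_id': 'user-12345', 'session_type': 'support', 'timestamp': '2025-10-17', ...}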

Prerequisites

  • Fiddler account with API credentials

  • OpenAI API key for example interactions

  • Basic familiarity with LangGraph concepts

Time Required

  • Complete tutorial: 45-60 minutes

  • Quick overview: 15-20 minutes

Telemetry Data Reference

Understanding the data captured by the Fiddler LangGraph SDK.

Span Attributes

The SDK automatically captures these OpenTelemetry attributes:

Attribute                  Type    Description
gen_ai.agent.name          str     Name of the AI agent (auto-extracted from LangGraph, configurable for LangChain)
gen_ai.agent.id            str     Unique identifier (format: trace_id:agent_name)
gen_ai.conversation.id     str     Session identifier set via set_conversation_id()
fiddler.span.type          str     Span classification: chain, tool, llm, or other
gen_ai.llm.input.system    str     System prompt content
gen_ai.llm.input.user      str     User input/prompt
gen_ai.llm.output          str     Model response text
gen_ai.llm.context         str     Custom context set via set_llm_context()
gen_ai.llm.model           str     Model identifier (e.g., "gpt-4o-mini")
gen_ai.llm.token_count     int     Token usage metrics
gen_ai.tool.name           str     Tool function name
gen_ai.tool.input          str     Tool input parameters (JSON)
gen_ai.tool.output         str     Tool execution results (JSON)
duration_ms                float   Span duration in milliseconds
fiddler.error.message      str     Error message (if span failed)
fiddler.error.type         str     Error type classification
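
For orientation, an LLM span's attribute payload might look like the following. This is an illustrative Python dict built from the attribute names above, not a verbatim export from Fiddler:

example_llm_span_attributes = {
    'gen_ai.agent.name': 'hotel_search_agent',
    'gen_ai.agent.id': '4bf92f3577b34da6a3ce929d0e0e4736:hotel_search_agent',
    'gen_ai.conversation.id': 'user-12345_support_2025-10-17_550e8400-e29b-41d4-a716-446655440000',
    'fiddler.span.type': 'llm',
    'gen_ai.llm.model': 'gpt-4o-mini',
    'gen_ai.llm.input.system': 'You are a hotel search assistant.',
    'gen_ai.llm.input.user': 'Find a hotel in Lisbon for this weekend.',
    'gen_ai.llm.output': 'Here are three options near the city center...',
    'gen_ai.llm.token_count': 482,
    'duration_ms': 1240.5,
}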

Querying and Filtering in Fiddler

Use these attributes in the Fiddler UI to:

  • Filter by agent: gen_ai.agent.name = "hotel_search_agent"

  • Find conversations: gen_ai.conversation.id = "user-123_support_2025-10-17..."

  • Analyze by model: gen_ai.llm.model = "gpt-4o"

  • Track errors: fiddler.error.type EXISTS

Who Should Use This

  • AI engineers building production LangGraph applications

  • DevOps teams monitoring agentic systems

  • Technical leaders evaluating observability strategies

Limitations and Considerations

Current Limitations

  • Framework Support: Only LangGraph is fully supported with automatic agent name extraction

    • LangChain applications require manual agent name configuration

    • Other frameworks must use the Client API directly

  • Protocol Support: Currently uses HTTP-based OTLP

    • gRPC support planned for future releases

  • Attribute Limits: Default OpenTelemetry limits apply

    • Configurable via span_limits parameter

    • Very large attribute values may be truncated

Performance Considerations

Overhead: Typical performance impact is < 5% with default settings

  • Use sampling to reduce overhead in high-volume scenarios (see the sampler sketch after this list)

  • Adjust batch processing delays based on latency requirements
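
If dropping individual spans would leave traces half-captured, a parent-based sampler keeps each trace all-or-nothing. A minimal sketch using standard OpenTelemetry classes, assuming FiddlerClient accepts any OpenTelemetry sampler via its sampler parameter, as in the configurations above:

from opentelemetry.sdk.trace import sampling

# Sample 10% of new (root) traces; child spans always follow the parent's
# decision, so sampled traces are captured completely.
overhead_sampler = sampling.ParentBased(root=sampling.TraceIdRatioBased(0.1))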

Memory: Span queue size affects the memory footprint

  • Default queue (100 spans) uses ~1-2MB (a rough sizing estimate follows this list)

  • Increase OTEL_BSP_MAX_QUEUE_SIZE for high throughput

  • Decrease for memory-constrained environments
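
A back-of-envelope estimate based on the ~1-2MB-per-100-spans figure above, which implies roughly 10-20KB per queued span (actual span size depends on your attribute limits):

import os

queue_size = int(os.environ.get('OTEL_BSP_MAX_QUEUE_SIZE', '100'))
approx_kb_per_span = 15  # midpoint of the ~10-20KB implied above; an assumption
print(f"Estimated peak queue memory: ~{queue_size * approx_kb_per_span / 1024:.1f} MB")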

Network: Compression significantly reduces bandwidth usage

  • Gzip compression: ~70-80% reduction

  • Use Compression.NoCompression only for debugging
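
For local debugging, you can disable compression so exported payloads are human-readable in a network capture. A sketch reusing the imports and parameters shown earlier:

# Debugging only: uncompressed export plus console spans
debug_client = FiddlerClient(
    api_key=os.getenv("FIDDLER_API_KEY"),
    application_id=os.getenv("FIDDLER_APPLICATION_ID"),
    url=os.getenv("FIDDLER_URL"),
    console_tracer=True,
    compression=Compression.NoCompression,
)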

Production Deployment Checklist

Before deploying to production, confirm that you have:

  • Disabled the console tracer (console_tracer=False)

  • Chosen a sampling rate that matches your traffic volume

  • Enabled gzip compression to reduce bandwidth

  • Tuned span limits and batch-processing environment variables for your workload

  • Loaded credentials (API key, application ID, URL) from environment variables

When to Tune Each Setting

Scenario                  Configuration
High-volume production    Increase queue size and batch size; lower the sampling ratio
Low-latency requirements  Decrease the schedule delay; use smaller export batches
Memory constraints        Decrease span limits, queue size, and batch size
Development/debugging     Disable sampling; enable the console tracer
Cost optimization         Lower the sampling ratio (keep fewer traces); enable compression

Next Steps

After completing the tutorial: