What You’ll Learn
This interactive notebook demonstrates advanced monitoring patterns for production LangGraph applications through a realistic travel planning system with multiple specialized agents.
Key Topics Covered:
- Multi-agent workflow monitoring and orchestration
- Custom instrumentation with decorators and span wrappers
- Combining auto-instrumentation with fine-grained manual spans
- Conversation tracking across complex interactions
- Production configuration for high-volume scenarios
- Advanced error handling and recovery patterns
- Business intelligence integration and analytics
Interactive Tutorial
The notebook walks through building a comprehensive travel planning application featuring hotel search, weather analysis, itinerary planning, and supervisor agents working together.
Open the Advanced Observability Notebook in Google Colab →
Or download the notebook directly from GitHub →
Custom Instrumentation Tutorial
For hands-on examples of decorator-based and manual instrumentation, including `@trace()`, span wrappers, and async support:
Open the Custom Instrumentation Notebook in Google Colab →
Or download the notebook directly from GitHub →
Custom Instrumentation Patterns
The SDK supports three instrumentation approaches. You can use them individually or combine them in the same application. For complete API reference, see the Instrumentation Methods section in the integration guide.
Combining Auto-Instrumentation with Decorators
Use `LangGraphInstrumentor` for automatic LangGraph/LangChain tracing, then add `@trace()` decorators to capture custom business logic that runs outside the framework:
Multi-Agent Decorator Patterns
When building multi-agent systems, `@trace()` decorators automatically establish parent-child span relationships through nested function calls:
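To make the parent-child mechanism concrete, here is a minimal standard-library sketch of how a decorator can link spans through nested calls. The `trace` function below is a hypothetical stand-in written for this illustration, not the SDK's implementation, and the agent function names are made up. It only assumes that the active span is tracked in a context variable, which is a common approach.

```python
import contextvars
import functools
import itertools

# Hypothetical stand-in for an SDK decorator; NOT the Fiddler implementation.
_current_span = contextvars.ContextVar("current_span", default=None)
_span_ids = itertools.count(1)
RECORDED = []  # finished spans, in completion order


def trace(name=None):
    """Record one span per call and link it to the span active in the caller."""
    def decorator(func):
        span_name = name or func.__name__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            parent = _current_span.get()
            span = {
                "id": next(_span_ids),
                "name": span_name,
                "parent_id": parent["id"] if parent else None,
            }
            token = _current_span.set(span)  # this span is now the active one
            try:
                return func(*args, **kwargs)
            finally:
                _current_span.reset(token)  # restore the caller's span
                RECORDED.append(span)
        return wrapper
    return decorator


@trace()
def search_hotels(city):
    return [f"{city} Grand Hotel"]


@trace()
def check_weather(city):
    return "sunny"


@trace(name="supervisor")
def plan_trip(city):
    # Both calls run while the supervisor span is active, so they become its children.
    return {"hotels": search_hotels(city), "weather": check_weather(city)}


plan = plan_trip("Lisbon")
```

After the call, `RECORDED` holds three spans: the two agent spans carry the supervisor's `id` as their `parent_id`, and the supervisor span has no parent. No explicit parent argument is passed anywhere; the nesting of function calls alone determines the hierarchy.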
Using Span Wrappers for Typed Attributes
Span wrapper classes provide typed helper methods for setting semantic attributes on LLM calls, tool invocations, and chain operations. Use them with `start_as_current_span()` for fine-grained control:
Production Configuration Best Practices
Before deploying LangGraph applications to production, configure the SDK for your specific workload characteristics.
High-Volume Applications
Optimize for applications processing thousands of traces per minute:
Low-Latency Requirements
Optimize for applications requiring sub-second trace export:
Memory-Constrained Environments
Configure conservative limits for edge deployments or containerized environments:
Development vs Production Configurations
Development Configuration:
Best Practices for Context and Conversation IDs
Structure your identifiers for maximum analytical value:
Prerequisites
- Fiddler account with API credentials
- OpenAI API key for example interactions
- Basic familiarity with LangGraph concepts
Time Required
- Complete tutorial: 45-60 minutes
- Quick overview: 15-20 minutes
Telemetry Data Reference
Understanding the data captured by the Fiddler LangGraph SDK.
Span Attributes
The SDK automatically captures these OpenTelemetry attributes:
| Attribute | Type | Description |
|---|---|---|
| `gen_ai.agent.name` | str | Name of the AI agent (auto-extracted from LangGraph, configurable for LangChain) |
| `gen_ai.agent.id` | str | Unique identifier (format: `trace_id:agent_name`) |
| `gen_ai.conversation.id` | str | Session identifier set via `set_conversation_id()` |
| `fiddler.span.type` | str | Span classification: `chain`, `tool`, `llm`, or `agent` |
| `gen_ai.llm.input.system` | str | System prompt content |
| `gen_ai.llm.input.user` | str | User input/prompt |
| `gen_ai.llm.output` | str | Model response text |
| `gen_ai.llm.context` | str | Custom context set via `set_llm_context()` |
| `gen_ai.request.model` | str | Model identifier (e.g., "gpt-4o-mini") |
| `gen_ai.llm.token_count` | int | Token usage metrics |
| `gen_ai.tool.name` | str | Tool function name |
| `gen_ai.tool.input` | str | Tool input parameters (JSON) |
| `gen_ai.tool.output` | str | Tool execution results (JSON) |
| `gen_ai.tool.definitions` | str | Tool definitions available to the LLM (JSON array of OpenAI-format tool schemas) |
| `gen_ai.input.messages` | str | Complete message history provided as input to the LLM (JSON array) |
| `gen_ai.output.messages` | str | Output messages generated by the LLM, including tool calls (JSON array) |
| `duration_ms` | float | Span duration in milliseconds |
| `fiddler.error.message` | str | Error message (if span failed) |
| `fiddler.error.type` | str | Error type classification |
Setting Attributes with Span Wrappers
When using manual instrumentation, span wrapper classes provide typed helper methods that set these attributes automatically. For example, `FiddlerGeneration.set_model("gpt-4o")` sets `gen_ai.request.model`, and `FiddlerTool.set_tool_name("search")` sets `gen_ai.tool.name`.
| Span Wrapper | Key Methods | Attributes Set |
|---|---|---|
| `FiddlerGeneration` | `set_model()`, `set_system_prompt()`, `set_user_prompt()`, `set_completion()`, `set_usage()`, `set_messages()`, `set_output_messages()`, `set_tool_definitions()` | `gen_ai.request.model`, `gen_ai.llm.input.*`, `gen_ai.llm.output`, `gen_ai.usage.*`, `gen_ai.input.messages`, `gen_ai.output.messages`, `gen_ai.tool.definitions` |
| `FiddlerTool` | `set_tool_name()`, `set_tool_input()`, `set_tool_output()`, `set_tool_definitions()` | `gen_ai.tool.name`, `gen_ai.tool.input`, `gen_ai.tool.output`, `gen_ai.tool.definitions` |
| `FiddlerChain` | `set_input()`, `set_output()` | Input/output data attributes |
| `FiddlerSpan` | `set_attribute()`, `set_agent_name()`, `set_conversation_id()` | Any custom or standard attribute |
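The idea behind typed wrappers can be illustrated with a small standard-library sketch. The `SketchSpan`, `SketchGeneration`, and `SketchTool` classes below are hypothetical and exist only for this example; they are not the SDK classes. The `set_model` and `set_tool_name` mappings follow the text above, while the other mappings (`gen_ai.llm.input.user`, `gen_ai.llm.output`, and JSON encoding of tool input) are assumptions inferred from the attribute table.

```python
import json


class SketchSpan:
    """Hypothetical base wrapper: holds a flat attribute dict like a span would."""

    def __init__(self, name: str):
        self.name = name
        self.attributes = {}

    def set_attribute(self, key: str, value):
        self.attributes[key] = value
        return self  # returning self allows method chaining


class SketchGeneration(SketchSpan):
    """Typed helpers for LLM calls; each one writes a single well-known key."""

    def set_model(self, model: str):
        return self.set_attribute("gen_ai.request.model", model)

    def set_user_prompt(self, prompt: str):
        # Assumed mapping, inferred from the attribute table.
        return self.set_attribute("gen_ai.llm.input.user", prompt)

    def set_completion(self, text: str):
        # Assumed mapping, inferred from the attribute table.
        return self.set_attribute("gen_ai.llm.output", text)


class SketchTool(SketchSpan):
    """Typed helpers for tool invocations."""

    def set_tool_name(self, name: str):
        return self.set_attribute("gen_ai.tool.name", name)

    def set_tool_input(self, params: dict):
        # The table describes tool input as JSON, so the dict is serialized (assumption).
        return self.set_attribute("gen_ai.tool.input", json.dumps(params, sort_keys=True))


generation = SketchGeneration("llm-call").set_model("gpt-4o").set_user_prompt("Find a hotel")
tool = SketchTool("tool-call").set_tool_name("search").set_tool_input({"city": "Lisbon"})
```

The benefit of this design is that callers never type attribute keys by hand, so a misspelled key such as `gen_ai.request.modle` cannot slip into the telemetry.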
Querying and Filtering in Fiddler
Use these attributes in the Fiddler UI to:
- Filter by agent: `gen_ai.agent.name = "hotel_search_agent"`
- Find conversations: `gen_ai.conversation.id = "user-123_support_2026-06-15..."`
- Analyze by model: `gen_ai.request.model = "gpt-4o"`
- Track errors: `fiddler.error.type EXISTS`
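Filtering by conversation works best when IDs follow a predictable structure. The sketch below shows one possible way to build and parse such identifiers. The `user_workflow_date` layout and both helper names are assumptions made for this example; the example ID shown above is truncated in the source, so this sketch does not attempt to reproduce whatever follows the date.

```python
from datetime import date


def build_conversation_id(user_id: str, workflow: str, day: date) -> str:
    """Join the parts with underscores: <user>_<workflow>_<YYYY-MM-DD>."""
    parts = [user_id, workflow, day.isoformat()]
    for part in parts:
        if not part or "_" in part:
            # Underscore is the separator, so it cannot appear inside a part.
            raise ValueError(f"invalid identifier part: {part!r}")
    return "_".join(parts)


def parse_conversation_id(conversation_id: str) -> dict:
    """Split an ID produced by build_conversation_id back into its fields."""
    user_id, workflow, day = conversation_id.split("_")
    return {"user_id": user_id, "workflow": workflow, "day": date.fromisoformat(day)}


conv_id = build_conversation_id("user-123", "support", date(2026, 6, 15))
fields = parse_conversation_id(conv_id)
```

A string built this way would then be passed to `set_conversation_id()`. Keeping the separator out of the individual parts means the ID can always be split back into user, workflow, and date for analysis.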
Who Should Use This
- AI engineers building production LangGraph applications
- DevOps teams monitoring agentic systems
- Technical leaders evaluating observability strategies
Limitations and Considerations
Current Limitations
- Framework Support: LangGraph is fully supported with automatic agent name extraction
  - LangChain applications require manual agent name configuration
  - Non-LangGraph Python code can use `@trace()` decorators or manual context managers for custom instrumentation (see Instrumentation Methods)
- Protocol Support: Currently uses HTTP-based OTLP
  - gRPC support planned for future releases
- Attribute Limits: Default OpenTelemetry limits apply
  - Configurable via the `span_limits` parameter
  - Very large attribute values may be truncated
Performance Considerations
Overhead: Typical performance impact is < 5% with default settings
- Use sampling to reduce overhead in high-volume scenarios
- Adjust batch processing delays based on latency requirements
- Default queue (100 spans) uses ~1-2MB
- Increase `OTEL_BSP_MAX_QUEUE_SIZE` for high throughput
- Decrease it for memory-constrained environments
- Gzip compression: ~70-80% size reduction
- Use `Compression.NoCompression` only for debugging
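To show why sampling reduces overhead predictably, here is a minimal standard-library sketch of a ratio-based sampling decision keyed on the trace ID. It is similar in spirit to OpenTelemetry's trace-ID ratio sampler but is not the SDK's code; the `should_sample` name and the 10% ratio are chosen for illustration only.

```python
import random

LOW_64_BITS = (1 << 64) - 1  # only the low 64 bits of the trace ID are compared


def should_sample(trace_id: int, ratio: float) -> bool:
    """Keep a trace when its ID falls below ratio * 2**64.

    The decision depends only on the trace ID, so every span of a trace
    gets the same answer and traces are never partially exported.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = round(ratio * (1 << 64))
    return (trace_id & LOW_64_BITS) < bound


rng = random.Random(42)  # fixed seed so the demo is repeatable
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
kept = sum(should_sample(t, 0.10) for t in trace_ids)
```

With a ratio of 0.10, roughly one trace in ten is kept, and the amount of exported data drops by about the same factor. Because the decision is deterministic per trace ID, rerunning it for any span of the same trace gives the same result.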
Production Deployment Checklist
Before deploying to production:
- Set an appropriate sampling rate (typically 5-10% for high-volume apps)
- Configure span limits based on your data characteristics
- Tune batch processing parameters for your traffic patterns
- Enable Gzip compression (default, recommended)
- Use environment variables for credentials (not hardcoded)
- Test instrumentation in staging environment first
- Monitor SDK performance impact
- Set up alerts for instrumentation failures
- Document your configuration for team knowledge sharing
When to Tune Each Setting
| Scenario | Configuration |
|---|---|
| High-volume production | Increase queue size and batch size; lower the sampling percentage |
| Low-latency requirements | Decrease schedule delay, smaller batches |
| Memory constraints | Decrease span limits, queue size, batch size |
| Development/debugging | Disable sampling, enable console tracer |
| Cost optimization | Sample more aggressively (lower %), enable compression |
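The scenarios in the table can be sketched as named profiles of batch span processor settings. `OTEL_BSP_MAX_QUEUE_SIZE` appears earlier in this document; `OTEL_BSP_MAX_EXPORT_BATCH_SIZE` and `OTEL_BSP_SCHEDULE_DELAY` (in milliseconds) are standard OpenTelemetry SDK variables, and it is an assumption that the Fiddler SDK honors them. The profile names, the `apply_profile` helper, and every numeric value are illustrative placeholders, not recommendations.

```python
import os

# Illustrative placeholder values only; tune them against your own traffic.
PROFILES = {
    "high_volume": {
        "OTEL_BSP_MAX_QUEUE_SIZE": "1000",
        "OTEL_BSP_MAX_EXPORT_BATCH_SIZE": "100",
        "OTEL_BSP_SCHEDULE_DELAY": "5000",
    },
    "low_latency": {
        "OTEL_BSP_MAX_QUEUE_SIZE": "100",
        "OTEL_BSP_MAX_EXPORT_BATCH_SIZE": "10",
        "OTEL_BSP_SCHEDULE_DELAY": "500",
    },
    "memory_constrained": {
        "OTEL_BSP_MAX_QUEUE_SIZE": "50",
        "OTEL_BSP_MAX_EXPORT_BATCH_SIZE": "10",
        "OTEL_BSP_SCHEDULE_DELAY": "5000",
    },
}


def apply_profile(name: str, env=None) -> dict:
    """Copy one profile into env (os.environ by default) and return it."""
    settings = PROFILES[name]
    batch = int(settings["OTEL_BSP_MAX_EXPORT_BATCH_SIZE"])
    queue = int(settings["OTEL_BSP_MAX_QUEUE_SIZE"])
    if batch > queue:
        # A batch larger than the queue could never be filled.
        raise ValueError("batch size must not exceed queue size")
    target = os.environ if env is None else env
    target.update(settings)
    return settings


# Demo against a plain dict so the real process environment is left untouched.
demo_env = {}
apply_profile("low_latency", demo_env)
```

In a real application the profile would be applied before the tracer is initialized, since batch processor settings are typically read once at startup.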
Next Steps
After completing the tutorial:
- Custom Instrumentation Notebook: Hands-on decorator and span wrapper examples
- Integration Guide: Instrumentation Methods reference for `@trace()`, manual instrumentation, and span wrapper APIs
- Technical Reference: Fiddler LangGraph SDK Documentation
- Production Deployment: Adapt the demonstrated patterns for your specific use case