Guardrails Quick Start
Fiddler Guardrails provide real-time protection for your LLM applications by detecting and preventing harmful content, PII leaks, and hallucinations before they reach your users.
Time to complete: ~15 minutes
What You'll Learn
How to set up Fiddler Guardrails
How to use the three main guardrail types (Safety, PII, Faithfulness)
How to interpret risk scores
How to integrate guardrails into your LLM application
Prerequisites
Fiddler Guardrails Account: Sign up for Free Guardrails
API Key: Generated from your Fiddler Guardrails dashboard
Python 3.8+ (or any HTTP client)
Quick Start: Setting Up Guardrails
Step 1: Get Your API Key
Sign up at fiddler.ai/free-guardrails
Activate your account via email
Generate your API key from the dashboard
For detailed setup instructions, see the Guardrails Getting Started Guide.
Step 2: Install Required Libraries (Optional)
Step 3: Configure Your Connection
Guardrail Types and Usage
Each guardrail type has its own endpoint and request/response format. Choose the appropriate guardrail based on your protection needs.
🛡️ Safety Guardrails
Detect harmful, toxic, or jailbreaking content across 10 safety dimensions.
Endpoint: /v3/guardrails/ftl-safety
Use cases:
Content moderation
Jailbreak prevention
Toxic content detection
Example: Check for Harmful Content
Response Format:
Safety Dimensions:
fdl_harmful- General harmful contentfdl_violent- Violence and threatsfdl_unethical- Unethical behaviorfdl_illegal- Illegal activitiesfdl_sexual- Sexual contentfdl_racist- Racist contentfdl_jailbreaking- Prompt manipulation attemptsfdl_harassing- Harassmentfdl_hateful- Hateful contentfdl_sexist- Sexist content
Interpreting Safety Scores
Each dimension returns a score between 0 and 1:
0.0 - 0.3: Low risk (safe to proceed)
0.3 - 0.7: Medium risk (review recommended)
0.7 - 1.0: High risk (block or flag for review)
🔒 PII Detection
Detect personally identifiable information (PII), protected health information (PHI), and custom sensitive data.
Endpoint: /v3/guardrails/sensitive-information
Use cases:
Data privacy compliance
GDPR/CCPA protection
Sensitive data redaction
Example 1: Detect PII
Response Format:
Response Fields:
score- Confidence score (0.0 to 1.0)label- Entity type (e.g., "email", "social_security_number")text- The detected sensitive informationstart/end- Character positions in the input text
Example 2: Detect PHI (Healthcare Data)
Example 3: Custom Entity Detection
Supported Entity Categories:
PII: 35+ types including names, addresses, SSN, credit cards, emails, phone numbers
PHI: 7 healthcare-specific types (medication, medical conditions, health insurance numbers)
Custom Entities: Define your own sensitive data patterns
Processing PII Results
✅ Faithfulness Detection
Detect hallucinations and unsupported claims by comparing LLM outputs to source context (for RAG applications).
Endpoint: /v3/guardrails/ftl-response-faithfulness
Use cases:
RAG application accuracy
Fact-checking
Hallucination prevention
Example: Check Response Faithfulness
Response Format:
Score Interpretation:
0.0 - 0.3: Low faithfulness (likely hallucination)
0.3 - 0.7: Medium faithfulness (review recommended)
0.7 - 1.0: High faithfulness (response is well-supported by context)
Common Integration Patterns
Pattern 1: Pre-Processing (Input Guardrails)
Check user input before sending to your LLM:
Pattern 2: Post-Processing (Output Guardrails)
Check LLM output before returning to user:
Pattern 3: Complete LLM Pipeline with Multiple Guardrails
Best Practices
Layer Multiple Guardrails: Use safety + PII for inputs, faithfulness + PII for outputs
Set Appropriate Thresholds: Adjust risk score thresholds based on your use case sensitivity
Log All Checks: Track guardrail results for monitoring and continuous improvement
Handle Gracefully: Provide helpful user-facing messages when content is blocked
Monitor Performance: Track false positives/negatives and adjust thresholds accordingly
Consider Latency: Guardrail checks add ~100-300ms - use async calls when possible
Respect Rate Limits: Free tier has limits (2 req/s, 70 req/hr, 200 req/day)
Error Handling
Next Steps
API Reference: Complete Guardrails API Documentation
Setup Guide: Complete Guardrails Setup
Concepts: Guardrails Overview
Monitoring: Integrate Guardrails with LLM Monitoring
Summary
You've learned how to:
✅ Use Safety Guardrails to detect harmful content across 10 dimensions
✅ Detect and redact PII, PHI, and custom sensitive information
✅ Check response faithfulness to prevent hallucinations in RAG applications
✅ Integrate multiple guardrails into your LLM pipeline
✅ Handle errors and respect rate limits
Each guardrail type uses a different endpoint and response format optimized for its specific protection purpose. Combine multiple guardrails for comprehensive LLM application safety.