What You’ll Learn
- How to set up Fiddler Guardrails
- How to use the three main guardrail types (Safety, PII, Faithfulness)
- How to interpret risk scores
- How to integrate guardrails into your LLM application
Prerequisites
- Fiddler Guardrails Account: Sign up for Free Guardrails
- API Key: Generated from your Fiddler Guardrails dashboard
- Python 3.8+ (or any HTTP client)
Quick Start: Setting Up Guardrails
Step 1: Get Your API Key
- Sign up at fiddler.ai/free-guardrails
- Activate your account via email
- Generate your API key from the dashboard
Step 2: Install Required Libraries (Optional)
Step 3: Configure Your Connection
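A minimal sketch of a connection setup that uses only the standard library. The environment variable names, the placeholder base URL, and the Bearer-token header are assumptions for illustration, not confirmed details of the Fiddler API; take the real values from your dashboard and the API reference.

```python
import os

# Assumption: placeholder host. Replace with the base URL shown in your dashboard.
DEFAULT_BASE_URL = "https://guardrails.example.invalid"


def load_config(env=os.environ):
    """Read connection settings from environment variables (names are hypothetical)."""
    api_key = env.get("FIDDLER_API_KEY", "")
    if not api_key:
        raise RuntimeError("Set FIDDLER_API_KEY to the key generated in your dashboard")
    base_url = env.get("FIDDLER_GUARDRAILS_URL", DEFAULT_BASE_URL).rstrip("/")
    return {"api_key": api_key, "base_url": base_url}


def build_headers(api_key):
    """Headers for a JSON request. Bearer auth is an assumption; check the API reference."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in the environment rather than in source code means it does not end up in version control.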
Guardrail Types and Usage
Each guardrail type has its own endpoint and request/response format. Choose the appropriate guardrail based on your protection needs.
🛡️ Safety Guardrails
Detect harmful, toxic, or jailbreaking content across 10 safety dimensions.
Endpoint: /v3/guardrails/ftl-safety
Use cases:
- Content moderation
- Jailbreak prevention
- Toxic content detection
Example: Check for Harmful Content
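As a hedged sketch, a safety check could be issued with the standard library as shown here. The endpoint path comes from this guide; the Bearer header and the `{"data": {"input": ...}}` body shape are assumptions, and `build_safety_request` and `check_safety` are hypothetical helper names.

```python
import json
import urllib.request

SAFETY_PATH = "/v3/guardrails/ftl-safety"  # endpoint path from this guide


def build_safety_request(text, base_url, api_key):
    """Build (but do not send) a POST request for the safety guardrail.

    Assumptions: Bearer auth and a {"data": {"input": ...}} request body.
    """
    body = json.dumps({"data": {"input": text}}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + SAFETY_PATH,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def check_safety(text, base_url, api_key, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_safety_request(text, base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes the payload easy to inspect or unit-test without network access.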
Safety dimensions:
- fdl_harmful - General harmful content
- fdl_violent - Violence and threats
- fdl_unethical - Unethical behavior
- fdl_illegal - Illegal activities
- fdl_sexual - Sexual content
- fdl_racist - Racist content
- fdl_jailbreaking - Prompt manipulation attempts
- fdl_harassing - Harassment
- fdl_hateful - Hateful content
- fdl_sexist - Sexist content
- fdl_roleplaying - Prompting persona change
Interpreting Safety Scores
Each dimension returns a score between 0 and 1:
- 0.0 - 0.3: Low risk (safe to proceed)
- 0.3 - 0.7: Medium risk (review recommended)
- 0.7 - 1.0: High risk (block or flag for review)
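The bands above can be turned into a small helper. This is a sketch: the listed ranges share their endpoints, so treating exactly 0.3 as medium and exactly 0.7 as high is an assumption, and the function names are hypothetical.

```python
def risk_band(score):
    """Map a 0-1 safety score to the low/medium/high bands listed above.

    Assumption: a score of exactly 0.3 counts as medium and 0.7 as high.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < 0.3:
        return "low"
    if score < 0.7:
        return "medium"
    return "high"


def summarize(scores):
    """Return (dimension, score, band) for the highest-scoring dimension."""
    dimension = max(scores, key=scores.get)
    return dimension, scores[dimension], risk_band(scores[dimension])
```

Called with made-up scores such as `{"fdl_harmful": 0.12, "fdl_jailbreaking": 0.85}`, `summarize` reports `fdl_jailbreaking` in the high band, which would be a candidate for blocking.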
🔒 PII Detection
Detect personally identifiable information (PII), protected health information (PHI), and custom sensitive data.
Endpoint: /v3/guardrails/sensitive-information
Use cases:
- Data privacy compliance
- GDPR/CCPA protection
- Sensitive data redaction
Example 1: Detect PII
- score - Confidence score (0.0 to 1.0)
- label - Entity type (e.g., “email”, “social security number”)
- text - The detected sensitive information
- start/end - Character positions in the input text
Example 2: Detect PHI (Healthcare Data)
Example 3: Custom Entity Detection
- PII: comprehensive coverage including names, addresses, SSN, credit cards, emails, and phone numbers
- PHI: healthcare-specific types including medication, medical conditions, and health insurance numbers
- Custom Entities: Define your own sensitive data patterns
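One way to select among the three categories above is a small payload builder. This is a sketch only: the field names `entity_categories` and `custom_entities` and the exact category strings are assumptions made for illustration and should be confirmed against the API reference; the endpoint path is the one given in this guide.

```python
SENSITIVE_INFO_PATH = "/v3/guardrails/sensitive-information"  # from this guide

# Assumption: category strings mirror the names used in the list above.
CATEGORIES = ("PII", "PHI", "Custom Entities")


def build_sensitive_info_payload(text, category="PII", custom_entities=None):
    """Build a request body for one entity category.

    Assumption: "entity_categories" and "custom_entities" are illustrative
    field names, not confirmed parts of the API.
    """
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    data = {"input": text, "entity_categories": category}
    if category == "Custom Entities":
        if not custom_entities:
            raise ValueError("custom_entities is required for 'Custom Entities'")
        data["custom_entities"] = list(custom_entities)
    return {"data": data}
```

For example, `build_sensitive_info_payload("Badge EMP-1234", "Custom Entities", ["employee id"])` produces a body that names the custom entity to look for, while the default call targets PII.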
Processing PII Results
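A minimal redaction sketch that consumes detections carrying the `score`, `label`, `start`, and `end` fields described earlier. It assumes `end` is exclusive (as in a Python slice) and that detections do not overlap; `redact` and the `min_score` cutoff are hypothetical choices, not part of the API.

```python
def redact(text, entities, min_score=0.5):
    """Replace each detected span with a bracketed label such as [EMAIL].

    Assumptions: `end` is exclusive and spans do not overlap.
    """
    kept = [e for e in entities if e["score"] >= min_score]
    # Work right to left so earlier offsets stay valid after each replacement.
    for entity in sorted(kept, key=lambda e: e["start"], reverse=True):
        tag = "[" + entity["label"].upper().replace(" ", "_") + "]"
        text = text[: entity["start"]] + tag + text[entity["end"] :]
    return text
```

Processing spans from the end of the string backwards avoids recomputing offsets, since replacing a span only shifts the characters after it.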
✅ Centor Faithfulness Detection
Detect hallucinations and unsupported claims by comparing LLM outputs to source context (for RAG applications) using Fiddler Centor Models.
Endpoint: /v3/guardrails/ftl-response-faithfulness
This guardrail uses the Centor Faithfulness model for real-time content blocking. For RAG pipeline diagnostics using the LLM-as-a-Judge approach, see RAG Health Metrics.
Use cases:
- RAG application accuracy
- Fact-checking
- Hallucination prevention
Example: Check Response Faithfulness
- 0.0 - 0.3: Low faithfulness (likely hallucination)
- 0.3 - 0.7: Medium faithfulness (review recommended)
- 0.7 - 1.0: High faithfulness (response is well-supported by context)
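Unlike safety scores, a higher faithfulness score is better, so the blocking logic is inverted. The sketch below assumes a `{"data": {"response": ..., "context": ...}}` request body (not confirmed by this guide) and treats the shared band endpoints 0.3 and 0.7 as belonging to the higher band; all function names are hypothetical.

```python
FAITHFULNESS_PATH = "/v3/guardrails/ftl-response-faithfulness"  # from this guide


def build_faithfulness_payload(response, context):
    """Request body pairing an LLM response with its source context.

    Assumption: the {"data": {"response", "context"}} shape is illustrative.
    """
    return {"data": {"response": response, "context": context}}


def faithfulness_band(score):
    """Map a 0-1 faithfulness score to the bands listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < 0.3:
        return "low"
    if score < 0.7:
        return "medium"
    return "high"


def should_block(score, threshold=0.3):
    """Block responses whose faithfulness falls below the threshold."""
    return score < threshold
```

A stricter application might raise `threshold` to 0.7 so that only well-supported responses pass.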
Common Integration Patterns
Pattern 1: Pre-Processing (Input Guardrails)
Check user input before sending it to your LLM.
Pattern 2: Post-Processing (Output Guardrails)
Check LLM output before returning it to the user.
Pattern 3: Complete LLM Pipeline with Multiple Guardrails
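The sketch below combines input and output checks in one function. The guardrail calls and the LLM call are injected as plain callables, so every name here (`guarded_completion`, `PipelineResult`, the callable parameters, and the default thresholds) is a hypothetical choice rather than part of any Fiddler SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PipelineResult:
    answer: Optional[str]
    blocked_by: Optional[str] = None


def guarded_completion(
    user_input: str,
    context: str,
    *,
    check_safety: Callable[[str], Dict[str, float]],
    check_faithfulness: Callable[[str, str], float],
    call_llm: Callable[[str, str], str],
    redact_pii: Callable[[str], str] = lambda text: text,
    safety_threshold: float = 0.7,
    faithfulness_threshold: float = 0.3,
) -> PipelineResult:
    # Input guardrails: block risky prompts, then strip PII before the LLM sees it.
    scores = check_safety(user_input)
    if max(scores.values(), default=0.0) >= safety_threshold:
        return PipelineResult(None, blocked_by="safety")
    answer = call_llm(redact_pii(user_input), context)
    # Output guardrails: block unsupported answers, then strip PII from the reply.
    if check_faithfulness(answer, context) < faithfulness_threshold:
        return PipelineResult(None, blocked_by="faithfulness")
    return PipelineResult(redact_pii(answer))
```

Injecting the checks keeps the pipeline testable with stubs and lets the same function serve pre-processing only (pass a faithfulness stub that returns 1.0) or post-processing only (pass a safety stub that returns an empty dict).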
Best Practices
- Layer Multiple Guardrails: Use safety + PII for inputs, faithfulness + PII for outputs
- Set Appropriate Thresholds: Adjust risk score thresholds based on your use case sensitivity
- Log All Checks: Track guardrail results for monitoring and continuous improvement
- Handle Gracefully: Provide helpful user-facing messages when content is blocked
- Monitor Performance: Track false positives/negatives and adjust thresholds accordingly
- Consider Latency: Guardrail checks add roughly 100-300 ms; use async calls when possible
- Respect Rate Limits: The free tier is limited to 2 req/s, 70 req/hr, and 200 req/day
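To stay under the 2 req/s free-tier limit, a client-side throttle can space calls out. This is a single-threaded sketch with a hypothetical `Throttle` class; it does not track the hourly or daily limits, and the clock and sleep functions are injectable only to make it testable.

```python
import time


class Throttle:
    """Space calls at least 1 / max_per_second seconds apart (not thread-safe)."""

    def __init__(self, max_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Sleep if needed before the next request; return the seconds waited."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now += delay
        self._next_allowed = now + self._min_interval
        return max(delay, 0.0)
```

Call `throttle.wait()` immediately before each guardrail request so that consecutive requests are at least half a second apart.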
Error Handling
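A hedged sketch of retrying transient failures with exponential backoff, written against the standard library's `urllib.error` exceptions. Which status codes the service actually returns is not stated in this guide, so the retryable set (including 429 for rate limiting) is an assumption, and `call_with_retry` is a hypothetical helper.

```python
import time
import urllib.error

# Assumption: these status codes are treated as transient and worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry transient HTTP or network errors with backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Non-retryable status (e.g. 401) or out of attempts: surface it.
            if err.code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
        except urllib.error.URLError:
            # Network-level failure such as DNS or connection errors.
            if attempt == max_attempts:
                raise
        # Back off 0.5 s, 1 s, 2 s, ... before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
```

Here `send` is any zero-argument callable that performs one request, for example a `lambda` wrapping a `urllib.request.urlopen` call. Errors such as an invalid API key are re-raised immediately because retrying them cannot succeed.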
Next Steps
- API Reference: Complete Guardrails API Documentation
- Setup Guide: Complete Guardrails Setup
- Concepts: Guardrails Overview
- FAQ: Guardrails Frequently Asked Questions
- Monitoring: Integrate Guardrails with LLM Monitoring
Summary
You’ve learned how to:
- ✅ Use Safety Guardrails to detect harmful content across 10 dimensions
- ✅ Detect and redact PII, PHI, and custom sensitive information
- ✅ Check response faithfulness to prevent hallucinations in RAG applications
- ✅ Integrate multiple guardrails into your LLM pipeline
- ✅ Handle errors and respect rate limits