Guardrails API Reference
Free Guardrails Rate Limits
The Fiddler Free Guardrails experience is subject to the following rate limits. To increase these limits, please contact sales (https://www.fiddler.ai/contact-sales).
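Safety: 2 requests per second, 70 requests per minute, 200 requests per day
Faithfulness: 2 requests per second, 70 requests per minute, 200 requests per day
Fast PII: 2 requests per second, 70 requests per minute, 200 requests per day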
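To stay under these limits on the client side, you can throttle calls before they are sent. The following is a minimal sketch, not part of any Fiddler SDK: the class name `SlidingWindowLimiter`, the constant `FREE_TIER_LIMITS`, and the injectable `clock` parameter are hypothetical names chosen for illustration. It keeps timestamps of accepted calls and refuses a new call if any window is already full.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle for several (max_calls, window_seconds) limits."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: iterable of (max_calls, window_seconds) pairs.
        self._limits = list(limits)
        self._clock = clock
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def try_acquire(self):
        """Record a call and return True if every limit allows it, else False."""
        now = self._clock()
        longest = max(window for _, window in self._limits)
        # Forget calls that fall outside the longest window.
        while self._calls and now - self._calls[0] >= longest:
            self._calls.popleft()
        for max_calls, window in self._limits:
            recent = sum(1 for t in self._calls if now - t < window)
            if recent >= max_calls:
                return False
        self._calls.append(now)
        return True


# Free-tier limits listed above: 2 per second, 70 per minute, 200 per day.
FREE_TIER_LIMITS = [(2, 1), (70, 60), (200, 86_400)]
```

A caller would check `limiter.try_acquire()` before each request and wait briefly when it returns False. The server remains the source of truth, so a 429 response can still occur if several clients share one token.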
Understanding Trust Model Scores
Fiddler Trust Models return scores in the range of 0 to 1. These scores represent the model's confidence that the input belongs to the target class (e.g., toxicity, hallucination).
Higher scores (closer to 1): Higher confidence that the input belongs to the target class
Lower scores (closer to 0): Lower confidence that the input belongs to the target class
Threshold Selection
You must select a threshold value between 0 and 1 to convert these scores into binary decisions. This creates a tradeoff:
Lower thresholds: Catch more true positives but include more false positives
Higher thresholds: Reduce false positives, but might miss some true positives
Our quickstart examples include default thresholds that work well for many applications, but you should adjust these based on your specific requirements and risk tolerance.
Adjusting Your Thresholds
To find the optimal threshold for your use case:
Start with the default threshold
Monitor both missed detections and false alarms
Adjust gradually based on which type of error is more problematic for your application
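The tuning loop above can be made concrete by counting both error types at several candidate thresholds. This is an illustrative sketch: the scores and labels are made-up sample data, the function name `confusion_counts` is hypothetical, and treating `score >= threshold` as a positive is an assumption (your application may prefer a strict comparison).

```python
def confusion_counts(scores, labels, threshold):
    """Count outcomes when score >= threshold is treated as a positive."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for score, is_positive in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_positive:
            counts["tp"] += 1  # correctly flagged
        elif flagged:
            counts["fp"] += 1  # false alarm
        elif is_positive:
            counts["fn"] += 1  # missed detection
        else:
            counts["tn"] += 1  # correctly passed
    return counts


# Made-up model scores and human labels (True = belongs to the target class).
scores = [0.05, 0.20, 0.45, 0.60, 0.85, 0.95]
labels = [False, False, True, False, True, True]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, confusion_counts(scores, labels, threshold))
```

On this sample, raising the threshold from 0.3 to 0.7 removes the one false alarm but introduces one missed detection, which is the tradeoff described above.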
Safety Model
This Fiddler Trust Model evaluates prompt and response safety across ten dimensions:
Jailbreaking
Illegal content
Hateful content
Harassment
Racism
Sexism
Violence
Sexual content
Harmful content
Unethical content
The model requires a single string input and outputs ten distinct scores (0-1 range). For detailed information, see our official documentation.
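As a sketch of what a call might look like, the snippet below builds (but does not send) a request using only the Python standard library. The path, the `data.input` body shape, and Bearer authentication come from the OpenAPI spec in this section; the function name `build_safety_request` and its parameters are hypothetical, and you supply your own endpoint host and token.

```python
import json
import urllib.request


def build_safety_request(fiddler_endpoint, token, text):
    """Build (but do not send) a POST request for the ftl-safety guardrail."""
    url = f"https://{fiddler_endpoint}/v3/guardrails/ftl-safety"
    # The spec wraps the single string input in a "data" object.
    body = json.dumps({"data": {"input": text}}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it; on success the JSON body should contain the ten `fdl_*` scores.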
How to Use Thresholding
Users can apply thresholds on individual safety dimensions (e.g., harmful, violent, racist) or evaluate all of them collectively. This flexibility allows you to tailor how strictly you filter content based on your unique requirements.
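Both approaches can be sketched in a few lines. The response keys below are the ten `fdl_*` fields from the Safety Model OpenAPI spec; the function names and the 0.5 default are hypothetical placeholders, not Fiddler's recommended values.

```python
SAFETY_KEYS = (
    "fdl_harmful", "fdl_violent", "fdl_unethical", "fdl_illegal", "fdl_sexual",
    "fdl_racist", "fdl_jailbreaking", "fdl_harassing", "fdl_hateful", "fdl_sexist",
)


def flagged_dimensions(scores, thresholds, default_threshold=0.5):
    """Per-dimension check: return the keys whose score meets its own threshold."""
    return sorted(
        key for key in SAFETY_KEYS
        if scores.get(key, 0.0) >= thresholds.get(key, default_threshold)
    )


def is_unsafe_overall(scores, threshold=0.5):
    """Collective check: flag the text if any dimension meets one shared threshold."""
    return max(scores.get(key, 0.0) for key in SAFETY_KEYS) >= threshold
```

For example, passing `{"fdl_jailbreaking": 0.2}` as `thresholds` makes the jailbreaking check stricter while every other dimension keeps the default.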
Safety Model OpenAPI Spec
openapi: 3.0.3
info:
  title: Fiddler FTL Safety
  version: 1.0.0
servers:
  - url: "https://{fiddler_endpoint}"
paths:
  /v3/guardrails/ftl-safety:
    post:
      summary: Assess the safety or harmfulness of the provided input text.
      operationId: evaluateSafety
      security:
        - bearerAuth: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                data:
                  type: object
                  properties:
                    input:
                      type: string
                      description: The text to be evaluated for various safety indicators.
                      example: "I am a dangerous person who will be wreaking havoc upon the world!!!"
      responses:
        '200':
          description: Successful safety assessment
          content:
            application/json:
              schema:
                type: object
                properties:
                  fdl_harmful:
                    type: number
                    format: float
                    description: Likelihood score for harmful content.
                  fdl_violent:
                    type: number
                    format: float
                    description: Likelihood score for violent content.
                  fdl_unethical:
                    type: number
                    format: float
                    description: Likelihood score for unethical content.
                  fdl_illegal:
                    type: number
                    format: float
                    description: Likelihood score for illegal content.
                  fdl_sexual:
                    type: number
                    format: float
                    description: Likelihood score for sexual content.
                  fdl_racist:
                    type: number
                    format: float
                    description: Likelihood score for racist content.
                  fdl_jailbreaking:
                    type: number
                    format: float
                    description: Likelihood score for jailbreaking attempts (prompt manipulation).
                  fdl_harassing:
                    type: number
                    format: float
                    description: Likelihood score for harassing content.
                  fdl_hateful:
                    type: number
                    format: float
                    description: Likelihood score for hateful content.
                  fdl_sexist:
                    type: number
                    format: float
                    description: Likelihood score for sexist content.
        '400':
          description: Bad request (invalid input data)
        '401':
          description: Unauthorized (missing or invalid Bearer token)
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
Faithfulness Model
This Fiddler Trust Model detects hallucinations by evaluating the accuracy and reliability of facts presented in AI-generated text responses in retrieval-augmented generation (RAG) contexts.
The model requires two inputs:
Response: the text generated by your generative application
Context Documents: the reference text that the application response must remain faithful to
The output is a single score (float) representing the factual consistency between the response and the provided context.
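The OpenAPI spec in this section declares `context` as a single string, so an application that retrieves several documents needs to combine them before calling the endpoint. The helper below is one possible way to do that. Its name, the blank-line separator, and the empty-input check are assumptions for illustration; the documentation does not prescribe how documents should be joined.

```python
def build_faithfulness_payload(response, context_documents, separator="\n\n"):
    """Return the JSON body for the ftl-response-faithfulness guardrail.

    context_documents may be one string or an iterable of strings; multiple
    documents are joined with `separator` (an assumed convention).
    """
    if isinstance(context_documents, str):
        context = context_documents
    else:
        context = separator.join(context_documents)
    if not response or not context:
        raise ValueError("both response and context must be non-empty")
    return {"data": {"response": response, "context": context}}
```

Keep the joined context within the token limits listed under the error codes for this endpoint, for example by passing only the top-ranked retrieved documents.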
How to Use Thresholding
Our quickstart examples use a default threshold that works well for many applications. Raise or lower it to control how strictly responses are filtered for your requirements.
Faithfulness Model OpenAPI Spec
openapi: 3.0.3
info:
  title: Fiddler FTL Response Faithfulness
  version: 1.0.0
servers:
  - url: "https://{fiddler_endpoint}"
paths:
  /v3/guardrails/ftl-response-faithfulness:
    post:
      summary: Evaluate the faithfulness of a provided response against given context.
      operationId: evaluateFaithfulness
      security:
        - bearerAuth: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                data:
                  type: object
                  properties:
                    response:
                      type: string
                      description: The response text to be evaluated.
                      example: "The Yorkshire Terrier and the Cavalier King Charles Spaniel are both small breeds of companion dogs."
                    context:
                      type: string
                      description: The contextual text to compare against.
                      example: "The Yorkshire Terrier is a small dog breed... The Cavalier King Charles Spaniel is a small spaniel..."
      responses:
        '200':
          description: Successfully computed faithfulness score
          content:
            application/json:
              schema:
                type: object
                properties:
                  fdl_faithful_score:
                    type: number
                    format: float
                    description: A numerical measure indicating how faithful the response is to the given context.
        '400':
          description: Bad request (invalid payload or missing parameters)
        '401':
          description: Unauthorized (missing or invalid Bearer token)
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
Fast PII Model (Sensitive Information Detection)
This Fiddler Trust Model detects and identifies sensitive information including personally identifiable information (PII), protected health information (PHI), and custom-defined entities in text data.
The model supports three detection modes:
PII Detection: 35+ entity types including personal, financial, and government identifiers
PHI Detection: 7 healthcare-specific entity types for HIPAA compliance
Custom Entity Detection: Organization-specific sensitive data patterns
The model requires a single string input and outputs an array of detected entities with confidence scores, labels, and text positions.
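The request body for the three detection modes can be assembled as in the sketch below. The field names (`input`, `entity_categories`, `custom_entities`), the allowed category values, and the rule that `custom_entities` is required in "Custom Entities" mode come from the Fast PII OpenAPI spec in this section; the function name and the client-side validation are hypothetical conveniences.

```python
ALLOWED_CATEGORIES = ("PII", "PHI", "Custom Entities")


def build_pii_payload(text, entity_categories="PII", custom_entities=None):
    """Return the JSON body for the sensitive-information guardrail."""
    # The spec accepts either one category string or a list of them.
    if isinstance(entity_categories, str):
        categories = [entity_categories]
    else:
        categories = list(entity_categories)
    unknown = [c for c in categories if c not in ALLOWED_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown entity categories: {unknown}")

    data = {"input": text, "entity_categories": entity_categories}
    if "Custom Entities" in categories:
        # Fail early instead of waiting for a 400 response.
        if not custom_entities:
            raise ValueError('custom_entities is required for "Custom Entities" mode')
        data["custom_entities"] = list(custom_entities)
    return {"data": data}
```

For instance, `build_pii_payload(text, ["PII", "Custom Entities"], ["employee id"])` requests standard PII detection plus one organization-specific pattern.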
Key Features
High Performance: 0.1 confidence threshold with top-1024 entity filtering
Comprehensive Coverage: Supports 35+ PII and 7 PHI entity types
Custom Entities: Define organization-specific sensitive patterns
Detailed Output: Returns entity text, type, confidence score, and character positions
Supported PII Entity Types (35+)
Personal Identifiers: person, date_of_birth
Contact Information: email, email_address, phone_number, mobile_phone_number, landline_phone_number, address, postal_code
Financial Data: credit_card_number, credit_card_expiration_date, cvv, cvc, bank_account_number, iban
Government IDs: social_security_number, passport_number, drivers_license_number, tax_identification_number, cpf, cnpj, national_health_insurance_number
Digital Identifiers: ip_address, digital_signature
Supported PHI Entity Types (7)
Medical Information: medication, medical_condition, medical_record_number
Insurance Data: health_insurance_number, health_plan_id
Healthcare Identifiers: birth_certificate_number, device_serial_number
How to Use Thresholding
The Fast PII model uses a default confidence threshold of 0.1, which works well for most applications. Entities with scores above this threshold are considered valid detections. You can adjust this threshold based on your specific requirements:
Lower thresholds (< 0.1): Catch more potential sensitive data but may include more false positives
Higher thresholds (> 0.1): Reduce false positives but might miss some valid sensitive information
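A common follow-up is to keep only the detections above your chosen threshold and mask them in the original text. The sketch below assumes each entity has the `score`, `label`, `start`, and `end` fields shown in the Fast PII OpenAPI spec, and that `end` is exclusive (the spec example, start 78 and end 89 for an 11-character value, is consistent with that). The function name `redact` and the `[LABEL]` placeholder format are illustrative choices, and overlapping entities are not handled.

```python
def redact(text, entities, threshold=0.1):
    """Replace each entity scoring above `threshold` with a [LABEL] placeholder."""
    kept = [e for e in entities if e["score"] > threshold]
    # Work from right to left so earlier character offsets stay valid.
    for entity in sorted(kept, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{entity['label'].upper()}]"
        text = text[:entity["start"]] + placeholder + text[entity["end"]:]
    return text


# Made-up detections for a short sample string.
sample_text = "SSN 123-45-6789 ok"
sample_entities = [
    {"score": 0.98, "label": "social_security_number", "start": 4, "end": 15},
    {"score": 0.05, "label": "person", "start": 0, "end": 3},
]
print(redact(sample_text, sample_entities))
```

Raising `threshold` leaves more borderline detections unmasked, and lowering it masks more aggressively, mirroring the tradeoff described above.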
Fast PII Model OpenAPI Spec
openapi: 3.0.3
info:
  title: Fiddler Fast PII (Sensitive Information Detection)
  version: 1.0.0
servers:
  - url: "https://{fiddler_endpoint}"
paths:
  /v3/guardrails/sensitive-information:
    post:
      summary: Detect sensitive information (PII, PHI, custom entities) in text
      operationId: detectSensitiveInformation
      security:
        - bearerAuth: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                data:
                  type: object
                  required:
                    - input
                  properties:
                    input:
                      type: string
                      description: The text to analyze for sensitive information
                      example: "John Doe's SSN is 123-45-6789 and email is [email protected]"
                    entity_categories:
                      oneOf:
                        - type: string
                          enum: ["PII", "PHI", "Custom Entities"]
                        - type: array
                          items:
                            type: string
                            enum: ["PII", "PHI", "Custom Entities"]
                      default: "PII"
                      description: Entity detection mode(s) to use
                      example: ["PII", "PHI"]
                    custom_entities:
                      type: array
                      items:
                        type: string
                      description: Custom entity patterns (required when using "Custom Entities" mode)
                      example: ["employee id", "api key", "project code"]
      responses:
        '200':
          description: Successfully detected sensitive information
          content:
            application/json:
              schema:
                type: object
                properties:
                  fdl_sensitive_information_scores:
                    type: array
                    description: Array of detected sensitive entities
                    items:
                      type: object
                      properties:
                        score:
                          type: number
                          format: float
                          description: Confidence score (0.0 to 1.0)
                          example: 0.987
                        label:
                          type: string
                          description: Entity type identifier
                          example: "social_security_number"
                        text:
                          type: string
                          description: The detected entity text
                          example: "123-45-6789"
                        start:
                          type: integer
                          description: Character position where entity starts
                          example: 78
                        end:
                          type: integer
                          description: Character position where entity ends
                          example: 89
        '400':
          description: Bad request (invalid input data or missing custom_entities when required)
        '401':
          description: Unauthorized (missing or invalid Bearer token)
        '413':
          description: Input exceeds 4096 token limit
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
Fiddler Trust Service Error Codes
400 (Invalid Input): Adjust the API input to follow the API specification above.
401 (Invalid Auth Token): The authentication token is invalid or expired. Double-check your token, and if it still fails, contact [email protected]
404 (Invalid guardrail endpoint called): The API called must be ftl-safety (Safety model), ftl-response-faithfulness (Faithfulness model), or sensitive-information (Fast PII model).
413 (Input token length exceeds API token size limits): The safety guardrail has a limit of 4096 tokens. The faithfulness guardrail has a limit of 3500 tokens for the context field and 350 tokens for the response field. The Fast PII guardrail has a limit of 4096 tokens. These limits are higher in our paid plans.
429 (Rate limits exceeded): The rate limits for the free guardrails experience are 2 requests per second, 70 requests per minute, and 200 requests per day. These limits are higher in our paid plans.
500/503/504 (Internal Server Error): We are experiencing internal service errors. Please watch #fiddler-guardrails-support on Slack or contact technical support.
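On the client, these codes fall into two groups: errors that need a changed request (400, 401, 404, 413) and errors that may succeed if retried later (429 and the 5xx codes). The sketch below encodes that split with exponential backoff. The function name, the base delay, and the cap are hypothetical values, not documented Fiddler guidance.

```python
RETRYABLE_STATUS_CODES = {429, 500, 503, 504}


def retry_delay(status_code, attempt, base_delay=0.5, max_delay=30.0):
    """Return seconds to wait before retrying, or None if retrying will not help.

    `attempt` is 0 for the first retry, 1 for the second, and so on.
    """
    if status_code not in RETRYABLE_STATUS_CODES:
        # 400, 401, 404, and 413 require fixing the request or token first.
        return None
    # Exponential backoff: 0.5s, 1s, 2s, ... capped at max_delay.
    return min(max_delay, base_delay * 2 ** attempt)
```

Note that backoff only helps with the per-second and per-minute limits. Once the daily limit is reached, a 429 will persist until the daily allowance resets, so it is worth capping the number of attempts.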