Guardrails for Proactive Application Protection

Fiddler Guardrails runs Fiddler Trust Models in a specialized low-latency, high-throughput configuration. Use Guardrails to protect Large Language Model (LLM) applications against user threats, such as prompt injection and harmful or inappropriate content, and against LLM hallucinations.

Currently, only Fiddler Trust Models (Faithfulness, Safety, and PII), Fiddler's in-house, purpose-built small language models (SLMs), are available for guardrail use. Future model releases and updates will also be available for guardrail use.

Getting Started with Fiddler Guardrails

Prerequisites

Access to a Fiddler environment
Valid Fiddler environment API key

Guardrails can be invoked directly via cURL or any HTTP client in your preferred language. Below are sample invocations of the Fast Safety, Fast Faithfulness, and Fast PII Guardrails.

Fast Safety Guardrails

The Fast Safety model evaluates the safety of a text along ten dimensions: illegal, hateful, harassing, racist, sexist, violent, sexual, harmful, unethical, and jailbreaking.

This model takes a single input string and returns ten scores (floats), one per dimension. We recommend a detection threshold of 0.1: treat any score greater than 0.1 as unsafe.

Fast Safety Guardrails Example Code

curl --location 'https://{fiddler_endpoint}/v3/guardrails/ftl-safety' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
    "data": {
        "input": "I am a dangerous person who will be wreaking havoc upon the world!!!"
    }
}'

import requests
import json

token = "YOUR_FIDDLER_TOKEN_HERE"
url = "FIDDLER_ENDPOINT_HERE"  # base URL including the scheme, e.g. https://{fiddler_endpoint}

payload = json.dumps(
    {
        "data": {
            "input": "I am a dangerous person who will be wreaking havoc upon the world!!!"
        }
    }
)
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {token}"}

response = requests.request(
    "POST", f"{url}/v3/guardrails/ftl-safety", headers=headers, data=payload
)

print(response.text)

Sample Response

{
  "fdl_harmful": 0.119,
  "fdl_violent": 0.073,
  "fdl_unethical": 0.043,
  "fdl_illegal": 0.016,
  "fdl_sexual": 0.005,
  "fdl_racist": 0.003,
  "fdl_jailbreaking": 0.002,
  "fdl_harassing": 0.001,
  "fdl_hateful": 0.001,
  "fdl_sexist": 0.001
}
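
The recommended threshold can be applied to a response like the one above with a few lines of client-side code. The following is a minimal sketch, not part of the Fiddler API: the helper name `flag_unsafe_dimensions` and the constant `SAFETY_THRESHOLD` are illustrative assumptions, and the scores are copied from the sample response.

```python
SAFETY_THRESHOLD = 0.1  # recommended detection threshold from the text above


def flag_unsafe_dimensions(scores, threshold=SAFETY_THRESHOLD):
    """Return the dimensions whose score exceeds the threshold, highest first.

    Hypothetical helper for illustration; it only post-processes the
    JSON body returned by the safety guardrail.
    """
    flagged = {name: value for name, value in scores.items() if value > threshold}
    return dict(sorted(flagged.items(), key=lambda item: item[1], reverse=True))


# Scores copied from the sample response above.
sample_scores = {
    "fdl_harmful": 0.119,
    "fdl_violent": 0.073,
    "fdl_unethical": 0.043,
    "fdl_illegal": 0.016,
    "fdl_sexual": 0.005,
    "fdl_racist": 0.003,
    "fdl_jailbreaking": 0.002,
    "fdl_harassing": 0.001,
    "fdl_hateful": 0.001,
    "fdl_sexist": 0.001,
}

flagged = flag_unsafe_dimensions(sample_scores)
print(flagged)  # {'fdl_harmful': 0.119}
```

With the sample scores, only `fdl_harmful` crosses 0.1, so an application following the recommendation would treat this input as unsafe.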

Fast Faithfulness Guardrails

The Fast Faithfulness model evaluates the accuracy and reliability of facts presented in AI-generated text responses.

This model takes two inputs: the response string to evaluate and the contextual documents to evaluate it against. It returns a single score (float). We recommend a detection threshold of 0.005: treat any score less than 0.005 as unfaithful.

Fast Faithfulness Guardrails Example Code

curl --location 'https://{fiddler_endpoint}/v3/guardrails/ftl-response-faithfulness' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
    "data": {
        "response": "The Yorkshire Terrier and the Cavalier King Charles Spaniel are both small breeds of companion dogs.",
        "context": "The Yorkshire Terrier is a small dog breed of terrier type, developed during the 19th century in Yorkshire, England, to catch rats in clothing mills. The Cavalier King Charles Spaniel is a small spaniel classed as a toy dog by The Kennel Club and the American Kennel Club"
    }
}'

import requests
import json

token = "YOUR_FIDDLER_TOKEN_HERE"
url = "FIDDLER_ENDPOINT_HERE"  # base URL including the scheme, e.g. https://{fiddler_endpoint}

payload = json.dumps(
    {
        "data": {
            "response": "The Yorkshire Terrier and the Cavalier King Charles Spaniel are both small breeds of companion dogs.",
            "context": "The Yorkshire Terrier is a small dog breed of terrier type, developed during the 19th century in Yorkshire, England, to catch rats in clothing mills. The Cavalier King Charles Spaniel is a small spaniel classed as a toy dog by The Kennel Club and the American Kennel Club",
        }
    }
)
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {token}"}

response = requests.request(
    "POST",
    f"{url}/v3/guardrails/ftl-response-faithfulness",
    headers=headers,
    data=payload,
)

print(response.text)

Sample Response

{
  "fdl_faithful_score": 0.045
}
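
Note that the faithfulness threshold runs in the opposite direction from the safety threshold: low scores are the ones to flag. The following is a minimal sketch of that check, assuming the response body has the shape shown above; `is_faithful` and `FAITHFULNESS_THRESHOLD` are hypothetical names chosen for illustration.

```python
FAITHFULNESS_THRESHOLD = 0.005  # recommended detection threshold from the text above


def is_faithful(result, threshold=FAITHFULNESS_THRESHOLD):
    """Return True unless the score falls below the threshold.

    Hypothetical helper for illustration; `result` is the parsed JSON
    body returned by the faithfulness guardrail.
    """
    return result["fdl_faithful_score"] >= threshold


# Score copied from the sample response above.
sample_result = {"fdl_faithful_score": 0.045}

print(is_faithful(sample_result))  # True
```

The sample score of 0.045 is above 0.005, so the response about the two dog breeds would pass the recommended check.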

Fast Personally Identifiable Information (PII) Guardrails

The Fast Personally Identifiable Information (PII) model detects, flags, and redacts PII leakage in both user inputs and model responses.

The following labels are supported by the PII model: person, address, email, email address, credit card number, credit card expiration date, cvv, cvc, bank account number, iban, social security number, date of birth, ip address, phone number, mobile phone number, landline phone number, passport number, drivers license number, tax identification number, cpf, cnpj, national health insurance number, digital signature, postal code.

This model accepts a single text string and returns all detected PII spans, their labels, confidence scores, and character offsets.

Fast Personally Identifiable Information (PII) Guardrails Example Code

curl --location 'https://{fiddler_endpoint}/v3/guardrails/sensitive-information' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
    "data": {
        "input": "Some of my colleagues share their contact info as well. Jane Smith's email is [email protected], and her office is located at 432 Oak Avenue, Suite 210, Chicago, IL 60611. You can call her mobile at 312-555-7890."
    }
}'

import requests
import json

token = "YOUR_FIDDLER_TOKEN_HERE"
url = "FIDDLER_ENDPOINT_HERE"  # base URL including the scheme, e.g. https://{fiddler_endpoint}

payload = json.dumps(
    {
        "data": {
            "input": "Some of my colleagues share their contact info as well. Jane Smith's email is [email protected], and her office is located at 432 Oak Avenue, Suite 210, Chicago, IL 60611. You can call her mobile at 312-555-7890."
        }
    }
)
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {token}"}

response = requests.request(
    "POST", f"{url}/v3/guardrails/sensitive-information", headers=headers, data=payload
)

print(response.text)

Sample Response

{
  "fdl_sensitive_information_scores": [
    {
      "score": 0.987,
      "label": "email",
      "start": 78,
      "end": 100,
      "text": "[email protected]"
    },
    {
      "score": 0.945,
      "label": "address",
      "start": 131,
      "end": 175,
      "text": "432 Oak Avenue, Suite 210, Chicago, IL 60611"
    },
    {
      "score": 0.987,
      "label": "mobile phone number",
      "start": 204,
      "end": 216,
      "text": "312-555-7890"
    }
  ]
}
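
The character offsets in the response make it straightforward to mask detected spans on the client side. The sketch below is one possible approach under stated assumptions, not Fiddler's own redaction logic: `redact_pii` and its `min_score` cutoff are hypothetical, the spans are assumed not to overlap, and `end` is treated as exclusive, which matches the sample response (216 - 204 = 12, the length of the phone number). The input text and spans are made up for illustration and follow the response shape shown above.

```python
def redact_pii(text, spans, min_score=0.5):
    """Replace each detected span with a [LABEL] placeholder.

    Hypothetical helper for illustration. Assumes non-overlapping spans
    with an inclusive `start` and an exclusive `end` offset.
    """
    redacted = text
    # Work from the end of the string so earlier offsets stay valid.
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        if span["score"] < min_score:
            continue  # skip low-confidence detections
        placeholder = "[" + span["label"].upper().replace(" ", "_") + "]"
        redacted = redacted[: span["start"]] + placeholder + redacted[span["end"] :]
    return redacted


# Illustrative input and spans, shaped like the entries in
# "fdl_sensitive_information_scores".
text = "Call Jane at 312-555-7890 today."
spans = [
    {"score": 0.91, "label": "person", "start": 5, "end": 9, "text": "Jane"},
    {
        "score": 0.98,
        "label": "mobile phone number",
        "start": 13,
        "end": 25,
        "text": "312-555-7890",
    },
]

print(redact_pii(text, spans))
# Call [PERSON] at [MOBILE_PHONE_NUMBER] today.
```

Replacing spans from right to left avoids recomputing offsets after each substitution, since a placeholder rarely has the same length as the text it replaces.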
