Guardrails (Private Preview)

Fiddler Guardrails uses Fiddler's Fast Trust Models in a specialized low-latency, high-throughput configuration. Guardrails can protect Large Language Model (LLM) applications against user threats, such as prompt injection or harmful and inappropriate content, and against LLM hallucinations.

Currently, only the Fast Trust Models (Fast Faithfulness and Fast Safety), Fiddler's in-house, purpose-built small language models (SLMs), are available for use in guardrails. Future model releases and model updates will also be available for use in guardrails.


Getting Started with Fiddler Guardrails

Prerequisites

  • Access to a Fiddler environment

  • Valid Fiddler environment API key

Guardrails can be invoked directly via cURL or any HTTP client in your preferred language. The sections below include sample invocations of the Fast Safety and Fast Faithfulness Guardrails.
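For readers who prefer a scripted HTTP client, the following is a minimal Python sketch that builds the same kind of request the cURL samples in this guide send, using only the standard library. The URL pattern, headers, and `{"data": ...}` body come from those samples. The function names `build_guardrail_request` and `send_guardrail_request` are hypothetical, and the sketch assumes the service replies with JSON, since the reply format is not described here.

```python
import json
import urllib.request


def build_guardrail_request(fiddler_endpoint, token, guardrail, data):
    """Build (but do not send) a POST request for a guardrail endpoint.

    `guardrail` is the last path segment used in the cURL samples, for
    example "ftl_prompt_safety" or "ftl_response_faithfulness".
    """
    url = f"https://{fiddler_endpoint}/v3/guardrails/{guardrail}"
    # The samples wrap the model inputs in a top-level "data" object.
    body = json.dumps({"data": data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send_guardrail_request(request, timeout=10.0):
    """Send the request and decode the reply.

    Assumption: the reply body is JSON. Its exact shape is not documented
    here, so the decoded object is returned unchanged.
    """
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any network call is made. To invoke a guardrail, pass the built request to `send_guardrail_request` with your own endpoint and API key.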


Fast Safety Guardrails

The Fast Safety model evaluates the safety of the text along ten different dimensions: illegal, hateful, harassing, racist, sexist, violent, sexual, harmful, unethical, jailbreaking.

This model takes a single string as input and outputs ten separate scores (floats), one per dimension. We recommend a detection threshold of 0.1: any score greater than 0.1 should be treated as unsafe.

curl --location 'https://{fiddler_endpoint}/v3/guardrails/ftl_prompt_safety' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
    "data": {
        "prompt": ["I am a dangerous person who will be wreaking havoc upon the world!!!"]
    }
}'
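Applying the recommended threshold to the ten scores can be sketched as follows. This is an illustration only: it assumes the scores have already been extracted from the reply into a plain mapping of dimension name to float, which is an assumption about how a caller might hold them rather than the documented response format. The helper names and the sample values are hypothetical.

```python
SAFETY_THRESHOLD = 0.1  # recommended: any score greater than 0.1 is unsafe


def flagged_dimensions(scores, threshold=SAFETY_THRESHOLD):
    """Return the names of dimensions whose score exceeds the threshold.

    Assumption: `scores` maps a dimension name (e.g. "violent") to a float.
    The comparison is strict, so a score equal to the threshold is not flagged.
    """
    return [name for name, score in scores.items() if score > threshold]


def is_unsafe(scores, threshold=SAFETY_THRESHOLD):
    """A prompt is treated as unsafe if any single dimension is flagged."""
    return len(flagged_dimensions(scores, threshold)) > 0


# Hypothetical scores for illustration; not real model output.
example_scores = {"illegal": 0.02, "violent": 0.45, "harmful": 0.1, "jailbreaking": 0.11}
blocked = is_unsafe(example_scores)
```

Returning the flagged dimension names, rather than only a boolean, lets an application log or report why a prompt was blocked.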

Fast Faithfulness Guardrails

The Fast Faithfulness model is designed to evaluate the accuracy and reliability of facts presented in AI-generated text responses.

This model takes a response string and the contextual documents against which the response is evaluated, and outputs a single score (float). We recommend a detection threshold of 0.005: any score less than 0.005 should be treated as unfaithful.

curl --location 'https://{fiddler_endpoint}/v3/guardrails/ftl_response_faithfulness' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
    "data": {
        "response": ["Pancakes are highly delicious and nutritious, with chocolate chips being the key driver behind both nutritiousness and deliciousness."],
        "context": ["Pancakes are very healthy, filled with lots of healthy grains. Chocolate chips are very delicious and healthy, especially in pancakes."]
    }
}'
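The faithfulness check runs in the opposite direction from the safety check: a low score, not a high one, signals a problem. A minimal sketch of that comparison is shown below. It assumes the single float score has already been read from the reply. The names `is_unfaithful` and `guard_response`, and the fallback message, are hypothetical.

```python
FAITHFULNESS_THRESHOLD = 0.005  # recommended: any score less than 0.005 is unfaithful


def is_unfaithful(score, threshold=FAITHFULNESS_THRESHOLD):
    """Return True when the score falls strictly below the threshold."""
    return score < threshold


def guard_response(response_text, score,
                   fallback="Sorry, I could not find a grounded answer."):
    """Pass the response through if it is faithful, else return a fallback.

    Assumption: `score` is the single float produced for `response_text`.
    """
    return fallback if is_unfaithful(score) else response_text
```

For example, `guard_response(answer, score)` returns `answer` unchanged when `score` is 0.005 or higher, and the fallback message otherwise.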
