For a complete reference of all LLM enrichments, see the LLM Observability Metrics Reference.
RAG Health Metrics (LLM-as-a-Judge Evaluators)
RAG Health Metrics are a purpose-built diagnostic triad for evaluating RAG applications. These evaluators use LLM-as-a-Judge approaches and are available in Agentic Monitoring and Experiments:
- Answer Relevance 2.0 — Ordinal scoring (High/Medium/Low = 1.0/0.5/0.0) measuring how well the response addresses the query. Also available in LLM Observability.
- Context Relevance — Ordinal scoring measuring whether retrieved documents support the query. Available in Agentic Monitoring and Experiments only.
- RAG Faithfulness — Binary scoring (Yes/No = 1/0) assessing whether the response is grounded in retrieved documents. Also available in LLM Observability.
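To illustrate how the judge labels above translate into numeric scores, here is a minimal Python sketch. It is not Fiddler's implementation: the function and variable names are hypothetical, and it assumes Context Relevance uses the same High/Medium/Low scale as Answer Relevance 2.0, which the list above does not state explicitly.

```python
# Illustrative only: these names are hypothetical and not part of any Fiddler API.
# Label-to-score tables follow the scoring described in the list above.
ORDINAL_SCORES = {"High": 1.0, "Medium": 0.5, "Low": 0.0}
BINARY_SCORES = {"Yes": 1, "No": 0}


def ordinal_score(label: str) -> float:
    """Map an ordinal judge label (High/Medium/Low) to 1.0/0.5/0.0."""
    return ORDINAL_SCORES[label.strip().capitalize()]


def binary_score(label: str) -> int:
    """Map a binary judge label (Yes/No) to 1/0."""
    return BINARY_SCORES[label.strip().capitalize()]


def rag_health_scores(answer_relevance: str,
                      context_relevance: str,
                      faithfulness: str) -> dict:
    """Collect the three RAG Health Metrics for one query/response pair.

    Assumption: Context Relevance shares the High/Medium/Low scale.
    """
    return {
        "answer_relevance": ordinal_score(answer_relevance),
        "context_relevance": ordinal_score(context_relevance),
        "rag_faithfulness": binary_score(faithfulness),
    }


if __name__ == "__main__":
    # Example: a relevant answer built on partially relevant context,
    # but not grounded in the retrieved documents.
    print(rag_health_scores("High", "Medium", "No"))
```

Reading the three scores together is what makes the triad diagnostic: in the example, a high Answer Relevance with a RAG Faithfulness of 0 points to a response that addresses the query but is not grounded in the retrieved documents.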
OpenAI-based metrics
- These metrics are generated through the OpenAI API, which may introduce latency due to network communication and processing time.
- The user must provide an OpenAI API access token, which is configured during onboarding.
- The specific model used for these metrics is also chosen during onboarding.
Fiddler Centor Model metrics
- These metrics are generated through Fiddler’s in-house, purpose-built SLMs.
- These metrics can be generated in air-gapped environments and do not require any network connection to produce scores.
- Centor Safety — Evaluates safety across 11 dimensions including jailbreaking, toxicity, and harmful content.
- Centor Faithfulness — Proprietary Fiddler Centor Model for hallucination detection. Not to be confused with RAG Faithfulness above.