LLM-Based Metrics

LLM-based metrics use large language models to evaluate the quality of AI-generated text. Rather than relying on surface-level checks, they take the context and nuances of language into account to assess how relevant, coherent, or creative a piece of text is. This is closer to how humans judge text, which makes these metrics useful for improving AI-generated content in chatbots, writing assistants, and content creation tools. They can also help detect hallucinations.
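To make the idea concrete, here is a minimal sketch of an LLM-as-judge relevance check. It is an illustration only, not Fiddler's implementation: the prompt template, the function names, and the `judge` callable are all hypothetical, and the stand-in `fake_judge` replaces a real model call so the example runs offline.

```python
from typing import Callable

# Hypothetical grading prompt; a production template would be more detailed.
JUDGE_TEMPLATE = (
    "You are grading a response for relevance.\n"
    "Question: {question}\n"
    "Response: {response}\n"
    "Does the response address the question? "
    "Answer with exactly one word: yes or no."
)


def build_judge_prompt(question: str, response: str) -> str:
    """Fill the grading template with the text under evaluation."""
    return JUDGE_TEMPLATE.format(question=question, response=response)


def parse_verdict(raw: str) -> bool:
    """Map the judge's reply to a boolean; anything other than 'yes' is False."""
    return raw.strip().lower().rstrip(".") == "yes"


def answer_relevance(
    question: str, response: str, judge: Callable[[str], str]
) -> bool:
    """Score one question/response pair with the supplied judge model."""
    return parse_verdict(judge(build_judge_prompt(question, response)))


def fake_judge(prompt: str) -> str:
    """Stand-in for an LLM call (assumption: a real judge would query a model)."""
    return "Yes." if "Paris" in prompt else "no"


print(answer_relevance("What is the capital of France?", "Paris.", fake_judge))
```

Passing the judge in as a callable keeps the scoring logic independent of any particular model or API client, so the same function can be exercised with a stub in tests and with a real model in production.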

One of the main strengths of LLM-based metrics is their flexibility. Because the underlying models are trained on a wide range of information, the metrics can adapt to different topics and types of text. This makes them valuable for developers and researchers who want AI-generated text to meet high standards of clarity, relevance, and engagement. Choosing the right model is crucial, however, because it can significantly affect how useful the metrics' feedback is.

Fiddler comes with LLM-based enrichments such as Answer Relevance, Faithfulness, Coherence, and Conciseness. This list will keep expanding.

Requirements:

  • These enrichments require access to the OpenAI API, which may introduce latency due to network communication and processing time.

  • The user must provide an OpenAI API access token.
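Because the token is supplied by the user, it can help to fail early when it is missing. The sketch below assumes the token is stored in an environment variable named `OPENAI_API_KEY` (the name the OpenAI SDK reads by default); how the token is actually passed to Fiddler is not shown here, and `require_openai_token` is a hypothetical helper.

```python
import os
from typing import Mapping


def require_openai_token(
    env: Mapping[str, str] = os.environ, var: str = "OPENAI_API_KEY"
) -> str:
    """Return the API token from `env`, or raise if it is missing or blank."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(
            f"{var} is not set; an OpenAI API access token is required."
        )
    return token


# Example with an explicit mapping instead of the real environment.
print(require_openai_token({"OPENAI_API_KEY": "sk-example"}))
```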

Model                  Context Window (tokens)
gpt-3.5-turbo          16,385
gpt-4                  8,192
gpt-4-turbo-preview    128,000
gpt-4-0613             8,192
gpt-4-32k              32,768
gpt-4-32k-0613         32,768

Reference: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
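The context-window limits listed above can be used to check whether an input will fit a given model before it is sent. The following is a minimal sketch: the function names are hypothetical, the token count is assumed to be computed elsewhere (for example with a tokenizer library), and `reserved_output_tokens` is an assumed allowance for the model's reply.

```python
from typing import List

# Limits copied from the table above (tokens).
CONTEXT_WINDOW_TOKENS = {
    "gpt-3.5-turbo": 16_385,
    "gpt-4": 8_192,
    "gpt-4-turbo-preview": 128_000,
    "gpt-4-0613": 8_192,
    "gpt-4-32k": 32_768,
    "gpt-4-32k-0613": 32_768,
}


def fits_context_window(
    model: str, prompt_tokens: int, reserved_output_tokens: int = 0
) -> bool:
    """True if the prompt plus the reserved reply budget fits the model's window."""
    limit = CONTEXT_WINDOW_TOKENS[model]  # raises KeyError for unknown models
    return prompt_tokens + reserved_output_tokens <= limit


def models_that_fit(
    prompt_tokens: int, reserved_output_tokens: int = 0
) -> List[str]:
    """List the models from the table whose window can hold the request."""
    return [
        model
        for model in CONTEXT_WINDOW_TOKENS
        if fits_context_window(model, prompt_tokens, reserved_output_tokens)
    ]


print(models_that_fit(10_000, reserved_output_tokens=500))
# → ['gpt-3.5-turbo', 'gpt-4-turbo-preview', 'gpt-4-32k', 'gpt-4-32k-0613']
```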

© 2024 Fiddler AI