Product Concepts

What is ML Observability?

ML observability is the modern practice of gaining comprehensive insights into your AI application's performance throughout its lifecycle. It goes beyond simple indicators of good and bad performance by empowering all model stakeholders to understand why a model behaves in a certain manner and how to improve it. ML observability starts with monitoring and alerting on performance issues, but goes much deeper, guiding model owners toward the underlying root cause of those issues.

What is LLM Observability?

LLM observability is the practice of evaluating, monitoring, analyzing, and improving Generative AI or LLM-based applications across their lifecycle. Fiddler provides real-time monitoring on safety metrics such as toxicity, bias, and PII, and correctness metrics such as hallucinations, faithfulness, and relevancy.


Projects

A project within Fiddler represents a distinct AI application or use case within your organization. It houses all of the model schemas that have been onboarded to Fiddler for the purpose of AI observability. Projects are typically specific to a given ML application or use case and serve as a jumping-off point for Fiddler's model monitoring and explainability features.

Additionally, Fiddler projects serve as the main organizational structure within the platform and are the entity at which authorization is controlled. Different users and teams are granted access to view information within Fiddler at the project level.

Model Schemas

In Fiddler, a model schema is the metadata about a model that is being observed. Model schemas are onboarded to Fiddler so that it understands the data it is observing. Fiddler does not require the model artifact itself to properly observe the performance of the model (however, model artifacts can be uploaded to Fiddler to unlock advanced explainability features). Instead, Fiddler needs only adequate information about the model's schema, or specification, in order to monitor the model.

📘 Working with Model Artifacts

You can upload your model artifacts to Fiddler to unlock high-fidelity explainability for your model. However, it is not required. If you do not wish to upload your artifact but want to explore explainability with Fiddler, we can build a surrogate model on the backend to be used in place of your artifact.

Model Versions

Fiddler offers model versions to organize related models, streamlining processes such as model retraining or conducting champion vs. challenger analyses. When retraining a model, rather than uploading a new model instance, a new version of the existing model can be created, retaining its core structure while accommodating necessary updates. These updates can include modifications to the schema, such as adding or removing columns, modifying data types, adjusting value ranges, updating the model specifications, and even refining task parameters or Explainable AI (XAI) settings.


Environments

Within Fiddler, each model has two environments: pre-production and production. Fiddler uses environments to assign purpose to the data published to it, distinguishing between:

  1. Non-time series data (pre-production datasets, e.g. training data)

  2. Time-series data (production data, e.g. inference logs)

Pre-Production Environment

The pre-production environment contains non-time-series sets of data, called datasets. Datasets are used primarily for point-in-time analysis or as static baselines for comparison against production data.

Production Environment

The production environment contains time series data, such as production or inference logs, which are the "digital exhaust" coming off of each decision a model makes. This time series data captures the inputs and outputs of each model inference/decision and is what Fiddler analyzes and compares against the pre-production data to determine whether the model's performance is degrading.


Datasets

Datasets within Fiddler are static sets of data uploaded to Fiddler for the purpose of establishing "baselines". A common dataset uploaded to Fiddler is the model's training data.


Baselines

Baselines are derived from datasets and tell Fiddler what data distributions the model is expected to encounter. A baseline in Fiddler is a set of reference data against which a model's performance is compared for monitoring purposes. The default baseline for all monitoring metrics is typically the model's training data. Additional baselines can be added to existing models, derived from other datasets, from historical inferences, or as rolling baselines that reference historical inference data over a moving window.
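
As a concrete illustration of how a production sample can be compared against a baseline, the sketch below computes a Population Stability Index (PSI) between two numeric samples. This is a generic drift calculation, not Fiddler's implementation; all names and thresholds here are illustrative.

```python
import math

def psi(baseline, production, n_bins=10):
    """Population Stability Index between two numeric samples.

    Bins are derived from the baseline's range; a small epsilon keeps
    empty bins from producing log(0). Illustrative only, not Fiddler's API.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0

    def frac(sample):
        counts = [0] * n_bins
        for x in sample:
            # clamp out-of-range values into the edge bins
            idx = min(max(int((x - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        return [max(c / len(sample), 1e-6) for c in counts]

    b, p = frac(baseline), frac(production)
    return sum((pi - bi) * math.log(pi / bi) for bi, pi in zip(b, p))

baseline = [0.1 * i for i in range(100)]       # stand-in for training data
shifted = [0.1 * i + 3.0 for i in range(100)]  # drifted production sample
print(psi(baseline, baseline))      # 0.0 -- identical distributions
print(psi(baseline, shifted) > 0.2) # True -- large PSI signals drift
```

A common rule of thumb treats PSI above roughly 0.2 as meaningful drift, which is the kind of condition an alert rule can be attached to.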


Metrics

Metrics are computations performed on data received by the platform. Fiddler supports user-defined metrics (Custom Metrics) in addition to five out-of-the-box metric types:

  • Performance

  • Data Drift

  • Data Integrity

  • Statistic

  • Traffic
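
Conceptually, a custom metric is just a user-defined aggregation evaluated over inference rows in a time bin, alongside the built-in metric types. The sketch below is a plain-Python illustration of that idea; the function and field names are hypothetical, not Fiddler's custom-metric syntax.

```python
# Conceptual sketch of a custom metric: a user-defined aggregation over
# inference rows. Field names like "prediction" are hypothetical.
def approval_rate(rows):
    """Fraction of inferences the model approved -- a made-up business metric."""
    approved = sum(1 for r in rows if r["prediction"] == "approve")
    return approved / len(rows)

inferences = [
    {"prediction": "approve", "amount": 1200},
    {"prediction": "deny",    "amount": 8300},
    {"prediction": "approve", "amount": 450},
]
print(approval_rate(inferences))  # 2 of 3 rows approved
```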


Alerts

Alerts are user-specified rules that trigger when some condition is met by data received on the Fiddler platform. For any alert rule, you can customize how you'd like to be notified: via email, Slack, or webhooks.
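
The shape of an alert rule, reduced to its essentials, is a metric value paired with a condition; when the condition is met, a notification is routed to the configured channel. This is a conceptual sketch, not Fiddler's alerting API.

```python
# Illustrative alert-rule evaluation (not Fiddler's API): fire when a
# metric for a time bin crosses a user-chosen threshold.
def evaluate_alert(metric_value, threshold, comparison="greater"):
    """Return True when the alert condition is met."""
    if comparison == "greater":
        return metric_value > threshold
    return metric_value < threshold

# e.g. alert when a drift metric for a time bin exceeds 0.2
if evaluate_alert(metric_value=0.31, threshold=0.2):
    # in a real system this would route to email, Slack, or a webhook
    print("alert: drift metric 0.31 exceeded threshold 0.2")
```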


Segments

Segments are "row filters" specified on your data. They allow you to compute metrics on only a subset of the data population (e.g. "People under 50"). You can also set alerts on segments.
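
In effect, a segment restricts which rows a metric is computed over. The toy example below shows the "People under 50" filter from the text applied to a handful of hypothetical inference rows; the field names are illustrative.

```python
# A segment is a row filter; metrics are then computed on the subset.
inferences = [
    {"age": 34, "correct": True},
    {"age": 62, "correct": False},
    {"age": 45, "correct": True},
    {"age": 71, "correct": True},
]

# the "People under 50" segment from the text
under_50 = [r for r in inferences if r["age"] < 50]
accuracy = sum(r["correct"] for r in under_50) / len(under_50)
print(len(under_50), accuracy)  # 2 rows in the segment, both correct
```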




Charts

Fiddler offers two distinct types of chart visualizations: monitoring charts and embedding visualizations.

Monitoring Charts

Monitoring charts provide a comprehensive view of a single model's performance or compare performance across models. With intuitive displays for data drift, performance metrics, data integrity, traffic patterns, and more, monitoring charts empower users to maintain model accuracy and reliability with ease.

Embedding Visualizations

Embedding visualizations enhance the interpretability of monitoring charts, providing intuitive representations of complex relationships within high-dimensional data. Condensing custom features into 2D or 3D spaces enables users to efficiently identify patterns, clusters, and outliers, facilitating deeper insights into model behavior and performance.
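
To make the idea of condensing high-dimensional data into a 2D space concrete, the sketch below projects synthetic 64-dimensional vectors to two dimensions with PCA (via SVD). Fiddler's embedding charts use techniques such as UMAP; PCA is used here only as a simpler, dependency-light stand-in, and the data is fabricated.

```python
import numpy as np

# Two synthetic clusters of 64-dimensional vectors, standing in for
# high-dimensional embeddings (e.g. text or image features).
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.1, size=(50, 64))
cluster_b = rng.normal(loc=1.0, scale=0.1, size=(50, 64))
X = np.vstack([cluster_a, cluster_b])

# PCA via SVD: center the data, then project onto the top 2 components.
centered = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords_2d = centered @ vt[:2].T  # each row is now a plottable (x, y) point

print(coords_2d.shape)  # (100, 2): ready for a 2D scatter plot
```

In the 2D projection the two clusters separate cleanly along the first component, which is exactly the kind of structure (clusters, outliers) an embedding visualization surfaces.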


Dashboards

Comprehensive dashboards consolidate monitoring charts and embedding visualizations, including charts for data drift, data integrity, UMAP, and more. Integrating these charts provides a detailed overview of model performance, empowering teams, management, and stakeholders to make informed, data-driven decisions to enhance AI performance.





See the UI Guide to visualize project architecture in the user interface.



© 2024 Fiddler AI