Integrate Fiddler with Databricks for Model Monitoring and Explainability
Fiddler allows your team to monitor, explain, and analyze models developed and deployed in Databricks by integrating with MLflow for model asset management and using the Databricks Spark environment for data management.
To validate and monitor models built on Databricks using Fiddler, you can follow these steps:
Create a Fiddler project
Create a Fiddler model using sample data or model information from MLflow
Publish production data live or in batches
This guide assumes you have:
A Databricks account and valid credentials
A Fiddler environment with an account and valid credentials
Familiarity with the Fiddler Python client
Launch a notebook from your Databricks workspace and run the following code:
Now that you have the Fiddler library installed, you can connect to your Fiddler environment. You will need your authentication token, which is available in Application Settings.
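A minimal sketch of the connection step, assuming the `fiddler-client` package is installed and exposes the 3.x-style `fdl.init(url=..., token=...)` entry point. The environment variable names `FIDDLER_URL` and `FIDDLER_TOKEN` and both helper functions are illustrative and not part of Fiddler.

```python
import os


def load_fiddler_credentials(env=None):
    """Return (url, token) read from environment-style variables.

    FIDDLER_URL and FIDDLER_TOKEN are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    url = env.get("FIDDLER_URL", "").strip().rstrip("/")
    token = env.get("FIDDLER_TOKEN", "").strip()
    if not url.startswith("https://"):
        raise ValueError("FIDDLER_URL must be an https:// URL")
    if not token:
        raise ValueError("FIDDLER_TOKEN is not set")
    return url, token


def connect_to_fiddler(env=None):
    """Open a client session; assumes the 3.x client's fdl.init(url=..., token=...)."""
    import fiddler as fdl  # imported lazily so the helper above works without the client

    url, token = load_fiddler_credentials(env)
    fdl.init(url=url, token=token)
    return fdl
```

Keeping the token out of the notebook source avoids committing it by accident. On Databricks, a secret scope read through `dbutils.secrets.get` is a common alternative to environment variables.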
Quickest Option: Let Fiddler Automate Model Creation
Option: Using the MLflow Model Registry
Now you can publish all the events from your models. You can do this in two ways:
Batch Models
If your models run as batch processes, or you aggregate model outputs over a time frame, you can use the table change data feed from Databricks to select only the new events and send them to Fiddler:
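The batch flow could look roughly like the sketch below. It assumes the source is a Delta table with the change data feed enabled (table property `delta.enableChangeDataFeed = true`), that you track the last table version already published, and that `model` is a Fiddler 3.x client model object whose `publish(source=...)` accepts a pandas DataFrame. The function names are illustrative.

```python
def change_feed_options(last_published_version):
    """Reader options selecting commits made after the last published table version."""
    return {
        "readChangeFeed": "true",
        "startingVersion": str(last_published_version + 1),
    }


def read_new_events(spark, table_name, last_published_version):
    """Return rows inserted or updated since last_published_version as a pandas DataFrame."""
    reader = spark.read.format("delta")
    for key, value in change_feed_options(last_published_version).items():
        reader = reader.option(key, value)
    changes = reader.table(table_name)
    # Keep new rows and the post-update image of changed rows; skip deletes and pre-images.
    new_rows = changes.filter("_change_type IN ('insert', 'update_postimage')")
    # Drop the change-feed metadata columns before publishing.
    return new_rows.drop("_change_type", "_commit_version", "_commit_timestamp").toPandas()


def publish_batch(model, events_df):
    """Send a batch to Fiddler; assumes a 3.x client model with publish(source=...)."""
    return model.publish(source=events_df)
```

Reading from a version newer than the table's latest commit may raise an error, so it is worth comparing the stored version against the table history before calling `read_new_events`.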
Live Models
For models with live predictions or real-time applications, you can add the following code snippet to your prediction pipeline and send every event to Fiddler in real-time:
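As a hedged illustration of the live path, the sketch below builds one event per prediction and streams it. It assumes a Fiddler 3.x client model object whose `publish(source=...)` accepts a list of dictionaries. The keys `event_id`, `timestamp`, and `prediction` are placeholders for the column names in your own model schema, and the timestamp format your model expects may differ.

```python
from datetime import datetime, timezone


def build_event(event_id, features, prediction, timestamp=None):
    """Combine the inputs and output of one prediction into a flat event dict.

    The keys "event_id", "timestamp", and "prediction" are placeholders; use
    the column names defined in your own model schema.
    """
    ts = timestamp or datetime.now(timezone.utc)
    event = dict(features)  # copy so the caller's feature dict is not mutated
    event["event_id"] = event_id
    event["timestamp"] = ts.isoformat()
    event["prediction"] = prediction
    return event


def publish_event(model, event):
    """Stream one event; assumes the 3.x client's publish accepts a list of dicts."""
    return model.publish(source=[event])
```

In a serving pipeline, `build_event` would be called right after the model returns its prediction, so the logged inputs match exactly what the model saw.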
Finally, you can set up a new using:
The quickest way to onboard a Fiddler model is to provide a sample of data from which Fiddler can infer the model schema and metadata. Ideally, this is baseline, testing, or training data that is representative of what your model sees. You can download baseline or training data from Databricks and share it with Fiddler as a baseline dataset:
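One way to pull such a sample into pandas is sketched below, assuming an active `spark` session (as in a Databricks notebook) and a table that holds the baseline or training data. The helper names, the default of 10,000 rows, and the seed are illustrative choices.

```python
def sampling_fraction(total_rows, target_rows):
    """Fraction of the table to sample so roughly target_rows rows are returned."""
    if total_rows <= 0 or target_rows <= 0:
        raise ValueError("total_rows and target_rows must be positive")
    return min(1.0, target_rows / total_rows)


def load_baseline_sample(spark, table_name, target_rows=10_000, seed=42):
    """Pull a random sample of a table into pandas for use as a baseline dataset."""
    table = spark.read.table(table_name)
    fraction = sampling_fraction(table.count(), target_rows)
    # sample() is approximate, so cap the result with limit() before collecting.
    return table.sample(fraction=fraction, seed=seed).limit(target_rows).toPandas()
```

Sampling before calling `toPandas()` keeps the collected data small enough to fit comfortably in driver memory.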
Now that you have sample data, you can create a Fiddler model as documented and demonstrated in our documentation. A rough outline of the steps follows:
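The outline might be sketched as follows, assuming the Fiddler 3.x Python client (`fdl.Project`, `fdl.ModelSpec`, `fdl.Model.from_data`) and a session already opened with `fdl.init(...)`. The helper `build_spec_columns` and all argument names are hypothetical, and other client versions expose a different API.

```python
def build_spec_columns(columns, target, outputs, metadata=()):
    """Partition sample-data column names into the roles a model spec needs."""
    reserved = {target, *outputs, *metadata}
    missing = reserved - set(columns)
    if missing:
        raise ValueError(f"columns not found in sample data: {sorted(missing)}")
    return {
        "inputs": [c for c in columns if c not in reserved],
        "outputs": list(outputs),
        "targets": [target],
        "metadata": list(metadata),
    }


def onboard_model(sample_df, project_name, model_name, target, outputs, task):
    """Create a project and a model whose schema is inferred from sample_df.

    task is a fdl.ModelTask member chosen by the caller; some tasks may also
    need task parameters that this sketch does not set.
    """
    import fiddler as fdl  # lazy import; assumes fdl.init(...) was already called

    project = fdl.Project(name=project_name)
    project.create()
    roles = build_spec_columns(list(sample_df.columns), target, outputs)
    model = fdl.Model.from_data(
        source=sample_df,
        name=model_name,
        project_id=project.id,
        spec=fdl.ModelSpec(**roles),
        task=task,
    )
    model.create()
    return model
```

Every column that is not a target, output, or metadata column is treated as a model input here, which suits a sample table that contains only model-related columns.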
Another option is to manually construct your model's schema from the details contained in the MLflow registry. You can query the model registry and get the model signature, which describes the inputs and outputs as a dictionary. You can use this dictionary to build out the objects that define the tabular schema of your model.
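A sketch of reading the signature, assuming a recent MLflow release in which `mlflow.models.get_model_info(...)` returns an object with a `signature` attribute and `ModelSignature.to_dict()` stores the input and output schemas as JSON strings. The function names are illustrative, and mapping the resulting column names onto Fiddler schema objects is left to the client version you use.

```python
import json


def signature_columns(signature_dict):
    """Extract input and output column names from an MLflow signature dictionary.

    "inputs" and "outputs" are expected as JSON strings, each a list of
    column specs such as {"name": "age", "type": "long"}.
    """
    def names(field):
        raw = signature_dict.get(field)
        specs = json.loads(raw) if isinstance(raw, str) else (raw or [])
        # Unnamed columns (for example a single tensor) have no "name" key and are skipped.
        return [spec["name"] for spec in specs if "name" in spec]

    return {"inputs": names("inputs"), "outputs": names("outputs")}


def registry_signature(model_uri):
    """Fetch the signature of a registered model, e.g. "models:/<name>/<version>"."""
    import mlflow  # lazy import so signature_columns works without MLflow installed

    signature = mlflow.models.get_model_info(model_uri).signature
    if signature is None:
        raise ValueError(f"model at {model_uri} was logged without a signature")
    return signature.to_dict()
```

Models logged without a signature return nothing useful here, in which case the sample-data route is the more practical option.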
Refer to the example in GitHub that demonstrates manually defining your Fiddler model's schema.