Fiddler’s industry-first model analytics, called Slice and Explain, allows you to perform an exploratory or targeted analysis of model behavior:
- Slice: Identify a selection, or slice, of data. Or, you can start with the entire dataset for global analysis.
- Explain: Analyze model behavior on that slice using Fiddler’s visual Explanations and other data insights.
Slice and Explain is designed to help data scientists, model validators, or analysts drill down into a model and dataset and see global, local, and instance-level explanations for the model’s predictions.
Slice and Explain can help you answer questions like:
- What are the key drivers of my model output in a subsection of the data?
- How are the model inputs correlated to other inputs and to the output?
- Where is my model underperforming?
- How is my model performing across the classes in a protected group?
Access Slice and Explain from the Analyze tab for your model. The current version of Slice and Explain supports tabular models only; NLP support is coming soon.
The Analyze tab has three parts:
- Slice Query box: Accepts a SQL query as input to quickly access the slice.
- Data table: Lets you browse instances of data returned from the query.
- Explanations column: On the right-hand side, you can view explanations for the slice and choose from a range of visualizations for different data insights.
A typical workflow has three steps:
- Write a SQL query in the Slice Query box and click Run.
- View the data returned by the query in the Data table.
- Use the right-hand column to explore Explanations using visualizations.
The SQL query box lets you:
- Write a SQL query
- Search and autocomplete a schema (i.e. your dataset, the names of your inputs or outputs)
- Run a SQL query
In the UI, you will see examples for different types of queries:
- Example query to analyze your dataset:
select * from "your_dataset_id" limit 100
- Example query to analyze your model:
select * from "your_dataset_id.your_model_id" limit 100
- Example query to analyze production traffic:
select * from production."your_model_id" where fiddler_timestamp between '2020-10-20 00:00:00' and '2020-10-20 12:00:00' limit 100
Only read-only SQL operations are supported. Slices are auto-detected based on your model, dataset, and query. Certain SQL operations like aggregations and joins might not result in a slice.
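The production-traffic query above follows a fixed pattern, so it can be assembled programmatically. The sketch below is only an illustration: `production_slice_query` is a hypothetical helper, not part of any Fiddler API. It reuses the `production."your_model_id"` table form and the `fiddler_timestamp` column from the example query.

```python
from datetime import datetime


def production_slice_query(model_id: str, start: datetime, end: datetime, limit: int = 100) -> str:
    """Build a read-only SELECT over production traffic for a time window.

    Hypothetical helper: it only formats the example query shown in the UI.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if end < start:
        raise ValueError("end must not precede start")
    if '"' in model_id:
        # Guard against breaking out of the quoted identifier.
        raise ValueError("model_id must not contain double quotes")
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        f'select * from production."{model_id}" '
        f"where fiddler_timestamp between '{start.strftime(fmt)}' "
        f"and '{end.strftime(fmt)}' limit {limit}"
    )


query = production_slice_query(
    "your_model_id",
    datetime(2020, 10, 20, 0, 0, 0),
    datetime(2020, 10, 20, 12, 0, 0),
)
print(query)
```

The resulting string can be pasted into the Slice Query box. Because the query is a plain `select` with no aggregation or join, it stays within the read-only operations that are expected to yield a slice.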
If the query successfully returns a slice, the results display in the Data table below the Slice Query box.
You can view all data rows and their values or download the data as a CSV file to plug it into another system. By clicking on Explain or Fiddle in any row in the table, you can access explanations for that individual input (more on this in the next section).
The Analyze tab offers a variety of powerful visualizations to quickly let you analyze and explain slices of your dataset.
- Dataset Details - Analyze statistical qualities of the dataset.
- Feature Impact - Understand the aggregate impact of model inputs to the output.
- Slice Evaluation - View the model metrics for a given slice.
- Feature Correlation - View the correlation between model inputs and/or outputs.
- Feature Distribution - Visualize the distribution of an input or output.
- Partial Dependence Plot - Understand the aggregate impact of a single model input on the output.
- Instance Explanation - Understand the impact of the model input on one model output instance. Accessed by clicking on Explain for a row in the Data table.
- Fiddle (or What-if) Plot - Predict how changes in the model’s input values might impact the model’s output. Accessed by clicking on Fiddle for a row in the Data table.
This visualization provides statistical details of the dataset to help you understand the data’s distribution and correlations.
Select a target to see the dependence between that input and the model output, measured by mutual information (MI). A low MI indicates weak dependence between that input and the model output, which can help you decide whether to drop the input from the model.
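To make the MI measure concrete, here is a minimal sketch of empirical mutual information for two discrete columns. It assumes a simple plug-in estimate from counts; the estimator Fiddler actually uses is not specified here, and continuous inputs would first need to be binned.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) for two discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written in terms of counts.
        mi += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data: one input mirrors the output, the other is unrelated to it.
output = [0, 0, 1, 1]
informative_input = [0, 0, 1, 1]
unrelated_input = [0, 1, 0, 1]

mi_informative = mutual_information(informative_input, output)
mi_unrelated = mutual_information(unrelated_input, output)
print(mi_informative, mi_unrelated)
```

In this toy data the informative input reaches the maximum MI for a balanced binary output (ln 2), while the unrelated input scores zero, which is the kind of input the text suggests considering for removal.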
The Feature Impact of the dataset (global explanation) or slice (local explanation) gives the overall sensitivity of the model output to each feature (more on this in the Global Explainability section). We calculate Feature Impact by randomly intervening on every input using ablations and noting the average absolute change in the prediction.
A high impact suggests that the model’s behavior on this slice is more sensitive to changes to the feature. Feature Impact only provides the absolute impact of the input and not its directionality. Since positive and negative directionality can cancel out, we recommend using a Partial Dependence Plot to understand how an input impacts the output in aggregate.
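The ablation procedure described above can be sketched as follows. This is an illustration under stated assumptions, not Fiddler's implementation: each input is replaced with the value from a randomly chosen row of the same slice, and `predict`, `rows`, and the toy model are hypothetical.

```python
import random


def feature_impact(predict, rows, n_features, seed=0):
    """Average absolute change in prediction when each feature is randomly ablated.

    Ablation here means swapping in the feature value of a randomly drawn row
    from the same slice (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    impacts = []
    for j in range(n_features):
        total = 0.0
        for row in rows:
            donor = rng.choice(rows)
            ablated = list(row)
            ablated[j] = donor[j]
            total += abs(predict(ablated) - predict(row))
        impacts.append(total / len(rows))
    return impacts


def predict(row):
    # Toy model: feature 0 pushes the output up, feature 1 pushes it down,
    # and feature 2 is ignored entirely.
    x0, x1, _x2 = row
    return 2.0 * x0 - 2.0 * x1


rows = [
    [0.0, 5.0, 1.0],
    [1.0, 4.0, 0.0],
    [2.0, 3.0, 1.0],
    [3.0, 2.0, 0.0],
]
impacts = feature_impact(predict, rows, n_features=3)
print(impacts)
```

Because the absolute change is averaged, features 0 and 1 both register as impactful even though they act in opposite directions, and the ignored feature scores exactly zero. The sign of each effect is not visible in these numbers, which is why the text recommends a Partial Dependence Plot for directionality.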
The slice evaluation visualization gives you key model performance metrics and plots, which can be helpful to identify performance issues or model bias on protected classes. In addition to key metrics, you get a confusion matrix along with precision-recall, ROC, and calibration plots. This visualization supports classification, regression, and multi-class models.
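As a rough illustration of comparing performance across the classes of a protected group, the sketch below computes confusion-matrix counts, precision, and recall per slice for a binary classifier. The record fields (`group`, `label`, `pred`) and the data are invented for the example.

```python
def confusion_counts(labels, preds):
    """Return binary confusion-matrix counts as a dict."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for label, pred in zip(labels, preds):
        if label == 1 and pred == 1:
            counts["tp"] += 1
        elif label == 0 and pred == 1:
            counts["fp"] += 1
        elif label == 1 and pred == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision_recall(counts):
    """Precision and recall, defined as 0.0 when the denominator is empty."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical scored records; "group" stands in for a protected attribute.
records = [
    {"group": "A", "label": 1, "pred": 1},
    {"group": "A", "label": 1, "pred": 1},
    {"group": "A", "label": 0, "pred": 0},
    {"group": "A", "label": 0, "pred": 1},
    {"group": "B", "label": 1, "pred": 0},
    {"group": "B", "label": 1, "pred": 1},
    {"group": "B", "label": 0, "pred": 0},
    {"group": "B", "label": 0, "pred": 0},
]

metrics = {}
for group in ("A", "B"):
    rows = [r for r in records if r["group"] == group]  # the slice
    counts = confusion_counts([r["label"] for r in rows], [r["pred"] for r in rows])
    metrics[group] = (counts, precision_recall(counts))

for group, (counts, (precision, recall)) in metrics.items():
    print(group, counts, round(precision, 3), round(recall, 3))
```

Here group B has lower recall than group A, the sort of gap between slices that this visualization is meant to surface.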
The feature correlation visualization plots a single input vs either another input or the model’s output. This plot helps identify any visual clusters that might be useful to analyze further. This visualization supports integer, float, and categorical variables.
The feature distribution visualization is one of the most basic machine learning plots, showing how the data is distributed for a particular input. This plot helps surface any data abnormalities or data insights to help root-cause issues or drive further analysis.
Partial Dependence Plot (PDP)
The Partial Dependence Plot (PDP) shows the marginal effect of the selected input on the model output. This plot helps understand whether the relationship between the input and the output is linear, monotonic, or more complex.
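A minimal sketch of how a partial dependence curve can be computed: for each grid value, overwrite the selected input in every row of the slice and average the predictions. The model, data, and helper names below are hypothetical and chosen so the shapes are easy to read.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over the slice with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(b >= a for a, b in pairs) or all(b <= a for a, b in pairs)


def predict(row):
    # Toy model: quadratic in feature 0, linear in feature 1.
    return row[0] ** 2 + row[1]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]

pdp_x0 = partial_dependence(predict, rows, 0, [-2.0, -1.0, 0.0, 1.0, 2.0])
pdp_x1 = partial_dependence(predict, rows, 1, [0.0, 1.0, 2.0])
print(pdp_x0, is_monotonic(pdp_x0))
print(pdp_x1, is_monotonic(pdp_x1))
```

The curve for feature 0 is U-shaped (non-monotonic), while the curve for feature 1 rises steadily, illustrating the linear, monotonic, or more complex relationships the plot is meant to reveal.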
To view this visualization, click on Explain in any row in the Data table. The Point Explanation looks at the selected instance and helps you understand the impact of the model’s input on the output. You will see the Feature Attribution (more on this in the Point Explainability section) showing how each input affected the prediction, and you will also see the input’s position in the feature distribution for the overall dataset.
When you want to check how the model is behaving for one prediction instance, use this visualization first. Read more about the optimized techniques we use in the Point Explainability section.
Overview
This chart provides a human-readable overview of a point explanation.
Fiddler (or What-if) Plot
To view this visualization, click on Fiddle in any row in the Data table. The Fiddler Plot helps you understand how changes in the model’s input values could impact the model’s prediction for this instance.
On initial load, the visualization shows an Individual Conditional Expectation (ICE) plot for each model input.
The ICE plot shows how the model prediction is affected by the changes in the input for that single instance. It’s computed by changing the value of the input being inspected while keeping all other inputs constant, and plotting the resulting predictions.
Recall the PDP (Partial Dependence Plot) visualization discussed earlier, which showed the average effect of the feature across the entire slice. In essence, the PDP is the average of all the ICE plots. The PDP can mask interactions at the instance level, which an ICE plot will capture.
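The relationship between ICE and PDP, and the way averaging can hide interactions, can be seen in a small sketch. The two-row slice and the interaction-only model below are made up for illustration.

```python
def ice_curve(predict, row, feature_index, grid):
    """Predictions for one instance as a single feature sweeps over the grid."""
    curve = []
    for value in grid:
        modified = list(row)
        modified[feature_index] = value  # all other inputs stay fixed
        curve.append(predict(modified))
    return curve


def pdp_from_ice(ice_curves):
    """Point-wise average of the ICE curves, which is the PDP over the same rows."""
    n = len(ice_curves)
    return [sum(points) / n for points in zip(*ice_curves)]


def predict(row):
    # Toy model with a pure interaction: the effect of x0 flips with the sign of x1.
    x0, x1 = row
    return x0 * x1


rows = [[1.0, 1.0], [1.0, -1.0]]
grid = [-1.0, 0.0, 1.0]

ice = [ice_curve(predict, row, 0, grid) for row in rows]
pdp = pdp_from_ice(ice)
print(ice)
print(pdp)
```

The first instance's ICE curve rises with x0 and the second falls, yet their average is flat. A reader looking only at the PDP would conclude that x0 has no effect, which is the masking of instance-level interactions described above.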
How to “Fiddle” with the plot
We call this a “What-if” or “Fiddle” plot because you can update any input value to see its impact on the model output, and then view the updated ICE plots for the changed input values.
This is a powerful technique for performing counterfactual analysis of an instance in the context of the model. When you plot the updated ICE plots, you see two lines (or sets of bars in the case of categorical inputs). In the image below, the solid line is the original ICE plot, and the dotted line is the ICE plot using the updated input values. Comparing these two sets of plots can help you understand if the model behavior changes as expected with the hypothetical model input.
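The comparison of original and updated ICE plots can be sketched in code. Everything here is hypothetical: the scoring function, the input names (`income`, `cosigner`, `debt`), and the `what_if` helper are stand-ins that only mirror the idea of re-plotting after an input is changed.

```python
def ice_curve(predict, instance, feature, grid):
    """Predictions for one instance as `feature` sweeps over `grid`."""
    return [predict({**instance, feature: value}) for value in grid]


def what_if(predict, instance, changes, feature, grid):
    """Return (original ICE, updated ICE) for `feature` after applying `changes`."""
    updated_instance = {**instance, **changes}
    return (
        ice_curve(predict, instance, feature, grid),
        ice_curve(predict, updated_instance, feature, grid),
    )


def predict(inputs):
    # Toy score: income counts double when there is a cosigner; debt subtracts.
    multiplier = 2 if inputs["cosigner"] else 1
    return inputs["income"] * multiplier - inputs["debt"]


instance = {"income": 3, "cosigner": False, "debt": 1}

original_ice, updated_ice = what_if(
    predict,
    instance,
    changes={"cosigner": True},  # the hypothetical "fiddle"
    feature="income",
    grid=[1, 2, 3],
)
print(original_ice)  # corresponds to the solid line
print(updated_ice)   # corresponds to the dotted line
```

In this toy case the updated curve is steeper than the original, so adding a cosigner both raises the score and makes it more sensitive to income. Checking whether such a shift matches expectations is the kind of counterfactual reasoning the plot supports.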
Once visualizations are created, you can pin them to the project dashboard, which can be shared with others.
To pin a chart, click the thumbtack icon and then select which dashboard to pin it to. If the ‘Update with Query’ option is enabled, the pinned chart will update automatically whenever the underlying query is changed in Slice and Explain.