API Methods 3.x
Alerts
AlertRule
AlertRule object contains the below fields.
id
UUID
-
Unique identifier of the AlertRule.
name
str
-
Unique name of the AlertRule.
model
-
The associated model details.
project
-
The associated project details.
baseline
None
The associated baseline.
segment
None
Details of segment for the alert.
priority
-
To set the priority for the AlertRule. Select from: 1. Priority.LOW 2. Priority.MEDIUM 3. Priority.HIGH.
compare_to
-
Select one of: 1. CompareTo.RAW_VALUE (absolute alert) 2. CompareTo.TIME_PERIOD (relative alert).
metric_id
Union[str, UUID]
-
critical_threshold
float
-
A critical alert is triggered when the value of the selected metric_id satisfies the condition relative to this threshold.
condition
-
Select from: 1. AlertCondition.LESSER 2. AlertCondition.GREATER
bin_size
-
Bin size for the alert evaluation window, e.g. fdl.BinSize.HOUR.
columns
Optional[List[str]]
None
List of 1 or more column names for the rule to evaluate. Use ['__ANY__'] to evaluate all columns.
baseline_id
Optional[UUID]
None
UUID of the baseline for the alert.
segment_id
Optional[UUID]
None
UUID of the segment for the alert.
compare_bin_delta
Optional[int]
None
Indicates previous period for comparison e.g. for fdl.BinSize.DAY, compare_bin_delta=1 will compare 1 day back, compare_bin_delta=7 will compare 7 days back.
warning_threshold
Optional[float]
None
A warning alert is triggered when the value of the selected metric_id satisfies the condition relative to this threshold.
created_at
datetime
-
The creation timestamp.
updated_at
datetime
-
The timestamp of most recent update.
evaluation_delay
int
0
Specifies a delay in hours before the AlertRule is evaluated. The delay period must not exceed one year (8760 hours).
constructor()
Initialize a new AlertRule on Fiddler Platform.
Parameters
name
str
-
Unique name of the alert rule.
model_id
UUID
-
Unique identifier of the model.
metric_id
Union[str, UUID]
-
Metric to alert on: either a built-in metric name (str) or the UUID of a custom metric.
columns
Optional[List[str]]
None
List of column names on which AlertRule is to be created. It can take ['__ANY__'] to check for all columns.
baseline_id
Optional[UUID]
None
UUID of the baseline for the alert.
segment_id
Optional[UUID]
None
UUID of the segment for the alert.
priority
-
To set the priority for the AlertRule. Select from: 1. Priority.LOW 2. Priority.MEDIUM 3. Priority.HIGH.
compare_to
-
Select from the two: 1. CompareTo.RAW_VALUE (absolute alert) 2. CompareTo.TIME_PERIOD (relative alert)
compare_bin_delta
Optional[int]
None
Compare the metric to a previous time period in units of bin_size.
warning_threshold
Optional[float]
None
Threshold value; crossing it triggers a warning-severity alert.
critical_threshold
float
-
Threshold value; crossing it triggers a critical-severity alert.
condition
-
Select from: 1. AlertCondition.LESSER 2. AlertCondition.GREATER
bin_size
-
Size of the bin for AlertRule.
evaluation_delay
int
0
Delay in hours before the alert is evaluated. The delay period must not exceed one year (8760 hours).
Usage
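A minimal construction sketch, assuming fdl.init() has been called and `model` is an onboarded fdl.Model; the rule name, metric, columns, and thresholds below are illustrative:

```python
import fiddler as fdl

# Assumes fdl.init(url=..., token=...) has been called and `model` was
# fetched earlier, e.g. via fdl.Model.from_name(...).
alert_rule = fdl.AlertRule(
    name='Churn Drift Hawk',               # illustrative name
    model_id=model.id,
    metric_id='jsd',                       # built-in drift metric (Jensen-Shannon Distance)
    priority=fdl.Priority.HIGH,
    compare_to=fdl.CompareTo.TIME_PERIOD,  # relative alert
    compare_bin_delta=1,                   # compare against the previous bin
    condition=fdl.AlertCondition.GREATER,
    bin_size=fdl.BinSize.DAY,
    critical_threshold=0.5,
    warning_threshold=0.1,
    columns=['gender', 'creditscore'],     # illustrative columns
)
# Call alert_rule.create() (next section) to persist the rule.
```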
create()
Create a new AlertRule.
Parameters
None
Usage
Returns
AlertRule instance.
get()
Get a single AlertRule.
Parameters
id_
UUID
-
Unique identifier for the AlertRule.
Usage
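A minimal sketch, assuming the client is initialized and the placeholder UUID below is replaced with a real AlertRule identifier:

```python
import fiddler as fdl

ALERT_RULE_ID = '9f8180d3-3fa0-4004-bd09-5d71d07e2cdd'  # placeholder UUID

alert_rule = fdl.AlertRule.get(id_=ALERT_RULE_ID)
print(alert_rule.name, alert_rule.critical_threshold)
```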
Returns
AlertRule instance.
Raises
NotFound
AlertRule with given identifier not found.
Forbidden
Current user may not have permission to view details of AlertRule.
list()
Get a list of AlertRules.
Parameters
model_id
Union[str, UUID]
None
Unique identifier for the model to which AlertRule belongs.
project_id
Optional[UUID]
None
Unique identifier for the project to which the AlertRule belongs.
metric_id
Optional[UUID]
None
Built-in metric name or UUID of a custom metric to filter by.
columns
Optional[List[str]]
None
List of column names to filter AlertRules by. Use ['__ANY__'] to match rules that evaluate all columns.
baseline_id
Optional[UUID]
None
UUID of the baseline for the AlertRule.
ordering
Optional[List[str]]
None
List of AlertRule fields to order by, e.g. ['alert_time_bucket'] or ['-alert_time_bucket'] for descending order.
Usage
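A minimal sketch, assuming `model` is an onboarded fdl.Model:

```python
import fiddler as fdl

# `model` as fetched earlier, e.g. via fdl.Model.from_name(...).
for rule in fdl.AlertRule.list(model_id=model.id):
    print(rule.id, rule.name)
```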
Returns
Iterator of AlertRule instances.
delete()
Delete an existing AlertRule.
Parameters
id_
UUID
-
Unique UUID of the AlertRule.
Usage
Returns
None
Raises
NotFound
AlertRule with given identifier not found.
Forbidden
Current user may not have permission to delete the AlertRule.
enable_notifications()
Enable an AlertRule's notification.
Parameters
id_
UUID
-
Unique UUID of the AlertRule.
Usage
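A minimal sketch (the same pattern applies to disable_notifications() below); the UUID is a placeholder:

```python
import fiddler as fdl

ALERT_RULE_ID = '9f8180d3-3fa0-4004-bd09-5d71d07e2cdd'  # placeholder UUID

alert_rule = fdl.AlertRule.get(id_=ALERT_RULE_ID)
alert_rule.enable_notifications()
# alert_rule.disable_notifications()  # symmetric call to mute the rule
```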
Returns
None
Raises
NotFound
AlertRule with given identifier not found.
Forbidden
Current user may not have permission to view details of AlertRule.
disable_notifications()
Disable notifications for an AlertRule.
Parameters
id_
UUID
-
Unique UUID of the AlertRule.
Usage
Returns
None
Raises
NotFound
AlertRule with given identifier not found.
Forbidden
Current user may not have permission to view details of AlertRule.
Alert Notifications
Alert notifications for an AlertRule.
emails
Optional[List[str]]
None
List of emails to send notification to.
pagerduty_services
Optional[List[str]]
None
List of PagerDuty services to trigger the alert to.
pagerduty_severity
Optional[str]
None
Severity level for the PagerDuty alert.
webhooks
Optional[List[UUID]]
None
List of webhook UUIDs to notify.
set_notification_config()
Set NotificationConfig for an AlertRule.
Parameters
emails
Optional[List[str]]
None
List of emails to send notification to.
pagerduty_services
Optional[List[str]]
None
List of PagerDuty services to trigger the alert to.
pagerduty_severity
Optional[str]
None
Severity level for the PagerDuty alert.
webhooks
Optional[List[UUID]]
None
List of webhook UUIDs to notify.
Usage
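A minimal sketch, assuming an existing rule and webhook; the email address and UUIDs are placeholders:

```python
import fiddler as fdl

ALERT_RULE_ID = '9f8180d3-3fa0-4004-bd09-5d71d07e2cdd'  # placeholder UUID
WEBHOOK_ID = '8b5a8f46-0d90-40c9-8b3c-0d3c0ffd7d14'     # placeholder UUID

alert_rule = fdl.AlertRule.get(id_=ALERT_RULE_ID)
alert_rule.set_notification_config(
    emails=['monitoring@example.com'],  # placeholder address
    webhooks=[WEBHOOK_ID],
)
config = alert_rule.get_notification_config()
```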
Returns
NotificationConfig
Alert notification settings for an AlertRule.
If pagerduty_severity is passed without specifying pagerduty_services, then the pagerduty_severity is ignored.
Raises
BadRequest
All 4 input parameters are empty.
ValueError
Webhook ID is incorrect.
get_notification_config()
Get notification configuration for an AlertRule.
Parameters
None
Usage
Returns
NotificationConfig
Alert notification settings for an AlertRule.
Raises
BadRequest
All 4 input parameters are empty.
ValueError
Webhook ID is incorrect.
Triggered Alerts
AlertRecord
An AlertRecord details an AlertRule's triggered alert.
id
UUID
-
Unique identifier of the alert record.
alert_rule_id
UUID
-
Unique identifier of the AlertRule that triggered this alert.
alert_run_start_time
int
-
Timestamp of AlertRule evaluation in epoch.
alert_time_bucket
int
-
Timestamp pointing to the start of the time bucket in epoch.
alert_value
float
-
Value of the metric for alert_time_bucket.
baseline_time_bucket
Optional[int]
None
Timestamp pointing to the start of the baseline time bucket in epoch, only if AlertRule is of 'time period' based comparison.
baseline_value
Optional[float]
None
Value of the metric for baseline_time_bucket.
is_alert
bool
-
Indicates whether an alert was due to be triggered for this evaluation.
severity
str
-
Severity level of the triggered alert (see Severity).
failure_reason
str
-
String message if there was a failure sending notification.
message
str
-
String message sent as a part of email notification.
feature_name
Optional[str]
None
Name of feature for which alert was triggered.
alert_record_main_version
int
-
Main version of triggered alert record in int, incremented when the value of severity changes.
alert_record_sub_version
int
-
Sub version of triggered alert record in int, incremented when another alert with same severity as before is triggered.
created_at
datetime
-
Time at which the alert record was created.
updated_at
datetime
-
Latest time at which the alert record was updated.
list()
List AlertRecords triggered for an AlertRule.
Parameters
alert_rule_id
UUID
-
Unique identifier of the AlertRule whose triggered alerts are listed.
start_time
Optional[datetime]
None
Start time to filter trigger alerts in yyyy-MM-dd format, inclusive.
end_time
Optional[datetime]
None
End time to filter trigger alerts in yyyy-MM-dd format, inclusive.
ordering
Optional[List[str]]
None
List of AlertRule fields to order by, e.g. ['alert_time_bucket'] or ['-alert_time_bucket'] for descending order.
Usage
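A minimal sketch, assuming the placeholder UUID points at an existing AlertRule; the time range is illustrative:

```python
from datetime import datetime

import fiddler as fdl

ALERT_RULE_ID = '9f8180d3-3fa0-4004-bd09-5d71d07e2cdd'  # placeholder UUID

records = fdl.AlertRecord.list(
    alert_rule_id=ALERT_RULE_ID,
    start_time=datetime(2024, 9, 1),
    end_time=datetime(2024, 9, 30),
    ordering=['-alert_time_bucket'],  # newest first
)
for record in records:
    print(record.alert_time_bucket, record.alert_value, record.severity)
```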
Returns
Iterable of AlertRecord instances for the AlertRule.
Baselines
Baseline datasets are used for making comparisons with production data.
A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.
Baseline
Baseline object contains the below fields.
id
UUID
-
Unique identifier for the baseline.
name
str
-
Baseline name.
type_
-
Baseline type can be static (pre-production or production) or rolling (production).
start_time
Optional[int]
None
Epoch to be used as start time for STATIC baseline.
end_time
Optional[int]
None
Epoch to be used as end time for STATIC baseline.
offset
Optional[int]
None
Offset in seconds relative to current time to be used for ROLLING baseline.
window_size
Optional[int]
None
Span of window in seconds to be used for ROLLING baseline.
row_count
Optional[int]
None
Number of rows in baseline.
model
-
Details of the model.
project
-
Details of the project to which the baseline belongs.
dataset
-
Details of the dataset from which baseline is derived.
created_at
datetime
-
Time at which baseline was created.
updated_at
datetime
-
Latest time at which baseline was updated.
constructor()
Initialize a new baseline instance.
Parameters
name
str
-
Unique name of the baseline.
model_id
UUID
-
Unique identifier for the model to add baseline to.
environment
-
Type of environment. Can either be PRE_PRODUCTION or PRODUCTION.
type_
-
Baseline type can be static (pre-production or production) or rolling (production).
dataset_id
Optional[UUID]
None
Unique identifier for the dataset on which the baseline is created.
start_time
Optional[int]
None
Epoch to be used as start time for STATIC baseline.
end_time
Optional[int]
None
Epoch to be used as end time for STATIC baseline.
offset_delta
Optional[int]
None
Offset relative to the current time, in multiples of window_bin_size (for ROLLING baselines).
window_bin_size
Optional[str]
None
Span of the rolling window, e.g. fdl.WindowBinSize.DAY (for ROLLING baselines).
Usage
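A minimal sketch of a STATIC baseline built from an uploaded dataset, assuming `model` and `dataset` were fetched earlier; the name is illustrative:

```python
import fiddler as fdl

# `model` and `dataset` as fetched earlier, e.g. via .from_name(...).
baseline = fdl.Baseline(
    name='baseline_2024_q3',                 # illustrative name
    model_id=model.id,
    environment=fdl.EnvType.PRE_PRODUCTION,
    type_=fdl.BaselineType.STATIC,
    dataset_id=dataset.id,
)
baseline.create()
```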
create()
Adds a baseline to Fiddler.
Parameters
None
Usage
Returns
Baseline instance.
Raises
Conflict
Baseline with the same name may already exist in the project.
NotFound
Given dataset may not exist for the input model.
ValueError
Validation failures like wrong window size, start_time, end_time, etc.
get()
Get baseline from Fiddler Platform based on UUID.
Parameters
id_
UUID
-
Unique identifier for the baseline.
Usage
Returns
Baseline instance.
Raises
NotFound
Baseline with given identifier not found.
Forbidden
Current user may not have permission to view details of baseline.
from_name()
Get baseline from Fiddler Platform based on name.
Parameters
name
str
-
Name of the baseline.
model_id
UUID | str
-
Unique identifier for the model.
Usage
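A minimal sketch, assuming `model` is an onboarded fdl.Model and the baseline name exists:

```python
import fiddler as fdl

baseline = fdl.Baseline.from_name(
    name='baseline_2024_q3',  # illustrative name
    model_id=model.id,
)
```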
Returns
Baseline instance.
Raises
NotFound
Baseline with given identifier not found.
Forbidden
Current user may not have permission to view details of baseline.
list()
List all baselines accessible to user.
Parameters
model_id
UUID
-
UUID of the model associated with baseline.
Usage
Returns
Iterable of all baseline objects.
Raises
Forbidden
Current user may not have permission to view details of baseline.
delete()
Deletes a baseline.
Parameters
id_
UUID
-
Unique UUID of the baseline.
Usage
Returns
None
Raises
NotFound
Baseline with given identifier not found.
Forbidden
Current user may not have permission to delete baseline.
Custom Metrics
User-defined metrics to extend Fiddler's built-in metrics.
CustomMetric
CustomMetric object contains the below parameters.
id
UUID
-
Unique identifier for the custom metric.
name
str
-
Custom metric name.
model_id
UUID
-
UUID of the model in which the custom metric is being added.
definition
str
-
Definition of the custom metric.
description
Optional[str]
None
Description of the custom metric.
created_at
datetime
-
Time of creation of custom metric.
constructor()
Initialize a new custom metric.
Parameters
name
str
-
Custom metric name.
model_id
UUID
-
UUID of the model in which the custom metric is being added.
definition
str
-
Definition of the custom metric.
description
Optional[str]
None
Description of the custom metric.
Usage
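A minimal sketch; the FQL definition below is illustrative (it charges $100 for every false positive of a binary classifier):

```python
import fiddler as fdl

# `model` as fetched earlier, e.g. via fdl.Model.from_name(...).
metric = fdl.CustomMetric(
    name='lost_revenue',                      # illustrative name
    model_id=model.id,
    definition="sum(if(fp(), 1, 0) * -100)",  # illustrative FQL definition
    description='Revenue lost to false positives.',
)
metric.create()
```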
get()
Get CustomMetric from Fiddler Platform based on model UUID.
Parameters
model_id
UUID
-
UUID of the model associated with the custom metrics.
Usage
Returns
Iterable of all custom metric objects.
Raises
Forbidden
Current user may not have permission to view details of custom metric.
from_name()
Get CustomMetric from Fiddler Platform based on name and model UUID.
Parameters
name
str
-
Name of the custom metric.
model_id
UUID | str
-
Unique identifier for the model.
Usage
Returns
Custom Metric instance.
Raises
NotFound
Custom metric with given identifier not found.
Forbidden
Current user may not have permission to view details of custom metric.
create()
Creates a custom metric for a model on Fiddler Platform.
Parameters
None
Usage
Returns
Custom Metric instance.
Raises
Conflict
Custom metric with the same name may already exist for the model.
BadRequest
Invalid definition.
NotFound
Given model may not exist.
delete()
Delete a custom metric.
Parameters
id_
UUID
-
Unique UUID of the custom metric.
Usage
Returns
None
Raises
NotFound
Custom metric with given identifier not found.
Forbidden
Current user may not have permission to delete custom metric.
Datasets
Datasets (or baseline datasets) are used for making comparisons with production data.
Dataset
Dataset object contains the below parameters.
id
UUID
-
Unique identifier for the dataset.
name
str
-
Dataset name.
row_count
int
None
Number of rows in dataset.
model_id
-
Unique identifier of the associated model.
project_id
-
Unique identifier of the associated project.
get()
Get dataset from Fiddler Platform based on UUID.
Parameters
id_
UUID
-
Unique identifier for the dataset.
Usage
Returns
Dataset instance.
Raises
NotFound
Dataset with given identifier not found.
Forbidden
Current user may not have permission to view details of dataset.
from_name()
Get dataset from Fiddler Platform based on name and model UUID.
Parameters
name
str
-
Name of the dataset.
model_id
UUID | str
-
Unique identifier for the model.
Usage
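A minimal sketch, assuming `model` is an onboarded fdl.Model and the dataset name exists:

```python
import fiddler as fdl

dataset = fdl.Dataset.from_name(
    name='bank_churn_baseline',  # illustrative dataset name
    model_id=model.id,
)
print(dataset.id, dataset.row_count)
```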
Returns
Dataset instance.
Raises
NotFound
Dataset with the given name not found for the model.
Forbidden
Current user may not have permission to view details of dataset.
list()
Get a list of all datasets associated to a model.
Parameters
model_id
UUID
-
UUID of the model whose datasets are listed.
Usage
Returns
Iterable of all dataset objects.
Raises
Forbidden
Current user may not have permission to view details of dataset.
Jobs
A Job is used to track asynchronous processes such as batch publishing of data.
Job
Job object contains the below fields.
id
UUID
-
Unique identifier for the job.
name
str
-
Name of the job.
status
str
-
Current status of job.
progress
float
-
Progress of job completion.
info
dict
-
Dictionary containing resource_type, resource_name, project_name.
error_message
Optional[str]
None
Message for job failure, if any.
error_reason
Optional[str]
None
Reason for job failure, if any.
extras
Optional[dict]
None
Metadata regarding the job.
get()
Get the job instance using job UUID.
Parameters
id_
UUID
-
Unique UUID of the job.
verbose
bool
False
Flag to fetch the extras metadata about the tasks executed.
Usage
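A minimal sketch; JOB_ID is a placeholder for the UUID returned by an async operation such as publish or model deletion:

```python
import fiddler as fdl

JOB_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'  # placeholder UUID

job = fdl.Job.get(id_=JOB_ID, verbose=True)
print(job.status, job.progress)
job.wait()  # block until the job succeeds or fails (see wait() below)
```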
Returns
Single job object for the input params.
Raises
Forbidden
Current user may not have permission to view details of job.
wait()
Wait for job to complete either with success or failure status.
Parameters
interval
Optional[int]
3
Interval in seconds between polling for job status.
timeout
Optional[int]
1800
Timeout in seconds for iterator to stop.
Usage
Returns
Single job object for the input params.
Raises
Forbidden
Current user may not have permission to view details of job.
TimeoutError
When the job does not complete within the timeout (default 1800 seconds).
watch()
Watch job status at given interval and yield job object.
Parameters
interval
Optional[int]
3
Interval in seconds between polling for job status.
timeout
Optional[int]
1800
Timeout in seconds for iterator to stop.
Usage
Returns
Iterator of job objects.
Raises
Forbidden
Current user may not have permission to view details of job.
TimeoutError
When the job does not complete within the timeout (default 1800 seconds).
Models
A Model is a representation of your machine learning model which can be used for monitoring, explainability, and more. You do not need to upload your model artifact in order to onboard your model, but doing so will significantly improve the quality of explanations generated by Fiddler.
Model
Model object contains the below parameters.
id
UUID
-
Unique identifier for the model.
name
str
-
Unique name of the model (only alphanumeric and underscores are allowed).
input_type
ModelInputType.TABULAR
Input data type used by the model.
task
ModelTask.NOT_SET
Task the model is designed to address.
task_params
-
Task parameters given to a particular model.
schema
-
Model schema defines the details of each column.
version
Optional[str]
-
Unique version name within a model
spec
-
Model spec defines how model columns are used along with model task.
description
str
-
Description of the model.
event_id_col
str
-
Column containing event id.
event_ts_col
str
-
Column containing event timestamp.
xai_params
-
Explainability parameters of the model.
artifact_status
str
-
Artifact Status of the model.
artifact_files
list[dict]
-
Dictionary containing file details of model artifact.
is_binary_ranking_model
bool
-
True if the model is a ranking model with only two relevance levels.
created_at
datetime
-
Time at which model was created.
updated_at
datetime
-
Latest time at which model was updated.
created_by
-
Details of the user who created the model.
updated_by
-
Details of the user who last updated the model.
project
-
Details of the project to which the model belongs.
organization
-
Details of the organization to which the model belongs.
constructor()
Initialize a new model instance.
Usage
Parameters
name
str
-
Unique name of the model
project_id
UUID
-
Unique identifier for the project to which model belongs.
input_type
ModelInputType.TABULAR
Input data type used by the model.
task
ModelTask.NOT_SET
Task the model is designed to address.
schema
-
Model schema defines the details of each column.
spec
-
Model spec defines how model columns are used along with model task.
version
Optional[str]
-
Unique version name within a model
task_params
-
Task parameters given to a particular model.
description
str
-
Description of the model.
event_id_col
str
-
Column containing event id.
event_ts_col
str
-
Column containing event timestamp.
xai_params
-
Explainability parameters of the model.
from_data()
Build a model instance from the given dataframe or file (CSV/Parquet).
Parameters
source
pd.DataFrame | Path | str
-
Pandas dataframe or path to csv/parquet file
name
str
-
Unique name of the model
project_id
UUID | str
-
Unique identifier for the project to which model belongs.
input_type
ModelInputType.TABULAR
Input data type used by the model.
task
ModelTask.NOT_SET
Task the model is designed to address.
spec
-
Model spec defines how model columns are used along with model task.
version
Optional[str]
-
Unique version name within a model
task_params
-
Task parameters given to a particular model.
description
Optional[str]
-
Description of the model.
event_id_col
Optional[str]
-
Column containing event id.
event_ts_col
Optional[str]
-
Column containing event timestamp.
xai_params
-
Explainability parameters of the model.
max_cardinality
Optional[int]
None
Max cardinality to detect categorical columns.
sample_size
Optional[int]
-
Number of samples to use for generating the schema.
Usage
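A minimal sketch, assuming fdl.init() has been called, `project` exists, and baseline_data.csv contains the listed columns; all names are illustrative:

```python
import pandas as pd

import fiddler as fdl

df = pd.read_csv('baseline_data.csv')  # illustrative sample file

model = fdl.Model.from_data(
    source=df,
    name='bank_churn',
    project_id=project.id,  # `project` as fetched via fdl.Project.from_name(...)
    spec=fdl.ModelSpec(
        inputs=['creditscore', 'geography', 'age'],  # illustrative columns
        outputs=['probability_churn'],
        targets=['churn'],
    ),
    task=fdl.ModelTask.BINARY_CLASSIFICATION,
)
model.create()  # onboards the model to Fiddler Platform (see create() below)
```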
Returns
Model instance.
Notes
from_data will not create a model entry on Fiddler Platform. Instead, this method only returns a model instance which can be edited; call .create() to onboard the model to Fiddler Platform.
spec is optional for the from_data method. However, a spec with at least inputs is required for model onboarding.
Make sure spec is passed to the from_data method if the model requires custom features. This method generates centroids, which are needed for custom feature drift computation.
If version is not explicitly passed, Fiddler Platform will treat it as the v1 version of the model.
create()
Onboard a new model to Fiddler Platform
Parameters
None
Usage
Returns
Model instance.
Raises
Conflict
Model with the same name may already exist in the project.
get()
Get model from Fiddler Platform based on UUID.
Parameters
id_
UUID | str
-
Unique identifier for the model.
Returns
Model instance.
Raises
NotFound
Model with given identifier not found.
Forbidden
Current user may not have permission to view details of model.
Usage
from_name()
Get model from Fiddler Platform based on name and project UUID.
Parameters
name
str
-
Name of the model.
project_id
UUID | str
-
Unique identifier for the project.
version
Optional[str]
-
Unique version name within a model
The version parameter is available from fiddler-client==3.1 onwards.
Usage
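A minimal sketch, assuming `project` was fetched earlier:

```python
import fiddler as fdl

model = fdl.Model.from_name(
    name='bank_churn',     # illustrative name
    project_id=project.id,
    # version='v2',        # optional; requires fiddler-client>=3.1
)
```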
Returns
Model instance.
Notes
When version is not passed, the model created without any version is fetched; Fiddler internally assigns version=v1 when none is given.
When version is passed, the method fetches the model corresponding to that specific version.
Raises
NotFound
Model with the given name not found in the project.
Forbidden
Current user may not have permission to view details of model.
list()
Gets all models of a project.
Parameters
project_id
UUID | str
-
Unique UUID of the project to which model is associated.
name
Optional[str]
-
Model name. Pass this to fetch all versions of a model.
Returns
Iterable of ModelCompact objects.
Errors
Forbidden
Current user may not have permission to the given project.
Usage example
Notes
Since Model contains a lot of information, list operations do not return all the fields of a model. Instead, this method returns ModelCompact objects on which .fetch() can be called to get the complete Model instance. For most use-cases, ModelCompact objects are sufficient.
update()
Update an existing model. Only the following fields can be updated; the backend ignores updates to any other field on the instance.
Parameters
version
Optional[str]
None
Model version name
xai_params
None
Explainability parameters of the model.
description
Optional[str]
None
Description of the model.
event_id_col
Optional[str]
None
Column containing event id.
event_ts_col
Optional[str]
None
Column containing event timestamp.
The version parameter is available from fiddler-client==3.1 onwards.
Usage
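A minimal sketch updating one of the allowed fields:

```python
import fiddler as fdl

model = fdl.Model.from_name(name='bank_churn', project_id=project.id)
model.description = 'Churn model retrained on Q3 data.'  # updatable field
model.update()
```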
Returns
None
Raises
BadRequest
If field is not updatable.
duplicate()
Duplicate the model instance with the given version name.
This call does not save the model on Fiddler Platform. After making changes to the model instance, call .create() to add the new model version to Fiddler Platform.
Added in version 3.1.0
Parameters
version
Optional[str]
None
Model version name
Usage
Returns
Model instance.
Raises
BadRequest
If field is not updatable.
remove_column()
Remove a column from the model object (in-place).
Modifies the model object in place by removing the column named column_name. Do this before uploading the model to the Fiddler Platform (via the create() method); otherwise the change does not take effect.
Added in version 3.7.0
Parameters
column_name
str
-
Name of the column to be removed
missing_ok
bool
-
If False, raises an error when the column is not found.
Usage
Returns
None
Raises
KeyError
If column is not present and missing_ok is False.
delete()
Delete a model.
Parameters
None
Usage
Returns
Async job details for the delete job.
Notes
Model deletion is an async process, hence a job object is returned on
delete()
call. Calljob.wait()
to wait for the job to complete. If you are planning to create a model with the same name, please wait for the job to complete, otherwise backend will not allow new model with same name.
add_surrogate()
Add a surrogate to an existing model.
Parameters
dataset_id
UUID | str
-
Dataset identifier
deployment_params
-
Model deployment parameters.
Usage
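A minimal sketch, assuming `model` and `dataset` were fetched earlier; surrogate generation runs as an async job:

```python
# `model` and `dataset` as fetched earlier, e.g. via .from_name(...).
job = model.add_surrogate(dataset_id=dataset.id)
job.wait()  # block until the surrogate is built
```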
Returns
Async job details for the add surrogate job.
Raises
BadRequest
Invalid deployment parameter is passed.
update_surrogate()
Update the surrogate of an existing model.
Parameters
dataset_id
UUID | str
-
Dataset identifier
deployment_params
None
Model deployment parameters.
Usage
Returns
Async job details for the update surrogate job.
add_artifact()
Add artifact files to existing model.
Parameters
model_dir
str
-
Path to directory containing artifacts for upload.
deployment_params
None
Model deployment parameters.
Usage
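A minimal sketch, assuming model_artifacts/ contains a valid model package; the fdl.DeploymentParams sizing override is optional and illustrative:

```python
import fiddler as fdl

job = model.add_artifact(
    model_dir='model_artifacts/',                         # illustrative path
    deployment_params=fdl.DeploymentParams(memory=1024),  # optional sizing
)
job.wait()
```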
Returns
Async job details for the add artifact job.
update_artifact()
Update existing artifact files in a model.
Parameters
model_dir
str
-
Path to directory containing artifacts for upload.
deployment_params
None
Model deployment parameters.
Usage
Returns
Async job details for the update artifact job.
download_artifact()
Download existing artifact files in a model.
Parameters
output_dir
str
-
Path to directory to download the artifacts.
Usage
Returns
None
Properties
datasets
List all datasets associated with a model.
Parameters
None
Usage
Returns
Iterable of dataset instances.
Raises
Forbidden
Current user may not have permission to view details of model.
model_deployment
Get the model deployment object associated with the model.
Parameters
None
Usage
Returns
Model deployment instance.
Raises
NotFound
Model with given identifier not found.
Forbidden
Current user may not have permission to view details of model.
publish()
Publish Pre-production or production events.
Parameters
source
Union[list[dict[str, Any]], str, Path, pd.DataFrame]
-
Source of the events. One of: 1. Path or str: path to a data file. 2. list[dict]: list of event dicts (EnvType.PRE_PRODUCTION not supported). 3. pd.DataFrame: events dataframe.
environment
EnvType
EnvType.PRODUCTION
Either EnvType.PRE_PRODUCTION or EnvType.PRODUCTION.
dataset_name
Optional[str]
None
Name of the dataset. Not supported for EnvType.PRODUCTION.
update
Optional[bool]
False
If True, update previously published events instead of inserting new ones (production events only).
Usage
Pre-requisite
Publish dataset (pre-production data) from file
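A minimal sketch, assuming `model` is onboarded; the file path and dataset name are illustrative:

```python
import fiddler as fdl

job = model.publish(
    source='baseline_data.csv',              # illustrative file path
    environment=fdl.EnvType.PRE_PRODUCTION,
    dataset_name='bank_churn_baseline',      # illustrative dataset name
)
job.wait()  # file publishes run as an async job
```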
Publish dataset (pre-production data) from dataframe
Publish production events from list
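A minimal sketch; the event dictionaries are illustrative and their keys must match the model's column names:

```python
events = [
    # event_id is required here because model.event_id_col='event_id'
    {'event_id': 'A1', 'creditscore': 620, 'probability_churn': 0.82},
    {'event_id': 'A2', 'creditscore': 710, 'probability_churn': 0.12},
]
event_ids = model.publish(source=events)  # environment defaults to EnvType.PRODUCTION
```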
Lists are only supported for production data, not for pre-production data.
Events are published as a stream. This mode is recommended if you have a high volume of continuous real-time traffic of events, as it allows for more efficient processing on the Fiddler backend.
It returns a list of event_id values for the published events.
Notes
In this example, where model.event_id_col = event_id, we expect event_id as a required key of each dictionary. If instead you keep model.event_id_col=None, the backend will generate unique event IDs and return them to you. The same goes for model.event_ts_col: the current time is assigned as the event timestamp when it is None.
Publish production events from file
Batch publishing is faster if you want to publish a large set of historical data.
Publish production events from dataframe
Update events
If you need to update the target or metadata columns of a previously published production event, set update=True. For more details, refer to Updating Events. Note that only production events can be updated.
Update production events from list
Update production events from dataframe
Returns
In case of streaming publish (list source): list[UUID | str], the list of event identifiers.
In case of batch publish (file or dataframe source): Job object for the file/dataframe publish job.
ModelCompact
ModelCompact object contains the below parameters.
id
UUID
-
Unique identifier for the model.
name
str
-
Unique name of the model
version
Optional[str]
-
Unique version name within a model
fetch()
Fetch the model instance from Fiddler Platform.
Parameters
None
Returns
Model instance.
Raises
NotFound
Model not found for the given identifier
Forbidden
Current user may not have permission to view details of model.
Model deployment
Get the model deployment object of a particular model.
Model deployment:
Model deployment object contains the below parameters.
id
UUID
-
Unique identifier for the model deployment.
model
-
Details of the model.
project
-
Details of the project to which the model belongs.
organization
-
Details of the organization to which the model belongs.
artifact_type
-
Type of the model artifact (see ArtifactType).
deployment_type
-
Type of deployment of the model.
image_uri
Optional[str]
md-base/python/python-311:1.0.0
Reference to the Docker image used to serve the model.
active
bool
True
Status of the deployment.
replicas
Optional[str]
1
The number of replicas running the model (minimum 1, maximum 10, default 1).
cpu
Optional[str]
100
The amount of CPU (millicpus) reserved per replica (minimum 10, maximum 4000 = 4 vCPUs, default 100).
memory
Optional[str]
256
The amount of memory (mebibytes) reserved per replica (minimum 150, maximum 16384 = 16 GiB, default 256).
created_at
datetime
-
Time at which model deployment was created.
updated_at
datetime
-
Latest time at which model deployment was updated.
created_by
-
Details of the user who created the model deployment.
updated_by
-
Details of the user who last updated the model deployment.
Update model deployment
Update an existing model deployment.
Parameters
active
Optional[bool]
True
Status of the deployment.
replicas
Optional[str]
1
The number of replicas running the model (minimum 1, maximum 10, default 1).
cpu
Optional[str]
100
The amount of CPU (millicpus) reserved per replica (minimum 10, maximum 4000 = 4 vCPUs, default 100).
memory
Optional[str]
256
The amount of memory (mebibytes) reserved per replica (minimum 150, maximum 16384 = 16 GiB, default 256).
Usage
Returns
None
Raises
BadRequest
If field is not updatable.
Organizations
The organization within which all projects and models reside.
Organization:
Organization object contains the below parameters.
id
UUID
-
Unique identifier for the organization.
name
str
-
Unique name of the organization.
created_at
datetime
-
Time at which organization was created.
updated_at
datetime
-
Latest time at which organization was updated.
Projects
Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).
A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).
Project
Project object contains the below parameters.
id
UUID
None
Unique identifier for the project.
name
str
None
Unique name of the project.
created_at
datetime
None
Time at which project was created.
updated_at
datetime
None
Latest time at which project was updated.
created_by
None
Details of the user who created the project.
updated_by
None
Details of the user who last updated the project.
organization
None
Details of the organization to which the project belongs.
create()
Creates a project using the specified name.
Parameters
name
str
None
Unique name of the project.
Usage
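A minimal sketch, assuming fdl.init() has been called; the project name is illustrative:

```python
import fiddler as fdl

project = fdl.Project(name='bank_churn')  # illustrative project name
project.create()
print(project.id)
```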
Returns
Project instance.
Raises
Conflict
Project with same name may exist.
get()
Get project from Fiddler Platform based on UUID.
Parameters
id_
UUID
None
Unique identifier for the project.
Usage
Returns
Project instance.
Raises
NotFound
Project with given identifier not found.
Forbidden
Current user may not have permission to view details of project.
from_name()
Get project from Fiddler Platform based on name.
Parameters
project_name
str
None
Name of the project.
Usage
Returns
Project instance.
Raises
NotFound
Project with the given name not found.
Forbidden
Current user may not have permission to view details of project.
get_or_create()
Added in version 3.7.0
Get the project instance if exists, otherwise create a new project.
Parameters
name
str
None
Unique name of the project.
Usage
Returns
Project instance.
Raises
Forbidden
Current user may not have permission to view/create a project.
list()
Gets all projects in an organization.
Parameters
None
Returns
Iterable of project objects.
Errors
Forbidden
Current user may not have permission to the given project.
Usage example
delete()
Delete a project.
Parameters
id_
UUID
None
Unique UUID of the project.
Usage
Returns
None
Properties
models
List all models associated with a project.
Parameters
id_
UUID
None
Unique UUID of the project.
Usage
Returns
Iterable of model objects.
Raises
NotFound
Project with given identifier not found.
Forbidden
Current user may not have permission to view details of project.
Segments
Fiddler offers the ability to segment your data based on a custom condition.
Segment
Segment object contains the below parameters.
id
UUID
-
Unique identifier for the segment.
name
str
-
Segment name.
model_id
UUID
-
UUID of the model to which segment belongs.
definition
str
-
Definition of the segment.
description
Optional[str]
None
Description of the segment.
created_at
datetime
-
Time of creation of segment.
constructor()
Initialize a new segment.
Parameters
name
str
-
Segment name.
model_id
UUID
-
UUID of the model to which segment belongs.
definition
str
-
Definition of the segment.
description
Optional[str]
None
Description of the segment.
Usage
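A minimal sketch; the segment name and FQL definition are illustrative:

```python
import fiddler as fdl

# `model` as fetched earlier, e.g. via fdl.Model.from_name(...).
segment = fdl.Segment(
    name='over_50',                       # illustrative name
    model_id=model.id,
    definition="age > 50",                # illustrative FQL condition
    description='Customers older than 50.',
)
segment.create()
```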
get()
Get segment from Fiddler Platform based on UUID.
Parameters
id_
UUID
-
Unique identifier for the segment.
Usage
Returns
Segment instance.
Raises
NotFound
Segment with given identifier not found.
Forbidden
Current user may not have permission to view details of segment.
from_name()
Get segment from Fiddler Platform based on name and model UUID.
Parameters
name
str
-
Name of the segment.
model_id
UUID | str
-
Unique identifier for the model.
Usage
Returns
Segment instance.
Raises
NotFound
Segment with given identifier not found.
Forbidden
Current user may not have permission to view details of segment.
list()
List all segments in the given model.
Parameters
model_id
UUID
-
UUID of the model associated with the segment.
Usage
Returns
Iterable of all segment objects.
Raises
Forbidden
Current user may not have permission to view details of segment.
create()
Adds a segment to a model.
Parameters
None
Usage
Returns
Segment instance.
Raises
Conflict
Segment with same name may exist for the model.
BadRequest
Invalid definition.
NotFound
Given model may not exist.
delete()
Delete a segment.
Parameters
None
Usage
Returns
None
Raises
NotFound
Segment with given identifier not found.
Forbidden
Current user may not have permission to delete segment.
Webhooks
Webhooks integration for alerts to be posted on Slack or other apps.
Webhook()
Webhook object contains the below parameters.
id
UUID
-
Unique identifier for the webhook.
name
str
-
Unique name of the webhook.
url
str
-
Webhook integration URL.
provider
-
App in which the webhook needs to be integrated. Either 'SLACK' or 'OTHER'.
created_at
datetime
-
Time at which webhook was created.
updated_at
datetime
-
Latest time at which webhook was updated.
constructor()
Initialize a new webhook.
Parameters
name
str
-
Unique name of the webhook.
url
str
-
Webhook integration URL.
provider
-
App in which the webhook needs to be integrated.
Usage
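A minimal sketch; the webhook name and URL are placeholders:

```python
import fiddler as fdl

webhook = fdl.Webhook(
    name='ops_slack_channel',                                 # placeholder name
    url='https://hooks.slack.com/services/T0000/B0000/XXXX',  # placeholder URL
    provider=fdl.WebhookProvider.SLACK,
)
webhook.create()
```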
get()
Gets all details of a particular webhook from UUID.
Parameters
id_
UUID
-
Unique identifier for the webhook.
Usage
Returns
Webhook instance.
Raises
NotFound
Webhook with given identifier not found.
Forbidden
Current user may not have permission to view details of webhook.
from_name()
Get Webhook from Fiddler Platform based on name.
Parameters
name
str
-
Name of the webhook.
Usage
Returns
Webhook instance.
Raises
NotFound
Webhook with given name not found.
Forbidden
Current user may not have permission to view details of webhook.
list()
Gets all webhooks accessible to a user.
Parameters
None
Usage
Returns
Iterable of webhook objects.
Raises
Forbidden
Current user may not have permission to view details of webhook.
create()
Create a new webhook.
Parameters
None
Usage
Returns
Webhook object.
update()
Update an existing webhook.
Parameters
name
str
-
Unique name of the webhook.
url
str
-
Webhook integration URL.
provider
-
App in which the webhook needs to be integrated.
Usage
Returns
None
Raises
BadRequest
If field is not updatable.
delete()
Delete a webhook.
Parameters
id_
UUID
-
Unique UUID of the webhook.
Usage
Returns
None
Explainability
Explainability methods for models.
precompute_feature_importance()
Pre-compute feature importance for a model on a dataset. This is used in various places in the UI. A single feature importance can be precomputed (computed and cached) for a model.
Parameters
dataset_id
UUID
-
The unique identifier of the dataset.
num_samples
Optional[int]
None
The number of samples used.
num_iterations
Optional[int]
None
The maximum number of ablated model inferences per feature.
num_refs
Optional[int]
None
The number of reference points used in the explanation.
ci_level
Optional[float]
None
The confidence level (between 0 and 1).
update
Optional[bool]
False
Flag to indicate whether the precomputed feature importance should be recomputed and updated.
Usage
Returns
Async job details for the pre-compute job.
get_precomputed_feature_importance()
Get pre-computed global feature importance for a model over a dataset or a slice.
Parameters
None
Usage
Returns
Tuple
A named tuple with the feature importance results.
get_feature_importance()
Get global feature importance for a model over a dataset or a slice.
Parameters
data_source
-
DataSource for the input dataset to compute feature importance on (e.g. DatasetDataSource).
num_iterations
Optional[int]
None
The maximum number of ablated model inferences per feature.
num_refs
Optional[int]
None
The number of reference points used in the explanation.
ci_level
Optional[float]
None
The confidence level (between 0 and 1).
Usage
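A minimal sketch, assuming `model` and `dataset` were fetched earlier and that fdl.DatasetDataSource points the computation at that dataset:

```python
import fiddler as fdl

feature_importance = model.get_feature_importance(
    data_source=fdl.DatasetDataSource(
        env_type=fdl.EnvType.PRE_PRODUCTION,
        env_id=dataset.id,
    ),
)
```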
Returns
Tuple
A named tuple with the feature importance results.
Raises
BadRequest
If dataset id is not specified.
precompute_feature_impact()
Pre-compute feature impact for a model on a dataset. This is used in various places in the UI. A single feature impact can be precomputed (computed and cached) for a model.
Parameters
dataset_id
UUID
-
The unique identifier of the dataset.
num_samples
Optional[int]
None
The number of samples used.
num_iterations
Optional[int]
None
The maximum number of ablated model inferences per feature.
num_refs
Optional[int]
None
The number of reference points used in the explanation.
ci_level
Optional[float]
None
The confidence level (between 0 and 1).
min_support
Optional[int]
15
Only used for NLP (TEXT inputs) models. Specify a minimum support (number of times a specific word was present in the sample data) to retrieve top words. Defaults to 15.
update
Optional[bool]
False
Flag to indicate whether the precomputed feature impact should be recomputed and updated.
Usage
Returns
Async job details for the pre-compute job.
upload_feature_impact()
Upload a custom feature impact for a model of input type TABULAR. All input features must be passed for the method to run successfully; partial upload of feature impacts is not supported.
Parameters
feature_impact_map
dict
-
Feature impacts dictionary with feature name as key and impact as value. Impact value is of type float and can be positive, negative or zero.
update
Optional[bool]
False
Flag to indicate whether the feature impact is being uploaded or updated.
Usage
Returns
Dict
Dictionary with feature_names, feature_impact_scores, system_generated, model_task, model_input_type, created_at.
get_precomputed_feature_impact()
Get pre-computed global feature impact for a model over a dataset or a slice.
Parameters
None
Usage
Returns
Tuple
A named tuple with the feature impact results.
get_feature_impact()
Get global feature impact for a model over a dataset or a slice.
Parameters
data_source
-
DataSource for the input dataset to compute feature impact on (e.g. DatasetDataSource).
num_iterations
Optional[int]
None
The maximum number of ablated model inferences per feature.
num_refs
Optional[int]
None
The number of reference points used in the explanation.
ci_level
Optional[float]
None
The confidence level (between 0 and 1).
min_support
Optional[int]
15
Only used for NLP (TEXT inputs) models. Specify a minimum support (number of times a specific word was present in the sample data) to retrieve top words. Defaults to 15.
output_columns
Optional[list[str]]
None
Only used for NLP (TEXT inputs) models. Output column names to compute feature impact on.
Usage
Returns
Tuple
A named tuple with the feature impact results.
Raises
BadRequest
If dataset id is not specified or query is not valid.
precompute_predictions()
Pre-compute predictions for a model on a dataset.
Parameters
dataset_id
UUID
-
Unique identifier of the dataset used for prediction.
chunk_size
Optional[int]
None
Chunk size for fetching predictions.
update
Optional[bool]
False
Flag to indicate whether the pre-computed predictions should be re-computed and updated for this dataset.
Usage
Returns
Async job details for the prediction job.
explain()
Get explanation for a single observation.
Parameters
input_data_source
-
DataSource for the input data to compute explanation on (RowDataSource, EventIdDataSource).
ref_data_source
None
DataSource for the reference data used to compute the explanation (e.g. DatasetDataSource). Only used for non-text models and the following methods: 'SHAP', 'FIDDLER_SHAP', 'PERMUTE', 'MEAN_RESET'.
method
ExplainMethod.FIDDLER_SHAP
Explanation method name. Could be your custom explanation method or one of the following methods: 'SHAP', 'FIDDLER_SHAP', 'IG', 'PERMUTE', 'MEAN_RESET', 'ZERO_RESET'.
num_permutations
Optional[int]
None
For Fiddler SHAP, this corresponds to the number of coalitions sampled to estimate the Shapley values of each single-reference game. For the permutation algorithms, this corresponds to the number of permutations from the dataset used for the computation.
ci_level
Optional[float]
None
The confidence level (between 0 and 1) to use for the confidence intervals in Fiddler SHAP. Not used for other methods.
top_n_class
Optional[int]
None
For multiclass classification models only: specifies whether only the top n classes are computed, or all classes (when the parameter is None).
Usage
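A minimal sketch explaining a single row with Fiddler SHAP, assuming `model` and `dataset` were fetched earlier; the row values are illustrative:

```python
import fiddler as fdl

explanation = model.explain(
    input_data_source=fdl.RowDataSource(
        row={'creditscore': 620, 'geography': 'France', 'age': 42},  # illustrative row
    ),
    ref_data_source=fdl.DatasetDataSource(
        env_type=fdl.EnvType.PRE_PRODUCTION,
        env_id=dataset.id,
    ),
    method=fdl.ExplainMethod.FIDDLER_SHAP,
)
```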
Returns
Tuple
A named tuple with the explanation results.
Raises
NotSupported
If specified source type is not supported.
download_data()
Download data for a given environment and segment to a CSV or Parquet file. A maximum of 10M rows can be downloaded.
Parameters
output_dir
Union[Path, str]
-
Path to download the file.
env_type
-
Type of environment to query (PRODUCTION or PRE_PRODUCTION).
env_id
UUID
None
If the PRE_PRODUCTION environment is selected, provide the UUID of the dataset to query.
start_time
Optional[datetime]
None
Start time to retrieve data, only for PRODUCTION env. If no time zone is indicated, UTC is assumed.
end_time
Optional[datetime]
None
End time to retrieve data, only for PRODUCTION env. If no time zone is indicated, UTC is assumed.
segment_id
Optional[UUID]
None
Optional segment UUID to query data using a saved segment associated with the model.
segment_definition
Optional[str]
None
Optional segment FQL definition to query data using an applied segment. This segment will not be saved to the model.
columns
Optional[List[str]]
None
Allows the caller to explicitly specify the list of columns to retrieve. Defaults to None, which fetches all columns from the model.
max_rows
Optional[int]
None
Maximum number of rows to fetch.
chunk_size
Optional[int]
1000
Number of rows per chunk when downloading data. You can increase this number for faster downloads if you query fewer than 1000 columns and have no vector columns.
fetch_vectors
Optional[bool]
None
Whether vector columns are fetched. Defaults to False.
output_format
PARQUET
Indicates whether the result should be a CSV or a Parquet file.
Usage
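A minimal sketch downloading one week of production data, assuming `model` was fetched earlier; the directory and date range are illustrative:

```python
from datetime import datetime

import fiddler as fdl

model.download_data(
    output_dir='exports/',                    # illustrative output directory
    env_type=fdl.EnvType.PRODUCTION,
    start_time=datetime(2024, 9, 1),          # interpreted as UTC
    end_time=datetime(2024, 9, 8),
    max_rows=100_000,
    output_format=fdl.DownloadFormat.PARQUET,
)
```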
Returns
Parquet or CSV file with slice data contents downloaded to the Path mentioned in output_dir.
Raises
BadRequest
If the given segment definition is invalid.
predict()
Run model on an input dataframe.
Parameters
df
pd.DataFrame
None
Feature dataframe.
chunk_size
Optional[int]
None
Chunk size for fetching predictions.
Usage
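A minimal sketch, assuming `model` has an artifact or surrogate that can serve predictions and that the CSV contains the model's input columns:

```python
import pandas as pd

# `model` as fetched earlier, e.g. via fdl.Model.from_name(...).
df = pd.read_csv('sample_inputs.csv')  # illustrative feature rows
predictions = model.predict(df=df)     # returns a pandas DataFrame of outputs
print(predictions.head())
```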
Returns
Dataframe
A pandas DataFrame of the predictions.
Constants
ModelInputType
Input data type used by the model.
ModelInputType.TABULAR
For tabular models.
ModelInputType.TEXT
For text models.
ModelInputType.MIXED
For models which can be a mixture of text and tabular.
ModelTask
The model’s algorithm type.
ModelTask.REGRESSION
For regression models.
ModelTask.BINARY_CLASSIFICATION
For binary classification models.
ModelTask.MULTICLASS_CLASSIFICATION
For multiclass classification models.
ModelTask.RANKING
For ranking models.
ModelTask.LLM
For LLM models.
ModelTask.NOT_SET
For other model tasks or no model task specified.
DataType
The available data types when defining a model Column.
DataType.FLOAT
For floats.
DataType.INTEGER
For integers.
DataType.BOOLEAN
For booleans.
DataType.STRING
For strings.
DataType.CATEGORY
For categorical types.
DataType.TIMESTAMP
For 32-bit Unix timestamps.
DataType.VECTOR
For vector types.
CustomFeatureType
This is an enumeration defining the types of custom features that can be created.
CustomFeatureType.FROM_COLUMNS
Represents custom features derived directly from columns.
CustomFeatureType.FROM_VECTOR
Represents custom features derived from a vector column.
CustomFeatureType.FROM_TEXT_EMBEDDING
Represents custom features derived from text embeddings.
CustomFeatureType.FROM_IMAGE_EMBEDDING
Represents custom features derived from image embeddings.
CustomFeatureType.ENRICHMENT
Represents custom features derived from an enrichment.
ArtifactType
Indicator of type of a model artifact.
ArtifactType.SURROGATE
For surrogates.
ArtifactType.PYTHON_PACKAGE
For python package.
DeploymentType
Indicator of how the model was deployed.
DeploymentType.BASE_CONTAINER
For base containers.
DeploymentType.MANUAL
For manual deployment.
EnvType
Environment type of a dataset.
EnvType.PRODUCTION
For production events.
EnvType.PRE_PRODUCTION
For pre production events.
BaselineType
Type of a baseline.
BaselineType.STATIC
For static production baseline.
BaselineType.ROLLING
For rolling production baseline.
DownloadFormat
File format to download
DownloadFormat.PARQUET
Download data into a Parquet file
DownloadFormat.CSV
Download data into a CSV file
WindowBinSize
Window for rolling baselines.
WindowBinSize.HOUR
For rolling window to be 1 hour.
WindowBinSize.DAY
For rolling window to be 1 day.
WindowBinSize.WEEK
For rolling window to be 1 week.
WindowBinSize.MONTH
For rolling window to be 1 month.
WebhookProvider
Specifies the integration provider or OTHER for generic callback response.
WebhookProvider.SLACK
For slack.
WebhookProvider.OTHER
For any other app.
AlertCondition
Specifies the comparison operator to use for an alert threshold value.
AlertCondition.GREATER
The greater than operator.
AlertCondition.LESSER
The less than operator.
BinSize
Specifies the duration of each evaluation bin for an alert.
BinSize.HOUR
The 1 hour bin.
BinSize.DAY
The 1 day bin.
BinSize.WEEK
The 7 day bin.
BinSize.MONTH
The 30 day bin.
CompareTo
Specifies the type of evaluation to use for an alert.
CompareTo.RAW_VALUE
For an absolute comparison of a specified value to the alert metric.
CompareTo.TIME_PERIOD
For a relative comparison of the alert metric to the same metric from a previous time period.
Priority
Priority level label for alerts.
Priority.LOW
The low priority label.
Priority.MEDIUM
The medium priority label.
Priority.HIGH
The high priority label.
Severity
Severity level for alerts.
Severity.DEFAULT
For AlertRule when none of the thresholds have passed.
Severity.WARNING
For AlertRule when alert crossed the warning_threshold but not the critical_threshold.
Severity.CRITICAL
For AlertRule when alert crossed the critical_threshold.
Alert Metric ID
AlertRule metric_id parameter constants.
Drift
jsd
Jensen-Shannon Distance
psi
Population Stability Index
Service Metrics
traffic
Traffic
Data Integrity
null_violation_count
Missing Value Violation
type_violation_count
Type Violation
range_violation_count
Range Violation
any_violation_count
Any Violation
null_violation_percentage
% Missing Value Violation
type_violation_percentage
% Type Violation
range_violation_percentage
% Range Violation
any_violation_percentage
% Any Violation
Statistics
sum
Sum
average
Average
frequency
Frequency
Performance
accuracy
Accuracy
log_loss
Log Loss
map
MAP
ndcg_mean
NDCG
query_count
Query Count
precision
Precision
recall
Recall / TPR
f1_score
F1
geometric_mean
Geometric Mean
data_count
Total Count
expected_calibration_error
Expected Calibration Error
auc
AUC
auroc
AUROC
calibrated_threshold
Calibrated Threshold
fpr
False Positive Rate
Custom Metrics
UUID of custom metric
Custom Metric Name
Schemas
Column
A model column representation.
name
str
None
Column name provided by the customer.
data_type
None
Data type of the column (see DataType).
min
Union[int, float]
None
Min value of integer/float column.
max
Union[int, float]
None
Max value of integer/float column.
categories
list
None
List of unique values of a categorical column.
bins
list[Union[int, float]]
None
Bins of integer/float column.
replace_with_nulls
list
None
Replace the given values with NULL if they are found in the events data.
n_dimensions
int
None
Number of dimensions of a vector column.
fdl.Enrichment (Private Preview)
name
str
The name of the custom feature to generate.
enrichment
str
The enrichment operation to be applied.
columns
List[str]
The column names on which the enrichment depends.
config
Optional[dict]
{}
Configuration specific to the enrichment operation, which controls its behavior.
Note
Enrichments are disabled by default. To enable them, contact your administrator. Failing to do so will result in an error during the model creation (create()) call.
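A hedged sketch of declaring an enrichment as a custom feature on the model spec; the enrichment operation name and columns are illustrative, and ModelSpec.custom_features is assumed per the Fiddler 3.x client:

```python
import fiddler as fdl

# Sketch: enrichments are declared as custom features on the model spec.
spec = fdl.ModelSpec(
    inputs=['question'],
    custom_features=[
        fdl.Enrichment(
            name='Question Embedding',  # illustrative name
            enrichment='embedding',     # enrichment operation to apply
            columns=['question'],       # column(s) the enrichment depends on
        ),
    ],
)
```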
Embedding (Private Preview)
Supported Models:
| Model | Size | Type | Pooling method | Notes |
| --- | --- | --- | --- | --- |
| BAAI/bge-small-en-v1.5 | small | Sentence Transformer | | |
| sentence-transformers/all-MiniLM-L6-v2 | med | Sentence Transformer | | |
| thenlper/gte-base | med | Sentence Transformer | | (default) |
| gpt2 | med | Encoder NLP Transformer | last_token | |
| distilgpt2 | small | Encoder NLP Transformer | last_token | |
| EleuteherAI/gpt-neo-125m | med | Encoder NLP Transformer | last_token | |
| google/bert_uncased_L-4_H-256_A-4 | small | Decoder NLP Transformer | first_token | Smallest Bert |
| bert-base-cased | med | Decoder NLP Transformer | first_token | |
| distilroberta-base | med | Decoder NLP Transformer | first_token | |
| xlm-roberta-large | large | Decoder NLP Transformer | first_token | Multilingual |
| roberta-large | large | Decoder NLP Transformer | first_token | |
If embeddings have already been generated for any field and sent to Fiddler, they can be imported for visualization in UMAP by modifying the column
field of TextEmbedding to be the column with the embeddings. The Embedding enrichment can also be removed for the corresponding input field, as there is no need for Fiddler to generate the embeddings in the case that embeddings are prepopulated and imported into Fiddler.
The above example will lead to the generation of a new column:
FDL Question Embedding
vector
Embeddings corresponding to the string column question.
Note
In the context of Hugging Face models, particularly transformer-based models used for generating embeddings, the pooling_method determines how the model processes the output of its layers to produce a single vector representation for input sequences (like sentences or documents). This is crucial when using these models for tasks like sentence or document embedding, where you need a fixed-size vector representation regardless of the input length.
Centroid Distance (Private Preview)
The above example will lead to the generation of a new column:
FDL Centroid Distance (question_embedding)
float
Distance from the nearest K-Means centroid present in question_embedding.
Note
Does not calculate membership for pre-production data, so you cannot calculate drift. Centroid Distance is automatically added if the TextEmbedding enrichment is created for any given model.
Personally Identifiable Information (Private Preview)
List of PII entities
CREDIT_CARD
A credit card number, between 12 and 19 digits.
Pattern match and checksum
4111111111111111, 378282246310005 (American Express)
CRYPTO
A Crypto wallet number. Currently only Bitcoin address is supported
Pattern match, context and checksum
1BoatSLRHtKNngkdXEeobR76b53LETtpyT
DATE_TIME
Absolute or relative dates or periods or times smaller than a day.
Pattern match and context
../2024
EMAIL_ADDRESS
An email address identifies an email box to which email messages are delivered
Pattern match, context and RFC-822 validation
trust@fiddler.ai
IBAN_CODE
The International Bank Account Number (IBAN) is an internationally agreed system of identifying bank accounts across national borders to facilitate the communication and processing of cross border transactions with a reduced risk of transcription errors.
Pattern match, context and checksum
DE89 3704 0044 0532 0130 00
IP_ADDRESS
An Internet Protocol (IP) address (either IPv4 or IPv6).
Pattern match, context and checksum
1.2.3.4
127.0.0.12/16
1234:BEEF:3333:4444:5555:6666:7777:8888
LOCATION
Name of politically or geographically defined location (cities, provinces, countries, international regions, bodies of water, mountains
Custom logic and context
PALO ALTO Japan
PERSON
A full person name, which can include first names, middle names or initials, and last names.
Custom logic and context
Joanna Doe
PHONE_NUMBER
A telephone number
Custom logic, pattern match and context
5556667890
URL
A URL (Uniform Resource Locator), unique identifier used to locate a resource on the Internet
Pattern match, context and top level url validation
US SSN
A US Social Security Number (SSN) with 9 digits.
Pattern match and context
1234-00-5678
US_DRIVER_LICENSE
A US driver's license number.
Pattern match and context
US_ITIN
US Individual Taxpayer Identification Number (ITIN). Nine digits that start with a "9" and contain a "7" or "8" as the fourth digit.
Pattern match and context
912-34-1234
US_PASSPORT
A US passport number begins with a letter, followed by eight numbers
Pattern match and context
L12345678
The above example will lead to generation of new columns:
FDL Rag PII (question)
bool
Whether any PII was detected.
FDL Rag PII (question) Matches
str
The matches in the raw text that were flagged as potential PII (e.g. 'Douglas MacArthur, Korean').
FDL Rag PII (question) Entities
str
The entities these matches were tagged as (e.g. 'PERSON').
Note
The PII enrichment is integrated with Presidio.
Evaluate (Private Preview)
Here is a summary of the three evaluation metrics for natural language generation:
bleu: Measures precision of word n-grams between generated and reference texts. Simple, fast, and widely used, but ignores recall, meaning, and word order.
rouge: Measures recall of word n-grams and longest common subsequences. Captures more information than BLEU, but still relies on word matching rather than semantic similarity.
meteor: Incorporates recall, precision, and additional semantic matching based on stems and paraphrasing. More robust and flexible than BLEU and ROUGE, but requires linguistic resources and alignment algorithms.
The above example generates 6 new columns:
FDL QA Evaluate (bleu)
float
FDL QA Evaluate (rouge1)
float
FDL QA Evaluate (rouge2)
float
FDL QA Evaluate (rougel)
float
FDL QA Evaluate (rougelsum)
float
FDL QA Evaluate (meteor)
float
Textstat (Private Preview)
Supported Statistics
char_count
Total number of characters in text, including everything.
Assessing text length, useful for platforms with character limits.
letter_count
Total number of letters only, excluding numbers, punctuation, spaces.
Gauging text complexity, used in readability formulas.
miniword_count
Count of small words (usually 1-3 letters).
Specific readability analyses, especially for simplistic texts.
words_per_sentence
Average number of words in each sentence.
Understanding sentence complexity and structure.
polysyllabcount
Number of words with more than three syllables.
Analyzing text complexity, used in some readability scores.
lexicon_count
Total number of words in the text.
General text analysis, assessing overall word count.
syllable_count
Total number of syllables in the text.
Used in readability formulas, measures text complexity.
sentence_count
Total number of sentences in the text.
Analyzing text structure, used in readability scores.
flesch_reading_ease
Readability score indicating how easy a text is to read (higher scores = easier).
Assessing readability for a general audience.
smog_index
Measures years of education needed to understand a text.
Evaluating text complexity, especially for higher education texts.
flesch_kincaid_grade
Grade level associated with the complexity of the text.
Educational settings, determining appropriate grade level for texts.
coleman_liau_index
Grade level needed to understand the text based on sentence length and letter count.
Assessing readability for educational purposes.
automated_readability_index
Estimates the grade level needed to comprehend the text.
Evaluating text difficulty for educational materials.
dale_chall_readability_score
Assesses text difficulty based on a list of familiar words for average American readers.
Determining text suitability for average readers.
difficult_words
Number of words not on a list of commonly understood words.
Analyzing text difficulty, especially for non-native speakers.
linsear_write_formula
Readability formula estimating grade level of text based on sentence length and easy word count.
Simplifying texts, especially for lower reading levels.
gunning_fog
Estimates the years of formal education needed to understand the text.
Assessing text complexity, often for business or professional texts.
long_word_count
Number of words longer than a certain length (often 6 or 7 letters).
Evaluating complexity and sophistication of language used.
monosyllabcount
Count of words with only one syllable.
Readability assessments, particularly for simpler texts.
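A minimal sketch of a Textstat enrichment follows; the 'textstat' identifier and the 'statistics' config key are assumptions:

```python
import fiddler as fdl

# Sketch: compute two text statistics on the 'question' column.
# The 'textstat' identifier and 'statistics' config key are assumptions.
text_statistics = fdl.Enrichment(
    name='Text Statistics',
    enrichment='textstat',
    columns=['question'],
    config={'statistics': ['char_count', 'dale_chall_readability_score']},
)
```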
The above example leads to the creation of two additional columns:
FDL Text Statistics (question) char_count
int
Character count of the string in the question column.
FDL Text Statistics (question) dale_chall_readability_score
float
Readability score of the string in the question column.
Sentiment (Private Preview)
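A minimal sketch of a Sentiment enrichment follows; the 'sentiment' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: sentiment scoring of the 'question' column.
# The 'sentiment' enrichment identifier is an assumption.
question_sentiment = fdl.Enrichment(
    name='Question Sentiment',
    enrichment='sentiment',
    columns=['question'],
)
```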
The above example leads to the creation of two columns:
FDL Question Sentiment (question) compound
float
Raw score of sentiment.
FDL Question Sentiment (question) sentiment
string
One of positive, negative, and neutral.
Profanity (Private Preview)
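A minimal sketch of a Profanity enrichment follows; the 'profanity' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: profanity detection on the 'prompt' and 'response' columns.
# The 'profanity' enrichment identifier is an assumption.
profanity = fdl.Enrichment(
    name='Profanity',
    enrichment='profanity',
    columns=['prompt', 'response'],
)
```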
The above example leads to the creation of two columns:
FDL Profanity (prompt) contains_profanity
bool
Indicates whether the value of the prompt column contains profanity.
FDL Profanity (response) contains_profanity
bool
Indicates whether the value of the response column contains profanity.
Answer Relevance (Private Preview)
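A minimal sketch follows; the 'answer_relevance' identifier and the column-mapping config keys are assumptions:

```python
import fiddler as fdl

# Sketch: check whether 'response' is relevant to 'prompt'.
# Identifier and config keys are assumptions.
answer_relevance = fdl.Enrichment(
    name='Answer Relevance',
    enrichment='answer_relevance',
    columns=['prompt', 'response'],
    config={'prompt': 'prompt', 'response': 'response'},
)
```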
The above example will lead to the generation of a new column:
FDL Answer Relevance
bool
Binary metric, which is True if the response is relevant to the prompt.
Faithfulness (Private Preview)
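A minimal sketch follows; the 'faithfulness' identifier and config keys are assumptions:

```python
import fiddler as fdl

# Sketch: check facts in 'response' against the 'context' column.
# Identifier and config keys are assumptions.
faithfulness = fdl.Enrichment(
    name='Faithfulness',
    enrichment='faithfulness',
    columns=['context', 'response'],
    config={'context_field': 'context', 'response_field': 'response'},
)
```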
The above example will lead to the generation of a new column:
FDL Faithfulness
bool
Binary metric, which is True if the facts used in the response are correctly drawn from the context columns.
Coherence (Private Preview)
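A minimal sketch follows; the 'coherence' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: coherence check on the 'response' column.
# The 'coherence' enrichment identifier is an assumption.
coherence = fdl.Enrichment(
    name='Coherence',
    enrichment='coherence',
    columns=['response'],
)
```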
The above example will lead to the generation of a new column:
FDL Coherence
bool
Binary metric, which is True if the response makes coherent arguments which flow well.
Conciseness (Private Preview)
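A minimal sketch follows; the 'conciseness' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: conciseness check on the 'response' column.
# The 'conciseness' enrichment identifier is an assumption.
conciseness = fdl.Enrichment(
    name='Conciseness',
    enrichment='conciseness',
    columns=['response'],
)
```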
The above example will lead to the generation of a new column:
FDL Conciseness
bool
Binary metric, which is True if the response is concise, and not overly verbose.
Toxicity (Private Preview)
Toxic-Chat
0.4
0.64
0.24
Usage
The code snippet below shows how to enable toxicity scoring on the prompt and response columns for each event published to Fiddler.
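A minimal sketch; the 'toxicity' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: toxicity scoring on the 'prompt' and 'response' columns.
# The 'toxicity' enrichment identifier is an assumption.
toxicity = fdl.Enrichment(
    name='Toxicity',
    enrichment='toxicity',
    columns=['prompt', 'response'],
)
```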
The above example leads to the creation of two columns each for prompt and response, containing the prediction probability and the model decision.
For example, for the prompt column the following two columns will be generated:
FDL Toxicity (prompt) toxicity_prob
float
Model prediction probability, between 0 and 1.
FDL Toxicity (prompt) contains_toxicity
bool
Model prediction either 0 or 1.
Regex Match (Private Preview)
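A minimal sketch follows; the 'regex_match' identifier and the 'regex' config key are assumptions:

```python
import fiddler as fdl

# Sketch: flag responses consisting only of digits.
# Identifier and config key are assumptions.
regex_only_digits = fdl.Enrichment(
    name='Regex - only digits',
    enrichment='regex_match',
    columns=['response'],
    config={'regex': r'^\d+$'},
)
```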
The above example will lead to the generation of a new column:
FDL Regex - only digits
category
Match or No Match, depending on whether the regex specified in the config matches the string.
Topic (Private Preview)
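A minimal sketch follows; the 'topic_model' identifier and the 'topics' config key are assumptions:

```python
import fiddler as fdl

# Sketch: zero-shot topic classification of 'response' over three topics.
# Identifier and config key are assumptions.
topics = fdl.Enrichment(
    name='Topics',
    enrichment='topic_model',
    columns=['response'],
    config={'topics': ['sports', 'politics', 'technology']},
)
```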
The above example leads to the creation of two columns -
FDL Topics (response) topic_model_scores
list[float]
Indicates the probability of the given column belonging to each of the topics specified in the Enrichment config. Each float value indicates the probability of the input being classified into the corresponding topic, in the same order as the topics. Each value is between 0 and 1. The values do not sum to 1, as each classification is performed independently of the other topics.
FDL Topics (response) max_score_topic
string
Topic with the maximum score from the list of topic names specified in the Enrichment config.
Banned Keyword Detector (Private Preview)
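A minimal sketch follows; the 'banned_keywords' identifier and config key are assumptions:

```python
import fiddler as fdl

# Sketch: flag events whose prompt or response contains a banned keyword.
# Identifier and config key are assumptions; keywords are illustrative.
banned_kw = fdl.Enrichment(
    name='Banned KW',
    enrichment='banned_keywords',
    columns=['prompt', 'response'],
    config={'banned_keywords': ['competitor_a', 'competitor_b']},
)
```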
The above example leads to the creation of two columns -
FDL Banned KW (prompt) contains_banned_kw
bool
Indicates whether the value of the prompt column contains any of the specified banned keywords.
FDL Banned KW (response) contains_banned_kw
bool
Indicates whether the value of the response column contains any of the specified banned keywords.
Language Detector (Private Preview)
The language detector leverages fastText models for language detection.
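A minimal sketch follows; the 'language_detection' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: detect the language of the 'prompt' column.
# The 'language_detection' enrichment identifier is an assumption.
language = fdl.Enrichment(
    name='Language',
    enrichment='language_detection',
    columns=['prompt'],
)
```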
The above example leads to the creation of two columns -
FDL Language (prompt) language
string
Language prediction for input text
FDL Language (prompt) language_probability
float
To indicate the confidence probability of language prediction
Fast Safety (Private Preview)
The Fast safety enrichment evaluates the safety of the text along ten different dimensions: illegal, hateful, harassing, racist, sexist, violent, sexual, harmful, unethical, jailbreaking. These dimensions are all returned by default, but can be selectively chosen as needed. Fast safety is generated through the Fast Trust Models.
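A minimal sketch follows; the 'ftl_prompt_safety' identifier is an assumption:

```python
import fiddler as fdl

# Sketch: fast safety scoring of the 'prompt' column across all ten dimensions.
# The 'ftl_prompt_safety' enrichment identifier is an assumption.
prompt_safety = fdl.Enrichment(
    name='Prompt Safety',
    enrichment='ftl_prompt_safety',
    columns=['prompt'],
)
```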
The above example leads to the creation of two columns for each dimension -
FDL Prompt Safety (prompt) dimension
bool
Binary metric, which is True if the input is deemed unsafe, False otherwise.
FDL Prompt Safety (prompt) dimension score
float
To indicate the confidence probability of the safety prediction.
Fast Faithfulness (Private Preview)
The Fast faithfulness enrichment is designed to evaluate the accuracy and reliability of facts presented in AI-generated text responses. Fast faithfulness is generated through the Fast Trust Models.
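A minimal sketch follows; the 'ftl_response_faithfulness' identifier and config keys are assumptions:

```python
import fiddler as fdl

# Sketch: fast faithfulness check of 'response' against 'context'.
# Identifier and config keys are assumptions.
fast_faithfulness = fdl.Enrichment(
    name='Faithfulness',
    enrichment='ftl_response_faithfulness',
    columns=['context', 'response'],
    config={'context_field': 'context', 'response_field': 'response'},
)
```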
The above example leads to the creation of two columns -
FDL Faithfulness faithful
bool
Binary metric, which is True if the facts used in the response are correctly drawn from the context columns.
FDL Faithfulness faithful score
float
To indicate the confidence probability of faithfulness prediction
SQL Validation (Private Preview)
The SQL Validation enrichment is designed to evaluate different query dialects for syntax correctness.
Query validation is syntax based and does not check against any existing schema or database for validity.
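A minimal sketch follows; the 'sql_validation' identifier and the 'dialect' config key are assumptions:

```python
import fiddler as fdl

# Sketch: validate the 'response' column as ANSI SQL.
# Identifier and config key are assumptions.
sql_validation = fdl.Enrichment(
    name='SQL Validator',
    enrichment='sql_validation',
    columns=['response'],
    config={'dialect': 'ansi'},
)
```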
The above example leads to the creation of two columns -
SQL Validator valid
bool
True if the query string is syntactically valid for the specified dialect, False if not.
SQL Validator errors
str
If syntax errors are found, they will be returned as a JSON-serialized string containing a list of dictionaries describing the errors.
JSON Validation (Private Preview)
The JSON Validation enrichment is designed to evaluate strings for correct JSON syntax and optionally against a user-defined schema for validation.
This enrichment uses the python-jsonschema library for JSON schema validation. The defined validation_schema must be a valid python-jsonschema schema.
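A minimal sketch follows; the 'json_validation' identifier and the placement of validation_schema inside config are assumptions:

```python
import fiddler as fdl

# Sketch: validate 'response' as JSON against a simple schema.
# Identifier and config layout are assumptions.
json_validation = fdl.Enrichment(
    name='JSON Validator',
    enrichment='json_validation',
    columns=['response'],
    config={
        'validation_schema': {  # a valid python-jsonschema schema
            'type': 'object',
            'properties': {'answer': {'type': 'string'}},
            'required': ['answer'],
        }
    },
)
```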
The above example leads to the creation of two columns -
JSON Validator valid
bool
True if the string is valid JSON, False if not.
JSON Validator errors
str
If the string fails to parse as JSON, any parsing errors will be returned as a JSON-serialized list of dictionaries.
ModelTaskParams
Task parameters given to a particular model.
binary_classification_threshold
float
None
Threshold for converting predicted probabilities to binary labels.
target_class_order
list
None
Order of target classes.
group_by
str
None
Query/session id column for ranking models.
top_k
int
None
Top k results to consider when computing ranking metrics.
class_weights
list[float]
None
Weight of each class.
weighted_ref_histograms
bool
None
Whether baseline histograms must be weighted or not when calculating drift metrics.
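A minimal construction sketch for a ranking model (the column name and values are illustrative):

```python
import fiddler as fdl

# Sketch: task parameters for a ranking model; values are illustrative.
task_params = fdl.ModelTaskParams(
    group_by='session_id',  # query/session id column
    top_k=10,               # compute ranking metrics over the top 10 results
)
```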
ModelSchema
Model schema defines the list of columns associated with a model version.
schema_version
int
1
Schema version.
columns
None
List of columns.
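Schemas are typically inferred from a sample of data, but a minimal hand-built sketch looks like this (fdl.Column and fdl.DataType are assumed client names):

```python
import fiddler as fdl

# Sketch: a two-column schema; fdl.Column / fdl.DataType names are assumptions.
schema = fdl.ModelSchema(
    columns=[
        fdl.Column(name='age', data_type=fdl.DataType.INTEGER),
        fdl.Column(name='churn', data_type=fdl.DataType.BOOLEAN),
    ],
)
```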
ModelSpec
Model spec defines how model columns are used along with model task.
schema_version
int
1
Schema version.
inputs
list[str]
None
Feature columns.
outputs
list[str]
None
Prediction columns.
targets
list[str]
None
Label columns.
decisions
list[str]
None
Decisions columns.
metadata
list[str]
None
Metadata columns
custom_features
None
Custom feature definitions.
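A minimal sketch mapping columns to their roles (column names are illustrative):

```python
import fiddler as fdl

# Sketch: wire model columns to their roles; names are illustrative.
model_spec = fdl.ModelSpec(
    inputs=['age', 'income'],
    outputs=['predicted_churn'],
    targets=['churn'],
    metadata=['customer_id'],
)
```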
CustomFeature
The base class for derived features such as Multivariate, VectorFeature, etc.
name
str
None
The name of the custom feature.
type
None
The type of custom feature. Must be one of the CustomFeatureType enum values.
n_clusters
Optional[int]
5
The number of clusters.
centroids
Optional[List]
None
Centroids of the clusters in the embedded space. The number of centroids equals n_clusters.
Multivariate
Represents custom features derived from multiple columns.
type
CustomFeatureType.FROM_COLUMNS
Indicates this feature is derived from multiple columns.
columns
List[str]
None
List of original columns from which this feature is derived.
monitor_components
bool
False
Whether to monitor each column in columns as an individual feature. If set to True, components are monitored and drift will be available.
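A minimal sketch (column names are illustrative):

```python
import fiddler as fdl

# Sketch: monitor three columns jointly, plus each one individually.
loan_features = fdl.Multivariate(
    name='loan_features',
    columns=['age', 'income', 'loan_amount'],
    monitor_components=True,
)
```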
VectorFeature
Represents custom features derived from a single vector column.
type
CustomFeatureType.FROM_VECTOR
Indicates this feature is derived from a single vector column.
source_column
Optional[str]
None
Specifies the original column if this feature is derived from an embedding.
column
str
None
The vector column name.
TextEmbedding
Represents custom features derived from text embeddings.
type
CustomFeatureType.FROM_TEXT_EMBEDDING
Indicates this feature is derived from a text embedding.
n_tags
Optional[int]
5
How many tags (tokens) the text embedding uses in each cluster for the TF-IDF summarization in drift computation.
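A minimal sketch linking an embedding column to its source text column (column names are illustrative):

```python
import fiddler as fdl

# Sketch: a text-embedding custom feature; names are illustrative.
question_embedding = fdl.TextEmbedding(
    name='question_embedding',
    source_column='question',
    column='question_vector',
    n_clusters=6,
    n_tags=5,
)
```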
ImageEmbedding
Represents custom features derived from image embeddings.
type
CustomFeatureType.FROM_IMAGE_EMBEDDING
Indicates this feature is derived from an image embedding.
Enrichment
Represents custom features derived from enrichment.
type
CustomFeatureType.ENRICHMENT
Indicates this feature is derived from enrichment.
columns
List[str]
None
List of original columns from which this feature is derived.
enrichment
str
None
A string identifier for the type of enrichment to be applied.
config
Dict[str, Any]
None
A dictionary containing configuration options for the enrichment.
XaiParams
Represents the explainability parameters.
custom_explain_methods
List[str]
None
User-defined explain_custom methods of the model object defined in package.py.
default_explain_method
Optional[str]
None
Default explanation method.
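A minimal sketch, assuming a method named explain_custom is defined on the model object in package.py:

```python
import fiddler as fdl

# Sketch: register a user-defined explanation method (name is illustrative).
xai_params = fdl.XaiParams(
    custom_explain_methods=['explain_custom'],
    default_explain_method='explain_custom',
)
```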
DeploymentParams
Deployment parameters of a particular model.
artifact_type
str
Type of artifact upload.
deployment_type
None
Type of deployment.
image_uri
Optional[str]
md-base/python/python-311:1.0.0
Reference to the docker image used to serve the model.
replicas
Optional[str]
1
The number of replicas running the model. Minimum value: 1 Maximum value: 10 Default value: 1
cpu
Optional[str]
100
The amount of CPU (milli cpus) reserved per replica. Minimum value: 10 Maximum value: 4000 (4vCPUs) Default value: 100
memory
Optional[str]
256
The amount of memory (mebibytes) reserved per replica. Minimum value: 150 Maximum value: 16384 (16GiB) Default value: 256
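A minimal sketch; the ArtifactType and DeploymentType enum values shown are assumptions:

```python
import fiddler as fdl

# Sketch: resource settings for a model deployment.
deployment_params = fdl.DeploymentParams(
    artifact_type=fdl.ArtifactType.PYTHON_PACKAGE,  # assumed enum value
    deployment_type=fdl.DeploymentType.MANUAL,      # assumed enum value
    cpu=300,      # milli CPUs per replica
    memory=512,   # mebibytes per replica
    replicas=1,
)
```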
RowDataSource
Explainability input source for row data.
row
Dict
None
Dictionary containing row details.
EventIdDataSource
Explainability input source for event data.
event_id
str
None
Unique ID for event.
env_id
Optional[Union[str, UUID]]
None
Unique ID for environment.
env_type
None
Environment type.
DatasetDataSource
Reference data source for explainability.
env_type
None
Environment type.
num_samples
Optional[int]
None
Number of samples to select for computation.
env_id
Optional[Union[str, UUID]]
None
Unique ID for environment.
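A minimal sketch of the three explainability input sources (IDs and values are illustrative; fdl.EnvType is an assumed enum name):

```python
import fiddler as fdl

# Sketch: explain a raw row, a published event, or a sampled environment.
row_source = fdl.RowDataSource(row={'age': 30, 'income': 72000.0})

event_source = fdl.EventIdDataSource(
    event_id='a1b2c3d4',
    env_type=fdl.EnvType.PRODUCTION,  # assumed enum
)

dataset_source = fdl.DatasetDataSource(
    env_type=fdl.EnvType.PRE_PRODUCTION,  # assumed enum
    num_samples=1000,
)
```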
Helper functions
set_logging
Set app logger at given log level.
Parameters
level
int
logging.INFO
Logging level from python logging module
Usage
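A minimal sketch using the documented level parameter:

```python
import logging

import fiddler as fdl

# Raise client logging from the default logging.INFO to DEBUG.
fdl.set_logging(level=logging.DEBUG)
```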
Returns
None
group_by
Group the events by a column. Use this method to form the grouped data for ranking models.
Parameters
df
pd.DataFrame
-
The dataframe of events to group.
group_by_col
str
-
The column to group the data by.
output_path
Optional[Path | str]
-
Optional path to write the grouped dataframe to.
Usage
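A minimal sketch (the input file and column names are illustrative, and the helper's import path may differ):

```python
import pandas as pd
import fiddler as fdl

# Group ranking events by their query/session id column before publishing.
events_df = pd.read_csv('ranking_events.csv')  # hypothetical input
grouped_df = fdl.group_by(df=events_df, group_by_col='query_id')
```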
Returns
pd.DataFrame
Dataframe in grouped format.