API Methods 2.x

Connecting to Fiddler

fdl.FiddlerApi

The Client object is used to communicate with Fiddler. In order to use the client, you'll need to provide authentication details as shown below.

For more information, see Authorizing the Client.

🚧 Warning

If verbose is set to True, all information required for debugging will be logged, including the authorization token.

📘 Info

To maximize compatibility, please ensure that your client version matches the server version of your Fiddler instance.

When you connect to Fiddler using the code below, you'll receive a notification if there is a version mismatch between the client and server.

You can install a specific version of fiddler-client using pip: pip install fiddler-client==X.X.X
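You can check the installed client version from Python before connecting (a quick sketch; this assumes the package exposes __version__, as recent fiddler-client releases do):

import fiddler as fdl

# Compare this value against your Fiddler server version
print(fdl.__version__)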

import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN
)

# To skip TLS certificate verification (e.g., for self-signed certificates):
import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    verify=False
)

# To connect through a proxy:
proxies = {
    'http' : 'http://proxy.example.com:1234',
    'https': 'https://proxy.example.com:5678'
}

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    proxies=proxies
)

If you want to authenticate with Fiddler without passing this information directly into the function call, you can store it in a file named fiddler.ini, saved in the same directory as your notebook or script.

%%writefile fiddler.ini

[FIDDLER]
url = https://app.fiddler.ai
org_id = my_org
auth_token = p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58

client = fdl.FiddlerApi()


Projects

Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).

A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).

For more information on projects, click here.


client.list_projects

response = client.list_projects()
[
  'project_a',
  'project_b',
  'project_c'
]

client.create_project

PROJECT_ID = 'example_project'

client.create_project(
    project_id=PROJECT_ID
)
{
    'project_name': 'example_project'
}

client.delete_project

PROJECT_ID = 'example_project'

client.delete_project(
    project_id=PROJECT_ID
)
True

🚧 Caution

You cannot delete a project without deleting the datasets and the models associated with that project.
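A minimal teardown sketch using only the client methods documented on this page (order matters: models first, then datasets, then the project):

PROJECT_ID = 'example_project'

# Delete all models in the project first
for model_id in client.list_models(project_id=PROJECT_ID):
    client.delete_model(project_id=PROJECT_ID, model_id=model_id)

# Then delete all datasets
for dataset_id in client.list_datasets(project_id=PROJECT_ID):
    client.delete_dataset(project_id=PROJECT_ID, dataset_id=dataset_id)

# Finally, delete the project itself
client.delete_project(project_id=PROJECT_ID)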



Datasets

Datasets (or baseline datasets) are used for making comparisons with production data.

A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.

For more information, see Uploading a Baseline Dataset.

For guidance on how to design a baseline dataset, see Designing a Baseline Dataset.
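For example, a baseline can be sampled from your training data with pandas before uploading (a sketch; the sample size and file name are illustrative):

import pandas as pd

df_train = pd.read_csv('training_data.csv')

# Sample up to 10,000 rows from the training set to serve as the baseline
baseline_df = df_train.sample(n=min(10_000, len(df_train)), random_state=42)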


client.list_datasets

PROJECT_ID = "example_project"

client.list_datasets(
    project_id=PROJECT_ID
)
[
    'dataset_a',
    'dataset_b',
    'dataset_c'
]

client.upload_dataset

import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

client.upload_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    dataset={
        'baseline': df
    },
    info=dataset_info
)
{'uuid': '7046dda1-2779-4987-97b4-120e6185cc0b',
 'name': 'Ingestion dataset Upload',
 'info': {'project_name': 'example_project',
  'resource_name': 'example_dataset',
  'resource_type': 'DATASET'},
 'status': 'SUCCESS',
 'progress': 100.0,
 'error_message': None,
 'error_reason': None}

client.delete_dataset

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

client.delete_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
'Dataset deleted example_dataset'

🚧 Caution

You cannot delete a dataset without deleting the models associated with that dataset first.


client.get_dataset_info

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)


Models

A model is a representation of your machine learning model. Each model must have an associated dataset to be used as a baseline for monitoring, explainability, and fairness capabilities.

You do not need to upload your model artifact in order to onboard your model, but doing so will significantly improve the quality of explanations generated by Fiddler.


client.add_model

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)

model_task = fdl.ModelTask.BINARY_CLASSIFICATION
model_target = 'target_column'
model_output = 'output_column'
model_features = [
    'feature_1',
    'feature_2',
    'feature_3'
]

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    target=model_target,
    outputs=[model_output],
    model_task=model_task
)

client.add_model(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    model_info=model_info
)

client.add_model_artifact

📘 Note

Before calling this function, you must have already added a model using add_model.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)

client.add_model_surrogate

📘 Note

Before calling this function, you must have already added a model using add_model.

🚧 Surrogate models are not supported for input_type = fdl.ModelInputType.TEXT

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)

client.delete_model

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.delete_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

client.get_model_info

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_info = client.get_model_info(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

client.list_models

PROJECT_ID = 'example_project'

client.list_models(
    project_id=PROJECT_ID
)
[
    'model_a',
    'model_b',
    'model_c'
]

client.register_model

❗️ Not supported with client 2.0 and above

Please use client.add_model() going forward.


client.trigger_pre_computation

❗️ Not supported with client 2.0 and above

This method is now called automatically when calling client.add_model_surrogate() or client.add_model_artifact().


client.update_model

For more information, see Uploading a Model Artifact.

🚧 Warning

This function does not allow for changes in a model's schema. The inputs and outputs to the model must remain the same.

import pathlib

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_dir = pathlib.Path('model_dir')

client.update_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir=model_dir
)
True

client.update_model_artifact

📘 Note

Before calling this function, you must have already added a model using add_model_surrogate or add_model_artifact.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)

client.update_model_package

❗️ Not supported with client 2.0 and above

Please use client.add_model_artifact() going forward.


client.update_model_surrogate

📘 Note

This method cannot replace a model artifact uploaded with add_model_artifact. It can only re-generate a surrogate model for an existing model.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)


Model Deployment

client.get_model_deployment

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

client.get_model_deployment(
    project_id=PROJECT_NAME,
    model_id=MODEL_NAME,
)
{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/python-311:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}

client.update_model_deployment

Example use cases

  • Horizontal scaling: model deployments support horizontal scaling via the replicas parameter. This creates multiple Kubernetes pods internally to handle requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    
    # Create 3 Kubernetes pods internally to handle requests
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        replicas=3,
    )
  • Vertical scaling: model deployments support vertical scaling via the cpu and memory parameters. Some models may need more resources to load their artifacts or to process requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        cpu=500,
        memory=1024,
    )
  • Scale down: you may want to scale down a model deployment to free resources when the model is not in use. Set the active parameter to False to scale down the deployment.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=False,
    )
  • Scale up: setting active back to True re-creates the deployment's Kubernetes pods with the resource values stored in the database.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=True,
    )

Supported on server version 23.1 and above with the Flexible Model Deployment feature enabled.

{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/python-311:1.0.0 ",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}


Event Publication

Event publication is the process of sending your model's prediction logs, or events, to the Fiddler platform. Using the Fiddler Client, events can be published in batch or streaming mode. Using these events, Fiddler will calculate metrics around feature drift, prediction drift, and model performance. These events are also stored in Fiddler to allow for ad hoc segment analysis. Please read the sections that follow to learn more about how to use the Fiddler Client for event publication.


client.publish_event

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

example_event = {
    'feature_1': 20.7,
    'feature_2': 45000,
    'feature_3': True,
    'output_column': 0.79,
    'target_column': 1
}

client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event=example_event,
    event_id='event_001',
    event_timestamp=1637344470000
)
'66cfbeb6-5651-4e8b-893f-90286f435b8d'
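The event_timestamp above is an epoch timestamp in milliseconds (UTC). One way to produce such a value (a sketch):

from datetime import datetime, timezone

# Current UTC time as epoch milliseconds
event_timestamp = int(datetime.now(timezone.utc).timestamp() * 1000)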

client.publish_events_batch

import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_events = pd.read_csv('events.csv')

client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source=df_events,
    id_field='event_id',
    timestamp_field='inference_date'
)

In this example, event_id and inference_date are columns in df_events. Both are optional; if they are not passed, a unique UUID is generated for each event and the current timestamp is used as the event timestamp.
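For instance, the batch DataFrame could be given explicit IDs and timestamps before publishing (a sketch; the column values are illustrative):

import uuid
import pandas as pd

df_events = pd.read_csv('events.csv')

# One unique ID per event, plus a UTC timestamp column
df_events['event_id'] = [str(uuid.uuid4()) for _ in range(len(df_events))]
df_events['inference_date'] = pd.Timestamp.now(tz='UTC')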


PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_to_update = pd.read_csv('events_update.csv')

# event_id is a column in df_to_update
client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    update_event=True,
    batch_source=df_to_update,
    id_field='event_id'
)

When updating events, id_field is required and serves as the unique identifier of previously published events. For details on which columns are eligible for updates, see Updating Events.


{'status': 202,
 'job_uuid': '4ae7bd3a-2b3f-4444-b288-d51e07b6736d',
 'files': ['ssoqj_tmpzmczjuob.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}


Baselines

client.add_baseline

Add a pre-production baseline

from fiddler import BaselineType

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_pre'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.PRE_PRODUCTION,
  dataset_id=DATASET_NAME,
)

Add a static production baseline

from datetime import datetime
from fiddler import BaselineType, WindowSize

start = datetime(2023, 1, 1, 0, 0) # 12 am, 1st Jan 2023
end = datetime(2023, 1, 2, 0, 0) # 12 am, 2nd Jan 2023

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_static'
DATASET_NAME = 'example_dataset'
MODEL_NAME = 'example_model'
START_TIME = start.timestamp()
END_TIME = end.timestamp()


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.STATIC_PRODUCTION,
  start_time=START_TIME,
  end_time=END_TIME,
)

Add a rolling time window baseline

from fiddler import BaselineType, WindowSize

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_rolling'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.ROLLING_PRODUCTION,
  offset=WindowSize.ONE_MONTH, # How far back to set our window
  window_size=WindowSize.ONE_WEEK, # Size of the sliding window
)

client.get_baseline

get_baseline retrieves the configuration parameters of an existing baseline.

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


baseline = client.get_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)

client.list_baselines

Gets all baselines in a project, or those attached to a single model within a project.

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

# list baselines across all models within a project
client.list_baselines(
  project_id=PROJECT_NAME
)

# list baselines within a model
client.list_baselines(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
)

client.delete_baseline

Deletes an existing baseline from a project

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


client.delete_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)


Monitoring

client.add_monitoring_config

📘 Info

add_monitoring_config can be applied at the model, project, or organization level.

  • If project_id and model_id are specified, the configuration will be applied at the model level.

  • If project_id is specified but model_id is not, the configuration will be applied at the project level.

  • If neither project_id nor model_id are specified, the configuration will be applied at the organization level.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

monitoring_config = {
    'min_bin_value': 3600,
    'time_ranges': ['Day', 'Week', 'Month', 'Quarter', 'Year'],
    'default_time_range': 7200
}

client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)
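Per the scoping rules above, the same configuration can be applied at the project or organization level by omitting identifiers (a sketch reusing monitoring_config from the example above):

# Apply at the project level (no model_id)
client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID
)

# Apply at the organization level (no project_id or model_id)
client.add_monitoring_config(
    config_info=monitoring_config
)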

client.add_alert_rule

📘 Info

The Fiddler client can be used to create a variety of alert rules. Rules can be of the Data Drift, Performance, Data Integrity, and Service Metrics types, and they can compare against absolute values (compare_to = RAW_VALUE) or relative values (compare_to = TIME_PERIOD).

# To add a Performance type alert rule which triggers an email notification
# when the precision metric is 5% higher than its value from the same 1-hour bin one day ago.

import fiddler as fdl

notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)
client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE,
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)

# To add a Data Integrity alert rule which triggers an email notification when
# published events have more than 5 null values in any 1-hour bin for the 'age' column.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl

client.add_alert_rule(
    name = "age-null-1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    column = "age",
    notifications_config = notifications_config
)
# To add a Data Drift type alert rule which triggers an email notification
# when the PSI metric for the 'age' column in a 1-hour bin, measured against the
# 'baseline_name' baseline, exceeds the thresholds below.

import fiddler as fdl

client.add_baseline(project_id='project-a',
                    model_id='model-a',
                    baseline_id='baseline_name',
                    type=fdl.BaselineType.PRE_PRODUCTION,
                    dataset_id='dataset-a')

notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)

client.add_alert_rule(
    name = "psi-gt-5prec-age-baseline_name",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.PSI,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age"],
    baseline_id = 'baseline_name'
)
# To add a Data Drift type alert rule which triggers an email notification when
# the JSD metric exceeds 0.5 in any 1-hour bin for the 'age' or 'gender' columns.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl
notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)

client.add_alert_rule(
    name = "jsd_multi_col_1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.JSD,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.4,
    critical_threshold = 0.5,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age", "gender"],
)
# To add a Data Integrity alert rule which triggers an email notification when
# published events have more than 5 percent null values in any 1-hour bin for the 'age' column.

import fiddler as fdl

client.add_alert_rule(
    name = "age_null_percentage_greater_than_10",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = 'null_violation_percentage',
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    column = "age",
    notifications_config = notifications_config
)

Example responses:

[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]
AlertRule(alert_rule_uuid='e1aefdd5-ef22-4e81-b869-3964eff8b5cd',
organization_name='some_org_name',
project_id='project-a',
model_id='model-a',
name='age-null-1hr',
alert_type=AlertType.DATA_INTEGRITY,
metric=Metric.MISSING_VALUE,
column='age',
priority=Priority.HIGH,
compare_to=CompareTo.RAW_VALUE,
compare_period=None,
warning_threshold=5,
critical_threshold=10,
condition=AlertCondition.GREATER,
bin_size=BinSize.ONE_HOUR)
AlertRule(alert_rule_uuid='e1aefdd5-ef22-4e81-b869-3964eff8b5cd',
organization_name='some_org_name',
project_id='project-a',
model_id='model-a',
name='psi-gt-5prec-age-baseline_name',
alert_type=AlertType.DATA_DRIFT,
metric=Metric.PSI,
priority=Priority.HIGH,
compare_to=CompareTo.RAW_VALUE,
compare_period=None,
warning_threshold=0.05,
critical_threshold=0.1,
condition=AlertCondition.GREATER,
bin_size=BinSize.ONE_HOUR,
columns=['age'],
baseline_id='baseline_name')
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='jsd_multi_col_1hr',
           alert_type=AlertType.DATA_DRIFT,
           metric=Metric.JSD,
           priority=Priority.HIGH,
           compare_to=CompareTo.RAW_VALUE,
           compare_period=ComparePeriod.ONE_HOUR,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.4,
           critical_threshold=0.5,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR,
           columns=['age', 'gender'])]

client.get_alert_rules

📘 Info

The Fiddler client can be used to get a list of alert rules matching the given filtering parameters.


import fiddler as fdl

alert_rules = client.get_alert_rules(
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    columns = ["age", "gender"],
    ordering = ['critical_threshold'],  # use ['-critical_threshold'] for descending order
    limit = 4,  # number of rules to return per page
    offset = 0,  # page offset (multiple of limit)
)

client.get_triggered_alerts

📘 Info

The Fiddler client can be used to get a list of triggered alerts for a given alert rule and time duration.


triggered_alerts = client.get_triggered_alerts(
    alert_rule_uuid = "588744b2-5757-4ae9-9849-1f4e076a58de",
    start_time = "2022-05-01",
    end_time = "2022-09-30",
    ordering = ['alert_time_bucket'],  # use ['-alert_time_bucket'] for descending order
    limit = 4,  # number of alerts to return per page
    offset = 0,  # page offset
)

client.delete_alert_rule

📘 Info

The Fiddler client can be used to delete an existing alert rule.


client.delete_alert_rule(
    alert_rule_uuid = "588744b2-5757-4ae9-9849-1f4e076a58de",
)

client.build_notifications_config

📘 Info

The Fiddler client can be used to build a notification configuration to use when creating alert rules.


notifications_config = client.build_notifications_config(
    emails = "name@abc.com",
)
notifications_config = client.build_notifications_config(
    emails = "name1@abc.com,name2@email.com",
    pagerduty_services = 'pd_service_1',
    pagerduty_severity = 'critical'
)
notifications_config = client.build_notifications_config(
    webhooks = ["894d76e8-2268-4c2e-b1c7-5561da6f84ae", "3814b0ac-b8fe-4509-afc9-ae86c176ef13"]
)

Example Response:

{'emails': {'email': 'name@abc.com'}, 'pagerduty': {'service': '', 'severity': ''}, 'webhooks': []}

client.add_webhook


client.add_webhook(
    name='range_violation_channel',
    url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
    provider='SLACK'
)

Example responses:

Webhook(uuid='df2397d3-23a8-4eb3-987a-2fe43b758b08',
        name='range_violation_channel', organization_name='some_org_name',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

📘 Add Slack webhook

Use the Slack API reference to generate a webhook for your Slack App


client.delete_webhook


client.delete_webhook(
    uuid = "ffcc2ddf-f896-41f0-bc50-4e7b76bb9ace",
)

client.get_webhook


client.get_webhook(
    alert_rule_uuid = "a5f085bc-6772-4eff-813a-bfc20ff71002",
)

Example responses:

Webhook(uuid='a5f085bc-6772-4eff-813a-bfc20ff71002',
        name='binary_classification_alerts_channel',
        organization_name='some_org',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

client.get_webhooks

response = client.get_webhooks()

Example Response

[
  Webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b', name='model_1_alerts', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK'),
  Webhook(uuid='bd4d02d7-d1da-44d7-b194-272b4351cff7', name='drift_alerts_channel', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK'),
  Webhook(uuid='761da93b-bde2-4c1f-bb17-bae501abd511', name='project_1_alerts', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK')
]

client.update_webhook

client.update_webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b',
                      name='drift_violation',
                      url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
                      provider='SLACK')

Example Response:

Webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b',
        name='drift_violation', organization_name='some_org_name',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

client.update_alert_notification_status

📘 Info

The Fiddler client can be used to update the notification status of multiple alerts at once.


updated_alert_configs = client.update_alert_notification_status(
    notification_status = True,
    model_id = "9f8180d3-3fa0-40c4-8656-b9b1d2de1b69",
)
updated_alert_configs = client.update_alert_notification_status(
    notification_status = True,
    alert_config_ids = ["9b8711fa-735e-4a72-977c-c4c8b16543ae"],
)

Example responses:

[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


Custom Metrics

client.get_custom_metric

METRIC_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

custom_metric = client.get_custom_metric(
  metric_id=METRIC_ID
)

client.get_custom_metrics

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

custom_metrics = client.get_custom_metrics(
  project_id=PROJECT_ID,
  model_id=MODEL_ID
)

client.add_custom_metric

For details on supported constants, operators, and functions, see Fiddler Query Language.

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

definition = """
    average(if(Prediction < 0.5 and Target == 1, -40, if(Prediction >= 0.5 and Target == 0, -400, 250)))
"""

client.add_custom_metric(
    name='Loan Value',
    description='A custom value score assigned to a loan',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)

client.delete_custom_metric

METRIC_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

client.delete_custom_metric(
  metric_id=METRIC_ID
)


Segments

client.get_segment

SEGMENT_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

segment = client.get_segment(
  segment_id=SEGMENT_ID
)

client.get_segments

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

segments = client.get_segments(
  project_id=PROJECT_ID,
  model_id=MODEL_ID
)

client.add_segment

For details on supported constants, operators, and functions, see Fiddler Query Language.

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

definition = """
    age > 50
"""

client.add_segment(
    name='Over 50',
    description='All people over the age of 50',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)
Segment(
  id='50a1c32d-c2b4-4faf-9006-f4aeadd7a859',
  name='Over 50',
  project_name='my_project',
  organization_name='mainbuild',
  definition='age > 50',
  description='All people over the age of 50',
  created_at=None,
  created_by=None
)

client.delete_segment

SEGMENT_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

client.delete_segment(
  segment_id=SEGMENT_ID
)


Explainability

client.get_predictions

import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

input_df = pd.read_csv('example_data.csv')

# Example without chunk size specified:
predictions = client.get_predictions(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_df=input_df,
)


# Example with chunk size specified:
predictions = client.get_predictions(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_df=input_df,
    chunk_size=1000,
)

client.get_explanation

import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

df = pd.read_csv('example_data.csv')

# FIDDLER SHAP - Dataset reference data source
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=300),
    explanation_type='FIDDLER_SHAP',
    num_permutations=200,
    ci_level=0.95,
)

# FIDDLER SHAP - Slice ref data source
row = df.to_dict(orient='records')[0]
query = f'SELECT * from {DATASET_ID}.{MODEL_ID} WHERE sulphates >= 0.8'
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=100),
    explanation_type='FIDDLER_SHAP',
    num_permutations=200,
    ci_level=0.95,
)

# FIDDLER SHAP - Multi-class classification (top classes)
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID),
    explanation_type='FIDDLER_SHAP',
    top_n_class=2
)

# IG (not available by default; needs to be enabled via package.py)
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    explanation_type='IG',
)

client.get_feature_impact

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Feature Impact for TABULAR data - Dataset Data Source
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TABULAR data - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TEXT data
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=50),
    output_columns= ['probability_A', 'probability_B'],
    min_support=30
)

client.get_feature_importance

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'


# Feature Importance - Dataset data source
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Importance - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

client.get_mutual_information

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
mutual_info = client.get_mutual_information(
  project_id=PROJECT_ID,
  dataset_id=DATASET_ID,
  query=query,
  column_name='Geography',
  normalized=True,
  num_samples=20000,
)


Analytics

client.get_slice

import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)
import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "production.{MODEL_ID}" """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)

📘 Info

Only read-only SQL operations are supported. Certain SQL operations like aggregations and joins might not result in a valid slice.
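For example, a read-only, row-filtering query remains a valid slice (an illustrative sketch; the age column is hypothetical):

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "production.{MODEL_ID}" WHERE age > 50 """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)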



Fairness

client.get_fairness

🚧 Only Binary classification models with categorical protected attributes are currently supported.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Fairness - Dataset data source
fairness_metrics = client.get_fairness(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    protected_features=['feature_1', 'feature_2'],
    positive_outcome='Approved',
    score_threshold=0.6
)

# Fairness - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
fairness_metrics = client.get_fairness(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=200),
    protected_features=['feature_1', 'feature_2'],
    positive_outcome='Approved',
    score_threshold=0.6
)


Access Control

client.list_org_roles

🚧 Warning

Only administrators can use client.list_org_roles().

client.list_org_roles()
{
    'members': [
        {
            'id': 1,
            'user': 'admin@example.com',
            'email': 'admin@example.com',
            'isLoggedIn': True,
            'firstName': 'Example',
            'lastName': 'Administrator',
            'imageUrl': None,
            'settings': {'notifyNews': True,
                'notifyAccount': True,
                'sliceTutorialCompleted': True},
            'role': 'ADMINISTRATOR'
        },
        {
            'id': 2,
            'user': 'user@example.com',
            'email': 'user@example.com',
            'isLoggedIn': True,
            'firstName': 'Example',
            'lastName': 'User',
            'imageUrl': None,
            'settings': {'notifyNews': True,
                'notifyAccount': True,
                'sliceTutorialCompleted': True},
            'role': 'MEMBER'
        }
    ],
    'invitations': [
        {
            'id': 3,
            'user': 'newuser@example.com',
            'role': 'MEMBER',
            'invited': True,
            'link': 'http://app.fiddler.ai/signup/vSQWZkt3FP--pgzmuYe_-3-NNVuR58OLZalZOlvR0GY'
        }
    ]
}

client.list_project_roles

PROJECT_ID = 'example_project'

client.list_project_roles(
    project_id=PROJECT_ID
)
{
    'roles': [
        {
            'user': {
                'email': 'admin@example.com'
            },
            'team': None,
            'role': {
                'name': 'OWNER'
            }
        },
        {
            'user': {
                'email': 'user@example.com'
            },
            'team': None,
            'role': {
                'name': 'READ'
            }
        }
    ]
}

client.list_teams

client.list_teams()
{
    'example_team': {
        'members': [
            {
                'user': 'admin@example.com',
                'role': 'MEMBER'
            },
            {
                'user': 'user@example.com',
                'role': 'MEMBER'
            }
        ]
    }
}

client.share_project

📘 Info

Administrators can share any project with any user. If you lack the required permissions to share a project, contact your organization administrator.

PROJECT_ID = 'example_project'

client.share_project(
    project_name=PROJECT_ID,
    role='READ',
    user_name='user@example.com'
)

client.unshare_project

📘 Info

Administrators and project owners can unshare any project with any user. If you lack the required permissions to unshare a project, contact your organization administrator.

PROJECT_ID = 'example_project'

client.unshare_project(
    project_name=PROJECT_ID,
    role='READ',
    user_name='user@example.com'
)


Fiddler Objects

fdl.DatasetInfo

For information on how to customize these objects, see Customizing Your Dataset Schema.

columns = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    ),
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

dataset_info = fdl.DatasetInfo(
    display_name='Example Dataset',
    columns=columns
)

fdl.DatasetInfo.from_dataframe

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

fdl.DatasetInfo.from_dict

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()

new_dataset_info = fdl.DatasetInfo.from_dict(
    deserialized_json={
        'dataset': dataset_info_dict
    }
)

fdl.DatasetInfo.to_dict

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()
{
    'name': 'Example Dataset',
    'columns': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'output_column',
            'data-type': 'float'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'files': []
}

fdl.ModelInfo

inputs = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    )
]

outputs = [
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    )
]

targets = [
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets
)

fdl.ModelInfo.from_dataset_info

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

fdl.ModelInfo.from_dict

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()

new_model_info = fdl.ModelInfo.from_dict(
    deserialized_json={
        'model': model_info_dict
    }
)

fdl.ModelInfo.to_dict

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()
{
    'name': 'Example Model',
    'input-type': 'structured',
    'model-task': 'binary_classification',
    'inputs': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'outputs': [
        {
            'column-name': 'output_column',
            'data-type': 'float'
        }
    ],
    'datasets': [],
    'targets': [
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'custom-explanation-names': []
}
