API Methods 2.x

Connecting to Fiddler

fdl.FiddlerApi

The Client object is used to communicate with Fiddler. In order to use the client, you'll need to provide authentication details as shown below.

For more information, see Authorizing the Client.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | str | None | The URL used to connect to Fiddler |
| org_id | str | None | The organization ID for a Fiddler instance. Can be found on the General tab of the Settings page. |
| auth_token | str | None | The authorization token used to authenticate with Fiddler. Can be found on the Credentials tab of the Settings page. |
| proxies | Optional [dict] | None | A dictionary containing proxy URLs. |
| verbose | Optional [bool] | False | If True, client calls will be logged verbosely. |
| verify | Optional [bool] | True | If False, the client will allow self-signed SSL certificates from the Fiddler server environment. If True, the SSL certificates need to be signed by a certificate authority (CA). |

🚧 Warning

If verbose is set to True, all information required for debugging will be logged, including the authorization token.

📘 Info

To maximize compatibility, please ensure that your client version matches the server version for your Fiddler instance.

When you connect to Fiddler using the code below, you'll receive a notification if there is a version mismatch between the client and server.

You can install a specific version of fiddler-client using pip: pip install fiddler-client==X.X.X

import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN
)

import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    verify=False
)

proxies = {
    'http' : 'http://proxy.example.com:1234',
    'https': 'https://proxy.example.com:5678'
}

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    proxies=proxies
)
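
You can also enable verbose logging of client calls with verbose=True (keeping in mind the warning above that the authorization token will be logged); a minimal sketch:

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    verbose=True
)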

If you want to authenticate with Fiddler without passing this information directly into the function call, you can store it in a file named _fiddler.ini_, placed in the same directory as your notebook or script.

%%writefile fiddler.ini

[FIDDLER]
url = https://app.fiddler.ai
org_id = my_org
auth_token = p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58

client = fdl.FiddlerApi()


Projects

Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).

A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).

For more information on projects, click here.


client.list_projects

response = client.list_projects()
| Return Type | Description |
| --- | --- |
| list | A list containing the project ID string for each project |

[
  'project_a',
  'project_b',
  'project_c'
]

client.create_project

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |

PROJECT_ID = 'example_project'

client.create_project(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary mapping project_name to the project ID string specified, once the project is successfully created. |

{
    'project_name': 'example_project'
}

client.delete_project

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |

PROJECT_ID = 'example_project'

client.delete_project(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| bool | A boolean denoting whether deletion was successful. |

True

🚧 Caution

You cannot delete a project without deleting the datasets and the models associated with that project.



Datasets

Datasets (or baseline datasets) are used for making comparisons with production data.

A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.

For more information, see Uploading a Baseline Dataset.

For guidance on how to design a baseline dataset, see Designing a Baseline Dataset.


client.list_datasets

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |

PROJECT_ID = "example_project"

client.list_datasets(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| list | A list containing the dataset ID strings for the specified project. |

[
    'dataset_a',
    'dataset_b',
    'dataset_c'
]

client.upload_dataset

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset | dict | None | A dictionary mapping dataset slice names to pandas DataFrames. |
| dataset_id | str | None | A unique identifier for the dataset. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| info | Optional [fdl.DatasetInfo] | None | The Fiddler fdl.DatasetInfo() object used to describe the dataset. |
| size_check_enabled | Optional [bool] | True | If True, will issue a warning when a dataset has a large number of rows. |

import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

client.upload_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    dataset={
        'baseline': df
    },
    info=dataset_info
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information about the uploaded dataset. |

{'uuid': '7046dda1-2779-4987-97b4-120e6185cc0b',
 'name': 'Ingestion dataset Upload',
 'info': {'project_name': 'example_model',
  'resource_name': 'acme_data',
  'resource_type': 'DATASET'},
 'status': 'SUCCESS',
 'progress': 100.0,
 'error_message': None,
 'error_reason': None}

client.delete_dataset

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

client.delete_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
| Return Type | Description |
| --- | --- |
| str | A message confirming that the dataset was deleted. |

'Dataset deleted example_dataset'

🚧 Caution

You cannot delete a dataset without deleting the models associated with that dataset first.


client.get_dataset_info

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | The fdl.DatasetInfo() object associated with the specified dataset. |



Models

A model is a representation of your machine learning model. Each model must have an associated dataset to be used as a baseline for monitoring, explainability, and fairness capabilities.

You do not need to upload your model artifact in order to onboard your model, but doing so will significantly improve the quality of explanations generated by Fiddler.


client.add_model

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| dataset_id | str | None | The unique identifier for the dataset. |
| model_info | fdl.ModelInfo | None | An fdl.ModelInfo() object containing information about the model. |

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)

model_task = fdl.ModelTask.BINARY_CLASSIFICATION
model_target = 'target_column'
model_output = 'output_column'
model_features = [
    'feature_1',
    'feature_2',
    'feature_3'
]

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    target=model_target,
    outputs=[model_output],
    model_task=model_task
)

client.add_model(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    model_info=model_info
)
| Return Type | Description |
| --- | --- |
| str | A message confirming that the model was added. |


client.add_model_artifact

📘 Note

Before calling this function, you must have already added a model using add_model.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | str | None | A path to the directory containing all of the model files needed to run the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. Supported from server version 23.1 and above with Model Deployment feature enabled. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)
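
If the Model Deployment feature is enabled on your server (version 23.1 and above), you can optionally pass deployment_params here as well; a minimal sketch reusing the fdl.DeploymentParams object shown below for add_model_surrogate:

client.add_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)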

client.add_model_surrogate

📘 Note

Before calling this function, you must have already added a model using add_model.

🚧 Surrogate models are not supported for input_type = fdl.ModelInputType.TEXT

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)
| Return Type | Description |
| --- | --- |
| None | Returns None |


client.delete_model

For more information, see Uploading a Model Artifact.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.delete_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

client.get_model_info

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_info = client.get_model_info(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | The fdl.ModelInfo() object associated with the specified model. |


client.list_models

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |

PROJECT_ID = 'example_project'

client.list_models(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| list | A list containing the string ID of each model. |

[
    'model_a',
    'model_b',
    'model_c'
]

client.register_model

❗️ Not supported with client 2.0 and above

Please use client.add_model() going forward.


client.trigger_pre_computation

❗️ Not supported with client 2.0 and above

This method is now called automatically when calling client.add_model_surrogate() or client.add_model_artifact().


client.update_model

For more information, see Uploading a Model Artifact.

🚧 Warning

This function does not allow for changes in a model's schema. The inputs and outputs to the model must remain the same.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | pathlib.Path | None | A path to the directory containing all of the model files needed to run the model. |
| force_pre_compute | bool | True | If True, re-run precomputation steps for the model. This can also be done manually by calling client.trigger_pre_computation. |

import pathlib

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_dir = pathlib.Path('model_dir')

client.update_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir=model_dir
)
| Return Type | Description |
| --- | --- |
| bool | A boolean denoting whether the update was successful. |

True

client.update_model_artifact

📘 Note

Before calling this function, you must have already added a model using add_model_surrogate or add_model_artifact

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | str | None | A path to the directory containing all of the model files needed to run the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. Supported from server version 23.1 and above with Model Deployment feature enabled. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)

client.update_model_package

❗️ Not supported with client 2.0 and above

Please use client.add_model_artifact() going forward.


client.update_model_surrogate

📘 Note

This method cannot replace a model artifact that was uploaded using add_model_artifact. It can only re-generate a surrogate model.

This can be used to re-generate the surrogate model for an existing model.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. |
| wait | Optional[bool] | True | Whether to wait for the async job to finish (True) or return immediately (False). |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)
| Return Type | Description |
| --- | --- |
| None | Returns None |



Model Deployment

client.get_model_deployment

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

client.get_model_deployment(
    project_id=PROJECT_NAME,
    model_id=MODEL_NAME,
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary with all related fields for the model deployment. |

{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/machine-learning:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}

client.update_model_deployment

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| active | Optional [bool] | None | Set False to scale down the model deployment and True to scale up. |
| replicas | Optional[int] | None | The number of replicas running the model. |
| cpu | Optional [int] | None | The amount of CPU (milli cpus) reserved per replica. |
| memory | Optional [int] | None | The amount of memory (mebibytes) reserved per replica. |
| wait | Optional[bool] | True | Whether to wait for the async job to finish (True) or not (False). |

Example use cases

  • Horizontal scaling: Model deployments support horizontal scaling via the replicas parameter. This will create multiple Kubernetes pods internally to handle requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    
    # Create 3 Kubernetes pods internally to handle requests
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        replicas=3,
    )
  • Vertical scaling: Model deployments support vertical scaling via cpu and memory parameters. Some models might need more memory to load the artifacts into memory or process the requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        cpu=500,
        memory=1024,
    )
  • Scale down: You may want to scale down model deployments to avoid allocating resources when the model is not in use. Use the active parameter to scale down the deployment.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=False,
    )
  • Scale up: This will again create the model deployment Kubernetes pods with the resource values available in the database.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=True,
    )
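  • Asynchronous update: The wait parameter described above controls whether the call blocks until the underlying job finishes. A minimal sketch that returns immediately instead of waiting:

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        replicas=2,
        wait=False,
    )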
| Return Type | Description |
| --- | --- |
| dict | A dictionary with all related fields for the model deployment. |

Supported from server version 23.1 and above with Flexible Model Deployment feature enabled.

{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/machine-learning:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}


Event Publication

Event publication is the process of sending your model's prediction logs, or events, to the Fiddler platform. Using the Fiddler Client, events can be published in batch or streaming mode. Using these events, Fiddler will calculate metrics around feature drift, prediction drift, and model performance. These events are also stored in Fiddler to allow for ad hoc segment analysis. Please read the sections that follow to learn more about how to use the Fiddler Client for event publication.


client.publish_event

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| event | dict | None | A dictionary mapping field names to field values. Any fields found that are not present in the model's ModelInfo object will be dropped from the event. |
| event_id | Optional [str] | None | A unique identifier for the event. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
| update_event | Optional [bool] | None | If True, will only modify an existing event, referenced by event_id. If no event is found, no change will take place. |
| event_timestamp | Optional [int] | None | The timestamp at which the event took place, in the format given by timestamp_format. If no timestamp is provided, the current time will be used. |
| timestamp_format | Optional [fdl.FiddlerTimestamp] | fdl.FiddlerTimestamp.INFER | The format of the timestamp passed in event_timestamp. Can be one of fdl.FiddlerTimestamp.INFER, fdl.FiddlerTimestamp.EPOCH_MILLISECONDS, fdl.FiddlerTimestamp.EPOCH_SECONDS, or fdl.FiddlerTimestamp.ISO_8601. |
| casting_type | Optional [bool] | False | If True, will try to cast the data in event to be in line with the data types defined in the model's ModelInfo object. |
| dry_run | Optional [bool] | False | If True, the event will not be published, and instead a report will be generated with information about any problems with the event. Useful for debugging issues with event publishing. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

example_event = {
    'feature_1': 20.7,
    'feature_2': 45000,
    'feature_3': True,
    'output_column': 0.79,
    'target_column': 1
}

client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event=example_event,
    event_id='event_001',
    event_timestamp=1637344470000
)
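
The dry_run flag described above can be used to validate an event without publishing it; a minimal sketch using the same example event:

client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event=example_event,
    dry_run=True
)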
| Return Type | Description |
| --- | --- |
| str | A string containing a UUID acknowledging that the event was successfully received. |

'66cfbeb6-5651-4e8b-893f-90286f435b8d'

client.publish_events_batch

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are CSV (.csv), Parquet (.pq), and pickled DataFrame (.pkl). |
| id_field | Optional [str] | None | The field containing event IDs for events in the batch. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
| update_event | Optional [bool] | None | If True, will only modify an existing event, referenced by id_field. If an ID is provided for which there is no event, no change will take place. |
| timestamp_field | Optional [str] | None | The field containing timestamps for events in the batch. The format of these timestamps is given by timestamp_format. If no timestamp is provided for a given row, the current time will be used. |
| timestamp_format | Optional [fdl.FiddlerTimestamp] | fdl.FiddlerTimestamp.INFER | The format of the timestamps passed in timestamp_field. Can be one of fdl.FiddlerTimestamp.INFER, fdl.FiddlerTimestamp.EPOCH_MILLISECONDS, fdl.FiddlerTimestamp.EPOCH_SECONDS, or fdl.FiddlerTimestamp.ISO_8601. |
| data_source | Optional [fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
| casting_type | Optional [bool] | False | If True, will try to cast the data in the events to be in line with the data types defined in the model's ModelInfo object. |
| credentials | Optional [dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
| group_by | Optional [str] | None | The field used to group events together when computing performance metrics (for ranking models only). |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_events = pd.read_csv('events.csv')

client.publish_events_batch(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        batch_source=df_events,
        id_field='event_id',
        timestamp_field='inference_date')

In this example, event_id and inference_date are columns in df_events. Both are optional. If they are not passed, Fiddler generates a unique UUID for each event and uses the current timestamp as the event timestamp.


PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_to_update = pd.read_csv('events_update.csv')

# event_id is a column in df_to_update
client.publish_events_batch(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        update_event=True,
        batch_source=df_to_update,
        id_field='event_id')

When updating events, id_field is required as the unique identifier of the previously published events. For more details on which columns are eligible to be updated, refer to Updating Events.
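
Because batch_source also accepts a file path, a batch can be published directly from a file on disk; a minimal sketch assuming a local CSV file named events.csv:

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source='events.csv',
    id_field='event_id',
    timestamp_field='inference_date'
)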


| Return Type | Description |
| --- | --- |
| dict | A dictionary object which reports the result of the batch publication. |

{'status': 202,
 'job_uuid': '4ae7bd3a-2b3f-4444-b288-d51e07b6736d',
 'files': ['ssoqj_tmpzmczjuob.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}

client.publish_events_batch_schema

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are CSV (.csv). |
| publish_schema | dict | None | A dictionary used for locating fields within complex or nested data structures. |
| data_source | Optional [fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
| credentials | Optional [dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
| group_by | Optional [str] | None | The field used to group events together when computing performance metrics (for ranking models only). |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

path_to_batch = 'events_batch.avro'

schema = {
    '__static': {
        '__project': PROJECT_ID,
        '__model': MODEL_ID
    },
    '__dynamic': {
        'feature_1': 'features/feature_1',
        'feature_2': 'features/feature_2',
        'feature_3': 'features/feature_3',
        'output_column': 'outputs/output_column',
        'target_column': 'targets/target_column'
    }
}

client.publish_events_batch_schema(
    batch_source=path_to_batch,
    publish_schema=schema
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary object which reports the result of the batch publication. |

{'status': 202,
 'job_uuid': '5ae7bd3a-2b3f-4444-b288-d51e098a01d',
 'files': ['rroqj_tmpzmczjttb.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}


Baselines

client.add_baseline

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| baseline_id | string | Yes | The unique identifier for the baseline |
| type | fdl.BaselineType | Yes | One of: PRE_PRODUCTION, STATIC_PRODUCTION, ROLLING_PRODUCTION |
| dataset_id | string | No | Training or validation dataset uploaded to Fiddler for a PRE_PRODUCTION baseline |
| start_time | int | No | Seconds since epoch to be used as the start time for a STATIC_PRODUCTION baseline |
| end_time | int | No | Seconds since epoch to be used as the end time for a STATIC_PRODUCTION baseline |
| offset | fdl.WindowSize | No | Offset in seconds relative to the current time to be used for a ROLLING_PRODUCTION baseline |
| window_size | fdl.WindowSize | No | Width of the window in seconds to be used for a ROLLING_PRODUCTION baseline |

Add a pre-production baseline

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_pre'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.PRE_PRODUCTION,
  dataset_id=DATASET_NAME,
)

Add a static production baseline

from datetime import datetime
from fiddler import BaselineType, WindowSize

start = datetime(2023, 1, 1, 0, 0) # 12 am, 1st Jan 2023
end = datetime(2023, 1, 2, 0, 0) # 12 am, 2nd Jan 2023

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_static'
DATASET_NAME = 'example_dataset'
MODEL_NAME = 'example_model'
START_TIME = start.timestamp()
END_TIME = end.timestamp()


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.STATIC_PRODUCTION,
  start_time=START_TIME,
  end_time=END_TIME,
)

Add a rolling time window baseline

from fiddler import BaselineType, WindowSize

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_rolling'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.ROLLING_PRODUCTION,
  offset=WindowSize.ONE_MONTH, # How far back to set our window
  window_size=WindowSize.ONE_WEEK, # Size of the sliding window
)
Returns a baseline schema object with all the configuration parameters.


client.get_baseline

get_baseline retrieves the configuration parameters of an existing baseline.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| baseline_id | string | Yes | The unique identifier for the baseline |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


baseline = client.get_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)
Returns a baseline schema object with all the configuration parameters.


client.list_baselines

Gets all the baselines in a project, or those attached to a single model within a project.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | No | The unique identifier for the model |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

# list baselines across all models within a project
client.list_baselines(
  project_id=PROJECT_NAME
)

# list baselines within a model
client.list_baselines(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
)
Returns a list of baseline config objects.


client.delete_baseline

Deletes an existing baseline from a project

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| baseline_id | string | Yes | The unique identifier for the baseline |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


client.delete_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)


Monitoring

client.add_monitoring_config

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| config_info | dict | None | Monitoring config info for an entire organization, a project, or a model. |
| project_id | Optional [str] | None | The unique identifier for the project. |
| model_id | Optional [str] | None | The unique identifier for the model. |

📘 Info

add_monitoring_config can be applied at the model, project, or organization level.

  • If project_id and model_id are specified, the configuration will be applied at the model level.

  • If project_id is specified but model_id is not, the configuration will be applied at the project level.

  • If neither project_id nor model_id are specified, the configuration will be applied at the organization level.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

monitoring_config = {
    'min_bin_value': 3600,
    'time_ranges': ['Day', 'Week', 'Month', 'Quarter', 'Year'],
    'default_time_range': 7200
}

client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)
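
As noted above, omitting model_id applies the same configuration at the project level; a minimal sketch:

client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID
)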

client.add_alert_rule

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A name for the alert rule |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| alert_type | fdl.AlertType | None | One of AlertType.PERFORMANCE, AlertType.DATA_DRIFT, AlertType.DATA_INTEGRITY, AlertType.SERVICE_METRICS, or AlertType.STATISTIC |
| metric | fdl.Metric | None | When alert_type is AlertType.SERVICE_METRICS, this should be Metric.TRAFFIC. When alert_type is AlertType.PERFORMANCE, choose one of the following based on the ML model task. For binary classification: Metric.ACCURACY, Metric.TPR, Metric.FPR, Metric.PRECISION, Metric.RECALL, Metric.F1_SCORE, Metric.ECE, Metric.AUC. For regression: Metric.R2, Metric.MSE, Metric.MAE, Metric.MAPE, Metric.WMAPE. For multi-class classification: Metric.ACCURACY, Metric.LOG_LOSS. For ranking: Metric.MAP, Metric.MEAN_NDCG. When alert_type is AlertType.DATA_DRIFT, choose one of: Metric.PSI, Metric.JSD. When alert_type is AlertType.DATA_INTEGRITY, choose one of: Metric.RANGE_VIOLATION, Metric.MISSING_VALUE, Metric.TYPE_VIOLATION. When alert_type is AlertType.STATISTIC, choose one of: Metric.AVERAGE, Metric.SUM, Metric.FREQUENCY |
| bin_size | fdl.BinSize | ONE_DAY | Duration for which the metric value is calculated. Choose one of: BinSize.ONE_HOUR, BinSize.ONE_DAY, BinSize.SEVEN_DAYS |
| compare_to | fdl.CompareTo | None | Whether the metric value is compared against a static value or the same bin from a previous time period. One of: CompareTo.RAW_VALUE, CompareTo.TIME_PERIOD |
| compare_period | fdl.ComparePeriod | None | Required only when compare_to is CompareTo.TIME_PERIOD. Choose one of: ComparePeriod.ONE_DAY, ComparePeriod.SEVEN_DAYS, ComparePeriod.ONE_MONTH, ComparePeriod.THREE_MONTHS |
| priority | fdl.Priority | None | One of: Priority.LOW, Priority.MEDIUM, Priority.HIGH |
| warning_threshold | float | None | [Optional] Threshold which, when crossed, triggers a warning level severity alert. This should be a decimal which represents a percentage (e.g. 0.45). |
| critical_threshold | float | None | Threshold which, when crossed, triggers a critical level severity alert. This should be a decimal which represents a percentage (e.g. 0.45). |
| condition | fdl.AlertCondition | None | Specifies whether the rule should trigger if the metric is greater than or less than the thresholds. One of: AlertCondition.LESSER, AlertCondition.GREATER |
| notifications_config | Dict[str, Dict[str, Any]] | None | [Optional] Notifications config object created using the helper method build_notifications_config() |
| columns | List[str] | None | Column names on which the alert rule is to be created. Applicable only when alert_type is AlertType.DATA_INTEGRITY or AlertType.DRIFT. When alert type is AlertType.DATA_INTEGRITY, it can take [ANY] to check all columns. |
| baseline_id | str | None | Name of the baseline whose histogram is compared against the one derived from current data. When no baseline_id is specified, the default baseline is used. Used only when alert type is AlertType.DATA_DRIFT. |
| segment | str | None | The segment to alert on. See Segments for more details. |

📘 Info

The Fiddler client can be used to create a variety of alert rules. Rules can be of Data Drift, Performance, Data Integrity, and **Service Metrics** types, and they can be compared to absolute values (compare_to = RAW_VALUE) or to relative values (compare_to = TIME_PERIOD).

# To add a Performance type alert rule which triggers an email notification
# when precision metric is 5% higher than that from 1 hr bin one day ago.

import fiddler as fdl

notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)
client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE,
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)

# To add Data Integrity alert rule which triggers an email notification when
# published events have more than 5 null values in any 1 hour bin for the _age_ column.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl

client.add_alert_rule(
    name = "age-null-1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    column = "age",
    notifications_config = notifications_config
)
# To add a Data Drift type alert rule which triggers an email notification
# when the PSI metric for the 'age' column in a one hour bin is 5% higher than that from the 'baseline_name' dataset.

import fiddler as fdl

client.add_baseline(project_id='project-a',
                    model_id='model-a',
                    baseline_id='baseline_name',
                    type=fdl.BaselineType.PRE_PRODUCTION,
                    dataset_id='dataset-a')

notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)

client.add_alert_rule(
    name = "psi-gt-5prec-age-baseline_name",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.PSI,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age"],
    baseline_id = 'baseline_name'
)
# To add a Data Drift type alert rule which triggers an email notification when
# the value of the JSD metric is more than 0.5 in a one hour bin for the _age_ or _gender_ columns.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl
notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)

client.add_alert_rule(
    name = "jsd_multi_col_1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.JSD,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.4,
    critical_threshold = 0.5,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age", "gender"],
)
# To add Data Integrity alert rule which triggers an email notification when