API Methods 2.x

Connecting to Fiddler

fdl.FiddlerApi

The Client object is used to communicate with Fiddler. In order to use the client, you'll need to provide authentication details as shown below.

For more information, see Authorizing the Client.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | str | None | The URL used to connect to Fiddler |
| org_id | str | None | The organization ID for a Fiddler instance. Can be found on the General tab of the Settings page. |
| auth_token | str | None | The authorization token used to authenticate with Fiddler. Can be found on the Credentials tab of the Settings page. |
| proxies | Optional [dict] | None | A dictionary containing proxy URLs. |
| verbose | Optional [bool] | False | If True, client calls will be logged verbosely. |
| verify | Optional [bool] | True | If False, the client will allow self-signed SSL certificates from the Fiddler server environment. If True, the SSL certificates need to be signed by a certificate authority (CA). |

🚧 Warning

If verbose is set to True, all information required for debugging will be logged, including the authorization token.

📘 Info

To maximize compatibility, please ensure that your client version matches the server version of your Fiddler instance.

When you connect to Fiddler using the code below, you'll receive a notification if there is a version mismatch between the client and server.

You can install a specific version of fiddler-client using pip: pip install fiddler-client==X.X.X
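You can check which client version is currently installed from Python. A minimal sketch, assuming the package exposes the conventional __version__ attribute:

import fiddler as fdl

# The installed client version; ideally this matches your Fiddler server version
print(fdl.__version__)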

import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN
)
# Example: connect while allowing self-signed SSL certificates
import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    verify=False
)
# Example: connect through a proxy
proxies = {
    'http' : 'http://proxy.example.com:1234',
    'https': 'https://proxy.example.com:5678'
}

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    proxies=proxies
)

If you want to authenticate with Fiddler without passing this information directly into the function call, you can store it in a file named fiddler.ini, which should be stored in the same directory as your notebook or script.

%%writefile fiddler.ini

[FIDDLER]
url = https://app.fiddler.ai
org_id = my_org
auth_token = p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58

client = fdl.FiddlerApi()


Projects

Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).

A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).

For more information on projects, click here.


client.list_projects

response = client.list_projects()
| Return Type | Description |
| --- | --- |
| list | A list containing the project ID string for each project |

[
  'project_a',
  'project_b',
  'project_c'
]

client.create_project

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |

PROJECT_ID = 'example_project'

client.create_project(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary mapping project_name to the project ID string specified, once the project is successfully created. |

{
    'project_name': 'example_project'
}

client.delete_project

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |

PROJECT_ID = 'example_project'

client.delete_project(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| bool | A boolean denoting whether deletion was successful. |

True

🚧 Caution

You cannot delete a project without deleting the datasets and the models associated with that project.



Datasets

Datasets (or baseline datasets) are used for making comparisons with production data.

A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.

For more information, see Uploading a Baseline Dataset.

For guidance on how to design a baseline dataset, see Designing a Baseline Dataset.
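For instance, one simple way to construct such a baseline is to randomly sample your training data with pandas. A minimal sketch, assuming your training data is available as a CSV file (training_data.csv is a hypothetical filename):

import pandas as pd

# Load the training data (hypothetical file)
df_train = pd.read_csv('training_data.csv')

# Randomly sample up to 10,000 rows to serve as the baseline
baseline_df = df_train.sample(n=min(10_000, len(df_train)), random_state=42)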


client.list_datasets

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |

PROJECT_ID = "example_project"

client.list_datasets(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| list | A list containing the dataset ID strings for each dataset in the project. |

[
    'dataset_a',
    'dataset_b',
    'dataset_c'
]

client.upload_dataset

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset | dict | None | A dictionary mapping dataset slice names to pandas DataFrames. |
| dataset_id | str | None | A unique identifier for the dataset. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| info | Optional [fdl.DatasetInfo] | None | The Fiddler fdl.DatasetInfo() object used to describe the dataset. |
| size_check_enabled | Optional [bool] | True | If True, will issue a warning when a dataset has a large number of rows. |

import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

client.upload_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    dataset={
        'baseline': df
    },
    info=dataset_info
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information about the uploaded dataset. |

{'uuid': '7046dda1-2779-4987-97b4-120e6185cc0b',
 'name': 'Ingestion dataset Upload',
 'info': {'project_name': 'example_model',
  'resource_name': 'acme_data',
  'resource_type': 'DATASET'},
 'status': 'SUCCESS',
 'progress': 100.0,
 'error_message': None,
 'error_reason': None}

client.delete_dataset

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

client.delete_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
| Return Type | Description |
| --- | --- |
| str | A message confirming that the dataset was deleted. |

'Dataset deleted example_dataset'

🚧 Caution

You cannot delete a dataset without deleting the models associated with that dataset first.


client.get_dataset_info

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | The fdl.DatasetInfo() object associated with the specified dataset. |

#NA


Models

A model is a representation of your machine learning model. Each model must have an associated dataset to be used as a baseline for monitoring, explainability, and fairness capabilities.

You do not need to upload your model artifact in order to onboard your model, but doing so will significantly improve the quality of explanations generated by Fiddler.


client.add_model

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| dataset_id | str | None | The unique identifier for the dataset. |
| model_info | fdl.ModelInfo | None | A fdl.ModelInfo() object containing information about the model. |

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)

model_task = fdl.ModelTask.BINARY_CLASSIFICATION
model_target = 'target_column'
model_output = 'output_column'
model_features = [
    'feature_1',
    'feature_2',
    'feature_3'
]

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    target=model_target,
    outputs=[model_output],
    model_task=model_task
)

client.add_model(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    model_info=model_info
)
| Return Type | Description |
| --- | --- |
| str | A message confirming that the model was added. |


client.add_model_artifact

📘 Note

Before calling this function, you must have already added a model using add_model.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | str | None | A path to the directory containing all of the model files needed to run the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. Supported from server version 23.1 and above with the Model Deployment feature enabled. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)
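If your server has the Model Deployment feature enabled, a deployment_params object can be passed as well. A sketch, mirroring the fdl.DeploymentParams usage shown for add_model_surrogate below:

# with deployment_params
client.add_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)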

client.add_model_surrogate

📘 Note

Before calling this function, you must have already added a model using add_model.

🚧 Surrogate models are not supported for input_type = fdl.ModelInputType.TEXT

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)
| Return Type | Description |
| --- | --- |
| None | Returns None |


client.delete_model

For more information, see Uploading a Model Artifact.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.delete_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

client.get_model_info

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_info = client.get_model_info(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | The ModelInfo object associated with the specified model. |


client.list_models

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |

PROJECT_ID = 'example_project'

client.list_models(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| list | A list containing the string ID of each model. |

[
    'model_a',
    'model_b',
    'model_c'
]

client.register_model

❗️ Not supported with client 2.0 and above

Please use client.add_model() going forward.


client.trigger_pre_computation

❗️ Not supported with client 2.0 and above

This method is now called automatically when calling client.add_model_surrogate() or client.add_model_artifact().


client.update_model

For more information, see Uploading a Model Artifact.

🚧 Warning

This function does not allow for changes in a model's schema. The inputs and outputs to the model must remain the same.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | pathlib.Path | None | A path to the directory containing all of the model files needed to run the model. |
| force_pre_compute | bool | True | If True, re-run precomputation steps for the model. |

import pathlib

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_dir = pathlib.Path('model_dir')

client.update_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir=model_dir
)
| Return Type | Description |
| --- | --- |
| bool | A boolean denoting whether the update was successful. |

True

client.update_model_artifact

📘 Note

Before calling this function, you must have already added a model using add_model_surrogate or add_model_artifact

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | str | None | A path to the directory containing all of the model files needed to run the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. Supported from server version 23.1 and above with the Model Deployment feature enabled. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)

client.update_model_package

❗️ Not supported with client 2.0 and above

Please use client.add_model_artifact() going forward.


client.update_model_surrogate

📘 Note

This method cannot replace a model artifact that was uploaded using add_model_artifact; it can only re-generate the surrogate model for an existing model.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| deployment_params | Optional [fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. |
| wait | Optional[bool] | True | Whether to wait for the async job to finish (True) or return immediately (False). |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)
| Return Type | Description |
| --- | --- |
| None | Returns None |



Model Deployment

client.get_model_deployment

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

client.get_model_deployment(
    project_id=PROJECT_NAME,
    model_id=MODEL_NAME,
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing all fields related to the model deployment. |

{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/python-311:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}

client.update_model_deployment

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| active | Optional [bool] | None | Set False to scale down the model deployment and True to scale it up. |
| replicas | Optional[int] | None | The number of replicas running the model. |
| cpu | Optional [int] | None | The amount of CPU (milli cpus) reserved per replica. |
| memory | Optional [int] | None | The amount of memory (mebibytes) reserved per replica. |
| wait | Optional[bool] | True | Whether to wait for the async job to finish (True) or not (False). |

Example use cases

  • Horizontal scaling: scale horizontally via the replicas parameter. This creates multiple Kubernetes pods internally to handle requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    
    # Create 3 Kubernetes pods internally to handle requests
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        replicas=3,
    )
  • Vertical scaling: Model deployments support vertical scaling via the cpu and memory parameters. Some models might need more resources to load their artifacts into memory or to process requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        cpu=500,
        memory=1024,
    )
  • Scale down: You may want to scale down a model deployment to avoid allocating resources when the model is not in use. Use the active parameter to scale down the deployment.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=False,
    )
  • Scale up: This re-creates the model deployment's Kubernetes pods with the resource values stored in the database.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=True,
    )
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing all fields related to the model deployment. |

Supported from server version 23.1 and above with the Flexible Model Deployment feature enabled.

{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/python-311:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "example_email@gmail.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}


Event Publication

Event publication is the process of sending your model's prediction logs, or events, to the Fiddler platform. Using the Fiddler Client, events can be published in batch or streaming mode. Using these events, Fiddler will calculate metrics around feature drift, prediction drift, and model performance. These events are also stored in Fiddler to allow for ad hoc segment analysis. Please read the sections that follow to learn more about how to use the Fiddler Client for event publication.


client.publish_event

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| event | dict | None | A dictionary mapping field names to field values. Any fields found that are not present in the model's ModelInfo object will be dropped from the event. |
| event_id | Optional [str] | None | A unique identifier for the event. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
| update_event | Optional [bool] | None | If True, will only modify an existing event, referenced by event_id. If no event is found, no change will take place. |
| event_timestamp | Optional [int] | None | The timestamp at which the event took place, in milliseconds since epoch. If no timestamp is provided, the current time will be used. |
| casting_type | Optional [bool] | False | If True, will try to cast the data in the event to be in line with the data types defined in the model's ModelInfo object. |
| dry_run | Optional [bool] | False | If True, the event will not be published, and instead a report will be generated with information about any problems with the event. Useful for debugging issues with event publishing. |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

example_event = {
    'feature_1': 20.7,
    'feature_2': 45000,
    'feature_3': True,
    'output_column': 0.79,
    'target_column': 1
}

client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event=example_event,
    event_id='event_001',
    event_timestamp=1637344470000
)
| Return Type | Description |
| --- | --- |
| str | A string containing a UUID, acknowledging that the event was successfully received. |

'66cfbeb6-5651-4e8b-893f-90286f435b8d'
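The same call can be used with the update_event and dry_run flags documented above, for example to attach a delayed label to a previously published event or to validate an event without publishing it. A minimal sketch, reusing event_001 and example_event from the example above:

# Attach a delayed label to the event published above
client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event={'target_column': 1},
    event_id='event_001',
    update_event=True
)

# Validate an event without publishing it
client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event=example_event,
    dry_run=True
)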

client.publish_events_batch

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are CSV (.csv), Parquet (.pq), and pickled DataFrame (.pkl). |
| id_field | Optional [str] | None | The field containing event IDs for events in the batch. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
| update_event | Optional [bool] | None | If True, will only modify existing events, referenced by id_field. If an ID is provided for which there is no event, no change will take place. |
| timestamp_field | Optional [str] | None | The field containing timestamps for events in the batch. If no timestamp is provided for a given row, the current time will be used. |
| data_source | Optional [fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
| casting_type | Optional [bool] | False | If True, will try to cast the data in the events to be in line with the data types defined in the model's ModelInfo object. |
| credentials | Optional [dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
| group_by | Optional [str] | None | The field used to group events together when computing performance metrics (for ranking models only). |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_events = pd.read_csv('events.csv')

client.publish_events_batch(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        batch_source=df_events,
        id_field='event_id',
        timestamp_field='inference_date')

In this example, event_id and inference_date are columns in df_events. Both are optional; if they are not passed, Fiddler generates a unique UUID for each event and uses the current time as the event_timestamp.


PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_to_update = pd.read_csv('events_update.csv')

# event_id is a column in df_to_update
client.publish_events_batch(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        update_event=True,
        batch_source=df_to_update,
        id_field='event_id')

When updating events, id_field is required as the unique identifier of the previously published events. For more details on which columns are eligible to be updated, refer to Updating Events.


| Return Type | Description |
| --- | --- |
| dict | A dictionary object which reports the result of the batch publication. |

{'status': 202,
 'job_uuid': '4ae7bd3a-2b3f-4444-b288-d51e07b6736d',
 'files': ['ssoqj_tmpzmczjuob.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}
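Since batch_source also accepts a file path, events can be published directly from a local file. A sketch using the fdl.BatchPublishType.LOCAL_DISK data source documented above (events.csv is a hypothetical file):

client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source='events.csv',
    data_source=fdl.BatchPublishType.LOCAL_DISK,
    id_field='event_id',
    timestamp_field='inference_date'
)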


Baselines

client.add_baseline

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| baseline_id | string | Yes | The unique identifier for the baseline |
| type | fdl.BaselineType | Yes | One of: PRE_PRODUCTION, STATIC_PRODUCTION, ROLLING_PRODUCTION |
| dataset_id | string | No | Training or validation dataset uploaded to Fiddler, for a PRE_PRODUCTION baseline |
| start_time | int | No | Seconds since epoch to be used as the start time for a STATIC_PRODUCTION baseline |
| end_time | int | No | Seconds since epoch to be used as the end time for a STATIC_PRODUCTION baseline |
| offset | fdl.WindowSize | No | Offset in seconds relative to the current time, for a ROLLING_PRODUCTION baseline |
| window_size | fdl.WindowSize | No | Width of the window in seconds, for a ROLLING_PRODUCTION baseline |

Add a pre-production baseline

from fiddler import BaselineType

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_pre'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.PRE_PRODUCTION,
  dataset_id=DATASET_NAME,
)

Add a static production baseline

from datetime import datetime
from fiddler import BaselineType, WindowSize

start = datetime(2023, 1, 1, 0, 0) # 12 am, 1st Jan 2023
end = datetime(2023, 1, 2, 0, 0) # 12 am, 2nd Jan 2023

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_static'
DATASET_NAME = 'example_dataset'
MODEL_NAME = 'example_model'
START_TIME = start.timestamp()
END_TIME = end.timestamp()


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.STATIC_PRODUCTION,
  start_time=START_TIME,
  end_time=END_TIME,
)

Add a rolling time window baseline

from fiddler import BaselineType, WindowSize

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_rolling'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.ROLLING_PRODUCTION,
  offset=WindowSize.ONE_MONTH, # How far back to set our window
  window_size=WindowSize.ONE_WEEK, # Size of the sliding window
)
| Return Type | Description |
| --- | --- |
| | Baseline schema object with all the configuration parameters |


client.get_baseline

get_baseline retrieves the configuration parameters of an existing baseline.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| baseline_id | string | Yes | The unique identifier for the baseline |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


baseline = client.get_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)
| Return Type | Description |
| --- | --- |
| | Baseline schema object with all the configuration parameters |


client.list_baselines

Gets all the baselines in a project, or those attached to a single model within a project.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | No | The unique identifier for the model |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

# list baselines across all models within a project
client.list_baselines(
  project_id=PROJECT_NAME
)

# list baselines within a model
client.list_baselines(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
)
| Return Type | Description |
| --- | --- |
| | List of baseline config objects |


client.delete_baseline

Deletes an existing baseline from a project

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| baseline_id | string | Yes | The unique identifier for the baseline |

PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


client.delete_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)


Monitoring

client.add_monitoring_config

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| config_info | dict | None | Monitoring config info for an entire org or a project or a model. |
| project_id | Optional [str] | None | The unique identifier for the project. |
| model_id | Optional [str] | None | The unique identifier for the model. |

📘 Info

add_monitoring_config can be applied at the model, project, or organization level.

  • If project_id and model_id are specified, the configuration will be applied at the model level.

  • If project_id is specified but model_id is not, the configuration will be applied at the project level.

  • If neither project_id nor model_id are specified, the configuration will be applied at the organization level.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

monitoring_config = {
    'min_bin_value': 3600,
    'time_ranges': ['Day', 'Week', 'Month', 'Quarter', 'Year'],
    'default_time_range': 7200
}

client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

client.add_alert_rule

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A name for the alert rule |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| alert_type | fdl.AlertType | None | One of AlertType.PERFORMANCE, AlertType.DATA_DRIFT, AlertType.DATA_INTEGRITY, AlertType.SERVICE_METRICS, or AlertType.STATISTIC |
| metric | fdl.Metric | None | When alert_type is AlertType.SERVICE_METRICS, this should be Metric.TRAFFIC. When alert_type is AlertType.PERFORMANCE, choose one of the following based on the ML model task. For binary classification: Metric.ACCURACY, Metric.TPR, Metric.FPR, Metric.PRECISION, Metric.RECALL, Metric.F1_SCORE, Metric.ECE, Metric.AUC. For regression: Metric.R2, Metric.MSE, Metric.MAE, Metric.MAPE, Metric.WMAPE. For multi-class classification: Metric.ACCURACY, Metric.LOG_LOSS. For ranking: Metric.MAP, Metric.MEAN_NDCG. When alert_type is AlertType.DATA_DRIFT, choose Metric.PSI or Metric.JSD. When alert_type is AlertType.DATA_INTEGRITY, choose one of Metric.RANGE_VIOLATION, Metric.MISSING_VALUE, or Metric.TYPE_VIOLATION. When alert_type is AlertType.STATISTIC, choose one of Metric.AVERAGE, Metric.SUM, or Metric.FREQUENCY. |
| bin_size | fdl.BinSize | ONE_DAY | Duration for which the metric value is calculated. One of: BinSize.ONE_HOUR, BinSize.ONE_DAY, BinSize.SEVEN_DAYS |
| compare_to | fdl.CompareTo | None | Whether the metric value is compared against a static value (CompareTo.RAW_VALUE) or the same bin from a previous time period (CompareTo.TIME_PERIOD). |
| compare_period | fdl.ComparePeriod | None | Required only when compare_to is CompareTo.TIME_PERIOD. One of: ComparePeriod.ONE_DAY, ComparePeriod.SEVEN_DAYS, ComparePeriod.ONE_MONTH, ComparePeriod.THREE_MONTHS |
| priority | fdl.Priority | None | One of: Priority.LOW, Priority.MEDIUM, Priority.HIGH |
| warning_threshold | float | None | [Optional] Threshold value which, when crossed, triggers a warning-severity alert. This should be a decimal which represents a percentage (e.g. 0.45). |
| critical_threshold | float | None | Threshold value which, when crossed, triggers a critical-severity alert. This should be a decimal which represents a percentage (e.g. 0.45). |
| condition | fdl.AlertCondition | None | Specifies whether the rule should trigger when the metric is greater than or less than the thresholds. AlertCondition.LESSER or AlertCondition.GREATER |
| notifications_config | Dict[str, Dict[str, Any]] | None | [Optional] Notifications config object created using the helper method build_notifications_config() |
| columns | List[str] | None | Column names on which the alert rule is to be created. Applicable only when alert_type is AlertType.DATA_INTEGRITY or AlertType.DATA_DRIFT. When alert_type is AlertType.DATA_INTEGRITY, it can take [ANY] to check all columns. |
| baseline_id | str | None | Name of the baseline whose histogram is compared against the one derived from current data. If no baseline_id is specified, the default baseline is used. Used only when alert_type is AlertType.DATA_DRIFT. |
| segment | str | None | The segment to alert on. See Segments for more details. |

📘 Info

The Fiddler client can be used to create a variety of alert rules. Rules can be of Data Drift, Performance, Data Integrity, or Service Metrics types, and they can be compared to absolute values (compare_to = RAW_VALUE) or to relative values (compare_to = TIME_PERIOD).

# To add a Performance type alert rule which triggers an email notification
# when the precision metric is 5% higher than that of the same 1-hour bin one day ago.

import fiddler as fdl

notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)
client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE,
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)

# To add a Data Integrity type alert rule which triggers an email notification when
# published events have more than 5 null values in any 1-hour bin for the age column.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl

client.add_alert_rule(
    name = "age-null-1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    column = "age",
    notifications_config = notifications_config
)
# To add a Data Drift type alert rule which triggers an email notification when
# the PSI metric for the 'age' column over a 1-hour bin crosses the thresholds,
# computed against the 'baseline_name' baseline.

import fiddler as fdl

client.add_baseline(project_id='project-a',
                    model_id='model-a',
                    baseline_id='baseline_name',
                    type=fdl.BaselineType.PRE_PRODUCTION,
                    dataset_id='dataset-a')

notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)

client.add_alert_rule(
    name = "psi-gt-5prec-age-baseline_name",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.PSI,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age"],
    baseline_id = 'baseline_name'
)
# To add a Data Drift type alert rule which triggers an email notification when
# the value of the JSD metric is more than 0.5 in any 1-hour bin for the age or gender columns.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl
notifications_config = client.build_notifications_config(
    emails = "user_1@abc.com, user_2@abc.com",
)

client.add_alert_rule(
    name = "jsd_multi_col_1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.JSD,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.4,
    critical_threshold = 0.5,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age", "gender"],
)
# To add a Data Integrity type alert rule which triggers an email notification when
# published events have more than 5 percent null values in any 1-hour bin for the age column.

import fiddler as fdl

client.add_alert_rule(
    name = "age_null_percentage_greater_than_10",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = 'null_violation_percentage',
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    column = "age",
    notifications_config = notifications_config
)
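An alert rule can also be scoped to a segment via the segment parameter documented above. A sketch, assuming a segment named 'over_50' has already been created for this model (see Segments):

# To scope a Data Integrity alert rule to a pre-existing segment.
client.add_alert_rule(
    name = "age-null-1hr-over-50",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    bin_size = fdl.BinSize.ONE_HOUR,
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    columns = ["age"],
    segment = 'over_50',  # hypothetical segment name
    notifications_config = notifications_config
)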
| Return Type | Description |
| --- | --- |
| AlertRule | The created AlertRule object |

Example responses:

[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]
AlertRule(alert_rule_uuid='e1aefdd5-ef22-4e81-b869-3964eff8b5cd',
organization_name='some_org_name',
project_id='project-a',
model_id='model-a',
name='age-null-1hr',
alert_type=AlertType.DATA_INTEGRITY,
metric=Metric.MISSING_VALUE,
column='age',
priority=Priority.HIGH,
compare_to=CompareTo.RAW_VALUE,
compare_period=None,
warning_threshold=5,
critical_threshold=10,
condition=AlertCondition.GREATER,
bin_size=BinSize.ONE_HOUR)
AlertRule(alert_rule_uuid='e1aefdd5-ef22-4e81-b869-3964eff8b5cd',
organization_name='some_org_name',
project_id='project-a',
model_id='model-a',
name='psi-gt-5prec-age-baseline_name',
alert_type=AlertType.DATA_DRIFT,
metric=Metric.PSI,
priority=Priority.HIGH,
compare_to=CompareTo.RAW_VALUE,
compare_period=None,
warning_threshold=5,
critical_threshold=10,
condition=AlertCondition.GREATER,
bin_size=BinSize.ONE_HOUR,
columns=['age'],
baseline_id='baseline_name')
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.DRIFT,
           metric=Metric.JSD,
           priority=Priority.HIGH,
           compare_to=CompareTo.RAW_VALUE,
           compare_period=ComparePeriod.ONE_HOUR,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.4,
           critical_threshold=0.5,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR,
           columns=['age', 'gender'])]

client.get_alert_rules

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | Optional [str] | None | A unique identifier for the project. |
| model_id | Optional [str] | None | A unique identifier for the model. |
| alert_type | Optional[fdl.AlertType] | None | Alert type. One of: AlertType.PERFORMANCE, AlertType.DATA_DRIFT, AlertType.DATA_INTEGRITY, or AlertType.SERVICE_METRICS |
| metric | Optional[fdl.Metric] | None | When alert_type is SERVICE_METRICS: Metric.TRAFFIC. When alert_type is PERFORMANCE, choose based on the ML model task. For binary classification: one of Metric.ACCURACY, Metric.TPR, Metric.FPR, Metric.PRECISION, Metric.RECALL, Metric.F1_SCORE, Metric.ECE, Metric.AUC. For regression: one of Metric.R2, Metric.MSE, Metric.MAE, Metric.MAPE, Metric.WMAPE. For multi-class classification: Metric.ACCURACY, Metric.LOG_LOSS. For ranking: Metric.MAP, Metric.MEAN_NDCG. When alert_type is DATA_DRIFT: Metric.PSI or Metric.JSD. When alert_type is DATA_INTEGRITY: one of Metric.RANGE_VIOLATION, Metric.MISSING_VALUE, Metric.TYPE_VIOLATION |
| columns | Optional[List[str]] | None | [Optional] List of column names on which the alert rule was created. Note: alert rules matching any column from this list will be returned. |
| offset | Optional[int] | None | Pointer to the start of the page index |
| limit | Optional[int] | None | Number of records to be retrieved per page, also referred to as page_size |
| ordering | Optional[List[str]] | None | List of alert rule fields to order by, e.g. ['critical_threshold'] or ['-critical_threshold'] for descending order. |

📘 Info

The Fiddler client can be used to get a list of alert rules with respect to the filtering parameters.


import fiddler as fdl

alert_rules = client.get_alert_rules(
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    columns = ["age", "gender"],
    ordering = ['critical_threshold'],  # ['-critical_threshold'] for descending order
    limit = 4,  # number of rules to retrieve per page
    offset = 0,  # page offset (a multiple of limit)
)
| Return Type | Description |
| --- | --- |
| List[AlertRule] | A list containing AlertRule objects returned by the query. |


client.get_triggered_alerts

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| alert_rule_uuid | str | None | The unique system-generated identifier for the alert rule. |
| start_time | Optional[datetime] | 7 days ago | Start time to filter triggered alerts, in yyyy-MM-dd format, inclusive. |
| end_time | Optional[datetime] | today | End time to filter triggered alerts, in yyyy-MM-dd format, inclusive. |
| offset | Optional[int] | None | Pointer to the start of the page index |
| limit | Optional[int] | None | Number of records to be retrieved per page, also referred to as page_size |
| ordering | Optional[List[str]] | None | List of fields to order by, e.g. ['alert_time_bucket'] or ['-alert_time_bucket'] for descending order. |

📘 Info

The Fiddler client can be used to get a list of triggered alerts for a given alert rule and time duration.


triggered_alerts = client.get_triggered_alerts(
    alert_rule_uuid = "588744b2-5757-4ae9-9849-1f4e076a58de",
    start_time = "2022-05-01",
    end_time = "2022-09-30",
    ordering = ['alert_time_bucket'],  # ['-alert_time_bucket'] for descending order
    limit = 4,  # number of records to retrieve per page
    offset = 0,  # page offset
)
| Return Type | Description |
| --- | --- |
| List[TriggeredAlerts] | A list containing TriggeredAlerts objects returned by the query. |


client.delete_alert_rule

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| alert_rule_uuid | str | None | The unique system-generated identifier for the alert rule. |

📘 Info

The Fiddler client can be used to delete a given alert rule.


client.delete_alert_rule(
    alert_rule_uuid = "588744b2-5757-4ae9-9849-1f4e076a58de",
)
| Return Type | Description |
| --- | --- |
| None | |


client.build_notifications_config

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| emails | Optional[str] | None | Comma-separated list of emails |
| pagerduty_services | Optional[str] | None | Comma-separated list of PagerDuty services |
| pagerduty_severity | Optional[str] | None | Severity for the alerts triggered by PagerDuty |
| webhooks | Optional[List[str]] | None | List of valid UUIDs of available webhooks |

📘 Info

The Fiddler client can be used to build notification configuration to be used while creating alert rules.


notifications_config = client.build_notifications_config(
    emails = "name@abc.com",
)
notifications_config = client.build_notifications_config(
  emails = "name1@abc.com,name2@email.com",
  pagerduty_services = 'pd_service_1',
  pagerduty_severity = 'critical'
)
notifications_config = client.build_notifications_config(
    webhooks = ["894d76e8-2268-4c2e-b1c7-5561da6f84ae", "3814b0ac-b8fe-4509-afc9-ae86c176ef13"]
)
| Return Type | Description |
| --- | --- |
| Dict[str, Dict[str, Any]] | A dict containing email and PagerDuty settings. Fields left unused are stored as empty strings. |

Example Response:

{'emails': {'email': 'name@abc.com'}, 'pagerduty': {'service': '', 'severity': ''}, 'webhooks': []}

client.add_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A unique name for the webhook. |
| url | str | None | The webhook URL used for sending notification messages. |
| provider | str | None | The platform that provides the webhook functionality. Only 'SLACK' is supported. |


client.add_webhook(
    name='range_violation_channel',
    url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
    provider='SLACK'
)
| Return Type | Description |
| --- | --- |
| Webhook | Details of the webhook created. |

Example responses:

Webhook(uuid='df2397d3-23a8-4eb3-987a-2fe43b758b08',
        name='range_violation_channel', organization_name='some_org_name',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

📘 Add Slack webhook

Use the Slack API reference to generate a webhook for your Slack App


client.delete_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| uuid | str | None | The unique system-generated identifier for the webhook. |


client.delete_webhook(
    uuid = "ffcc2ddf-f896-41f0-bc50-4e7b76bb9ace",
)
| Return Type | Description |
| --- | --- |
| None | |


client.get_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| uuid | str | None | The unique system-generated identifier for the webhook. |


client.get_webhook(
    uuid = "a5f085bc-6772-4eff-813a-bfc20ff71002",
)
| Return Type | Description |
| --- | --- |
| Webhook | Details of the webhook. |

Example responses:

Webhook(uuid='a5f085bc-6772-4eff-813a-bfc20ff71002',
        name='binary_classification_alerts_channel',
        organization_name='some_org',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

client.get_webhooks

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | Optional[int] | 300 | Number of records to be retrieved per page. |
| offset | Optional[int] | 0 | Pointer to the start of the page index. |

response = client.get_webhooks()
| Return Type | Description |
| --- | --- |
| list | A list containing Webhook objects. |

Example Response

[
  Webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b', name='model_1_alerts', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK'),
  Webhook(uuid='bd4d02d7-d1da-44d7-b194-272b4351cff7', name='drift_alerts_channel', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK'),
  Webhook(uuid='761da93b-bde2-4c1f-bb17-bae501abd511', name='project_1_alerts', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK')
]

client.update_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A unique name for the webhook. |
| url | str | None | The webhook URL used for sending notification messages. |
| provider | str | None | The platform that provides the webhook functionality. Only 'SLACK' is supported. |
| uuid | str | None | The unique system-generated identifier for the webhook. |

client.update_webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b',
                      name='drift_violation',
                      url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
                      provider='SLACK')
| Return Type | Description |
| --- | --- |
| Webhook | Details of the webhook after modification. |

Example Response:

Webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b',
        name='drift_violation', organization_name='some_org_name',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

client.update_alert_notification_status

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| notification_status | bool | None | The notification status to set for the selected alerts. |
| alert_config_ids | Optional[List[str]] | None | List of alert rule IDs to update. |
| model_id | Optional[str] | None | The model ID for which all alert rules should be updated. |

📘 Info

The Fiddler client can be used to update the notification status of multiple alerts at once.


updated_alert_configs = client.update_alert_notification_status(
    notification_status = True,
    model_id = "9f8180d3-3fa0-40c4-8656-b9b1d2de1b69",
)
updated_alert_configs = client.update_alert_notification_status(
    notification_status = True,
    alert_config_ids = ["9b8711fa-735e-4a72-977c-c4c8b16543ae"],
)
| Return Type | Description |
| --- | --- |
| List[AlertRule] | List of alert rules updated by this method. |

Example responses:

[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


Custom Metrics

client.get_custom_metric

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| metric_id | string | Yes | The unique identifier for the custom metric |

METRIC_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

custom_metric = client.get_custom_metric(
  metric_id=METRIC_ID
)
| Return Type | Description |
| --- | --- |
| fiddler.schema.custom_metric.CustomMetric | Custom metric object with details about the metric |


client.get_custom_metrics

| Input Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| project_id | string | - | Yes | The unique identifier for the project |
| model_id | string | - | Yes | The unique identifier for the model |
| limit | Optional[int] | 300 | No | Maximum number of items to return |
| offset | Optional[int] | 0 | No | Number of items to skip before returning |

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

custom_metrics = client.get_custom_metrics(
  project_id=PROJECT_ID,
  model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| List[fiddler.schema.custom_metric.CustomMetric] | List of custom metric objects for the given model |


client.add_custom_metric

For details on supported constants, operators, and functions, see Fiddler Query Language.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the custom metric |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| definition | string | Yes | The FQL metric definition for the custom metric |
| description | string | No | A description of the custom metric |

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

definition = """
    average(if(Prediction < 0.5 and Target == 1, -40, if(Prediction >= 0.5 and Target == 0, -400, 250)))
"""

client.add_custom_metric(
    name='Loan Value',
    description='A custom value score assigned to a loan',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)
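As another illustration using the same FQL constructs (average and if) as the definition above, the following sketch defines a thresholded-accuracy-style metric; the Prediction and Target column names are assumed to exist in the model schema:

definition = """
    average(if(Prediction >= 0.5, if(Target == 1, 1, 0), if(Target == 0, 1, 0)))
"""

client.add_custom_metric(
    name='Thresholded Accuracy',
    description='Fraction of events where the thresholded prediction matches the target',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)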

client.delete_custom_metric

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| metric_id | string | Yes | The unique identifier for the custom metric |

METRIC_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

client.delete_custom_metric(
  metric_id=METRIC_ID
)


Segments

client.get_segment

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| segment_id | string | Yes | The unique identifier for the segment |

SEGMENT_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

segment = client.get_segment(
  segment_id=SEGMENT_ID
)
| Return Type | Description |
| --- | --- |
| fdl.Segment | Segment object with details about the segment |


client.get_segments

| Input Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| project_id | string | - | Yes | The unique identifier for the project |
| model_id | string | - | Yes | The unique identifier for the model |
| limit | Optional[int] | 300 | No | Maximum number of items to return |
| offset | Optional[int] | 0 | No | Number of items to skip before returning |

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

segments = client.get_segments(
  project_id=PROJECT_ID,
  model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| List[fdl.Segment] | List of segment objects for the given model |


client.add_segment

For details on supported constants, operators, and functions, see Fiddler Query Language.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the segment |
| project_id | string | Yes | The unique identifier for the project |
| model_id | string | Yes | The unique identifier for the model |
| definition | string | Yes | The FQL definition for the segment |
| description | string | No | A description of the segment |

PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

definition = """
    age > 50
"""

client.add_segment(
    name='Over 50',
    description='All people over the age of 50',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)
Segment(
  id='50a1c32d-c2b4-4faf-9006-f4aeadd7a859',
  name='Over 50',
  project_name='my_project',
  organization_name='mainbuild',
  definition='age > 50',
  description='All people over the age of 50',
  created_at=None,
  created_by=None
)

client.delete_segment

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| segment_id | string | Yes | The unique identifier for the segment |

SEGMENT_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

client.delete_segment(
  segment_id=SEGMENT_ID
)


Explainability

client.get_predictions

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| input_df | pd.DataFrame | None | A pandas DataFrame containing model input vectors as rows. |
| chunk_size | Optional[int] | 10000 | The chunk size for fetching predictions. The default is 10,000 rows per chunk. |

import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

input_df = pd.read_csv('example_data.csv')

# Example without chunk size specified:
predictions = client.get_predictions(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_df=input_df,
)


# Example with chunk size specified:
predictions = client.get_predictions(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_df=input_df,
    chunk_size=1000,
)
| Return Type | Description |
| --- | --- |
| pd.DataFrame | A pandas DataFrame containing model predictions for the given input vectors. |


client.get_explanation

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| input_data_source | | None | Type of data source for the input dataset to compute the explanation on (RowDataSource, EventIdDataSource). Only single-row explanations are currently supported. |
| ref_data_source | | None | Type of data source for the reference data to compute the explanation on (DatasetDataSource, SqlSliceQueryDataSource). Only used for non-text models and the following methods: 'SHAP', 'FIDDLER_SHAP', 'PERMUTE', 'MEAN_RESET' |
| explanation_type | Optional[str] | 'FIDDLER_SHAP' | Explanation method name. Can be a custom explanation method or one of the following methods: 'SHAP', 'FIDDLER_SHAP', 'IG', 'PERMUTE', 'MEAN_RESET', 'ZERO_RESET' |
| num_permutations | Optional[int] | 300 | For Fiddler SHAP, the number of coalitions to sample to estimate the Shapley values of each single-reference game. For the permutation algorithms, the number of permutations from the dataset to use for the computation. |
| ci_level | Optional[float] | 0.95 | The confidence level (between 0 and 1). |
| top_n_class | Optional[int] | None | For multi-class classification models only. Specifies whether only the top n classes are computed, or all classes (when the parameter is None). |

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# FIDDLER SHAP - Dataset reference data source
# df is a pandas DataFrame of model inputs
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=300),
    explanation_type='FIDDLER_SHAP',
    num_permutations=200,
    ci_level=0.95,
)

# FIDDLER SHAP - Slice ref data source
row = df.to_dict(orient='records')[0]
query = f'SELECT * from {DATASET_ID}.{MODEL_ID} WHERE sulphates >= 0.8'
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=100),
    explanation_type='FIDDLER_SHAP',
    num_permutations=200,
    ci_level=0.95,
)

# FIDDLER SHAP - Multi-class classification (top classes)
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID),
    explanation_type='FIDDLER_SHAP',
    top_n_class=2
)

# IG (not available by default; needs to be enabled via package.py)
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    explanation_type='IG',
)
Return TypeDescription

tuple

A named tuple with the explanation results.
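
The exact fields of the result depend on the explanation method used. Assuming the result is a standard Python named tuple as documented, one way to discover its fields is _asdict(), as in this sketch:

explanation = client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=300),
    explanation_type='FIDDLER_SHAP',
)

# Inspect which fields the named tuple carries
print(explanation._asdict().keys())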


client.get_feature_impact

Input ParameterTypeDefaultDescription

project_id

str

None

A unique identifier for the project.

model_id

str

None

A unique identifier for the model.

data_source

None

The data source for the input dataset to compute feature impact on (DatasetDataSource or SqlSliceQueryDataSource).

num_iterations

Optional[int]

10000

The maximum number of ablated model inferences per feature. Used for TABULAR data only.

num_refs

Optional[int]

10000

Number of reference points used in the explanation. Used for TABULAR data only.

ci_level

Optional[float]

0.95

The confidence level (between 0 and 1). Used for TABULAR data only.

output_columns

Optional[List[str]]

None

Only used for NLP (TEXT input) models. The output column names to compute feature impact on. Useful for multi-class classification models. If None, feature impact is computed for all output columns.

min_support

Optional[int]

15

Only used for NLP (TEXT input) models. Specifies the minimum support (the number of times a specific word appears in the sample data) required to retrieve top words.

overwrite_cache

Optional[bool]

False

Whether to overwrite cached feature impact values.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Feature Impact for TABULAR data - Dataset Data Source
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TABULAR data - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TEXT data
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=50),
    output_columns=['probability_A', 'probability_B'],
    min_support=30
)
Return TypeDescription

tuple

A named tuple with the feature impact results.


client.get_feature_importance

Input ParameterTypeDefaultDescription

project_id

str

None

A unique identifier for the project.

model_id

str

None

A unique identifier for the model.

data_source

None

The data source for the input dataset to compute feature importance on (DatasetDataSource or SqlSliceQueryDataSource).

num_iterations

Optional[int]

10000

The maximum number of ablated model inferences per feature.

num_refs

Optional[int]

10000

Number of reference points used in the explanation.

ci_level

Optional[float]

0.95

The confidence level (between 0 and 1).

overwrite_cache

Optional[bool]

False

Whether to overwrite cached feature importance values.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'


# Feature Importance - Dataset data source
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Importance - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)
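
Feature importance values are cached; per the overwrite_cache parameter above, you can force a fresh computation. A sketch:

# Recompute feature importance and overwrite the cached values
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    overwrite_cache=True,
)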
Return TypeDescription

tuple

A named tuple with the feature importance results.


client.get_mutual_information

Input ParameterTypeDefaultDescription

project_id

str

None

A unique identifier for the project.

dataset_id

str

None

A unique identifier for the dataset.

query

str

None

The slice query to compute mutual information on.

column_name

str

None

The name of the column to compute mutual information for, with respect to all the columns in the dataset.

normalized

Optional[bool]

False

If set to True, computes normalized mutual information.

num_samples

Optional[int]

10000

Number of samples to select for computation.

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
mutual_info = client.get_mutual_information(
  project_id=PROJECT_ID,
  dataset_id=DATASET_ID,
  query=query,
  column_name='Geography',
  normalized=True,
  num_samples=20000,
)
Return TypeDescription

dict

A dictionary with the mutual information results.
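
A quick way to inspect the result is to iterate the dictionary directly. This sketch assumes the dictionary maps result names to values; the exact nesting may vary by version:

# Print each entry of the mutual information results
for key, value in mutual_info.items():
    print(key, value)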



Analytics

client.get_slice

Input ParameterTypeDefaultDescription

sql_query

str

None

The SQL query used to retrieve the slice.

project_id

str

None

The unique identifier for the project. The model and/or the dataset to be queried within the project are designated in the sql_query itself.

columns_override

Optional [list]

None

A list of columns to include in the slice, even if they aren't specified in the query.

import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)
import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "production.{MODEL_ID}" """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)
Return TypeDescription

pd.DataFrame

A pandas DataFrame containing the slice returned by the query.

📘 Info

Only read-only SQL operations are supported. Certain SQL operations like aggregations and joins might not result in a valid slice.
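
Any aggregation that can't be expressed in the slice query itself can be performed client-side on the returned DataFrame instead. A minimal sketch; 'Geography' is a hypothetical column name:

# Summarize the slice client-side with pandas
print(slice_df.describe())
print(slice_df['Geography'].value_counts())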



Fairness

client.get_fairness

🚧 Only binary classification models with categorical protected attributes are currently supported.

Input ParameterTypeDefaultDescription

project_id

str

None

The unique identifier for the project.

model_id

str

None

The unique identifier for the model.

data_source

None

DataSource for the input dataset to compute fairness on (DatasetDataSource or SqlSliceQueryDataSource).

protected_features

list[str]

None

A list of protected features.

positive_outcome

Union[str, int, float, bool]

None

Value of the positive outcome (from the target column) for Fairness analysis.

score_threshold

Optional [float]

0.5

The score threshold used to calculate model outcomes.

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Fairness - Dataset data source
fairness_metrics = client.get_fairness(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    protected_features=['feature_1', 'feature_2'],
    positive_outcome='Approved',
    score_threshold=0.6
)

# Fairness - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
fairness_metrics = client.get_fairness(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=200),
    protected_features=['feature_1', 'feature_2'],
    positive_outcome='Approved',
    score_threshold=0.6
)
Return TypeDescription

dict

A dictionary containing fairness metric results.



Access Control

client.list_org_roles

🚧 Warning

Only administrators can use client.list_org_roles() .

client.list_org_roles()
Return TypeDescription

dict

A dictionary of users and their roles in the organization.

{
    'members': [
        {
            'id': 1,
            'user': 'admin@example.com',
            'email': 'admin@example.com',
            'isLoggedIn': True,
            'firstName': 'Example',
            'lastName': 'Administrator',
            'imageUrl': None,
            'settings': {'notifyNews': True,
                'notifyAccount': True,
                'sliceTutorialCompleted': True},
            'role': 'ADMINISTRATOR'
        },
        {
            'id': 2,
            'user': 'user@example.com',
            'email': 'user@example.com',
            'isLoggedIn': True,
            'firstName': 'Example',
            'lastName': 'User',
            'imageUrl': None,
            'settings': {'notifyNews': True,
                'notifyAccount': True,
                'sliceTutorialCompleted': True},
            'role': 'MEMBER'
        }
    ],
    'invitations': [
        {
            'id': 3,
            'user': 'newuser@example.com',
            'role': 'MEMBER',
            'invited': True,
            'link': 'http://app.fiddler.ai/signup/vSQWZkt3FP--pgzmuYe_-3-NNVuR58OLZalZOlvR0GY'
        }
    ]
}
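
Using the structure shown above, a short sketch that lists each member's email and organization role:

response = client.list_org_roles()

# Print each member's email and role
for member in response['members']:
    print(member['email'], member['role'])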

client.list_project_roles

Input ParameterTypeDefaultDescription

project_id

str

None

The unique identifier for the project.

PROJECT_ID = 'example_project'

client.list_project_roles(
    project_id=PROJECT_ID
)
Return TypeDescription

dict

A dictionary of users and their roles for the specified project.

{
    'roles': [
        {
            'user': {
                'email': 'admin@example.com'
            },
            'team': None,
            'role': {
                'name': 'OWNER'
            }
        },
        {
            'user': {
                'email': 'user@example.com'
            },
            'team': None,
            'role': {
                'name': 'READ'
            }
        }
    ]
}
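
A sketch that flattens the response above into a simple email-to-role mapping. It assumes entries shared with a team rather than a user have 'user' set to None, so those are skipped:

response = client.list_project_roles(project_id=PROJECT_ID)

# Build a {email: role_name} mapping for user entries
email_to_role = {
    entry['user']['email']: entry['role']['name']
    for entry in response['roles']
    if entry['user'] is not None
}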

client.list_teams

client.list_teams()
Return TypeDescription

dict

A dictionary containing information about teams and users.

{
    'example_team': {
        'members': [
            {
                'user': 'admin@example.com',
                'role': 'MEMBER'
            },
            {
                'user': 'user@example.com',
                'role': 'MEMBER'
            }
        ]
    }
}

client.share_project

📘 Info

Administrators can share any project with any user. If you lack the required permissions to share a project, contact your organization administrator.

Input ParameterTypeDefaultDescription

project_name

str

None

The unique identifier for the project.

role

str

None

The permissions role being shared. Can be one of: 'READ', 'WRITE', 'OWNER'.

user_name

Optional [str]

None

A username with which the project will be shared. Typically an email address.

team_name

Optional [str]

None

A team with which the project will be shared.

PROJECT_ID = 'example_project'

client.share_project(
    project_name=PROJECT_ID,
    role='READ',
    user_name='user@example.com'
)
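
A project can also be shared with an entire team by passing team_name instead of user_name. A sketch, assuming a team named 'example_team' exists:

client.share_project(
    project_name=PROJECT_ID,
    role='WRITE',
    team_name='example_team'
)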

client.unshare_project

📘 Info

Administrators and project owners can unshare any project with any user. If you lack the required permissions to unshare a project, contact your organization administrator.

Input ParameterTypeDefaultDescription

project_name

str

None

The unique identifier for the project.

role

str

None

The permissions role being revoked. Can be one of: 'READ', 'WRITE', 'OWNER'.

user_name

Optional [str]

None

The username whose access to the project will be revoked. Typically an email address.

team_name

Optional [str]

None

The team whose access to the project will be revoked.

PROJECT_ID = 'example_project'

client.unshare_project(
    project_name=PROJECT_ID,
    role='READ',
    user_name='user@example.com'
)


Fiddler Objects

fdl.DatasetInfo

For information on how to customize these objects, see Customizing Your Dataset Schema.

Input ParametersTypeDefaultDescription

display_name

str

None

A display name for the dataset.

columns

list

None

A list of fdl.Column objects containing information about the columns.

files

Optional [list]

None

A list of strings pointing to CSV files to use.

dataset_id

Optional [str]

None

The unique identifier for the dataset

**kwargs

Additional arguments to be passed.

columns = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    ),
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

dataset_info = fdl.DatasetInfo(
    display_name='Example Dataset',
    columns=columns
)

fdl.DatasetInfo.from_dataframe

Input ParametersTypeDefaultDescription

df

Union[pd.DataFrame, list]

Either a single pandas DataFrame or a list of DataFrames. If a list is given, all dataframes must have the same columns.

display_name

str

' '

A display name for the dataset.

max_inferred_cardinality

Optional [int]

100

If specified, any string column containing fewer than max_inferred_cardinality unique values will be converted to a categorical data type.

dataset_id

Optional [str]

None

The unique identifier for the dataset

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)
Return TypeDescription

fdl.DatasetInfo

A fdl.DatasetInfo() object constructed from the pandas DataFrame(s) provided.
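
As the df parameter description notes, you can also pass a list of DataFrames with identical columns, which is convenient when the data is split across files. A sketch with hypothetical file names:

import pandas as pd

# Both frames must share the same columns
df_train = pd.read_csv('example_train.csv')
df_test = pd.read_csv('example_test.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=[df_train, df_test],
    max_inferred_cardinality=100
)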


fdl.DatasetInfo.from_dict

Input ParametersTypeDefaultDescription

deserialized_json

dict

The dictionary object to be converted

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()

new_dataset_info = fdl.DatasetInfo.from_dict(
    deserialized_json={
        'dataset': dataset_info_dict
    }
)
Return TypeDescription

fdl.DatasetInfo

A fdl.DatasetInfo() object constructed from the dictionary.


fdl.DatasetInfo.to_dict

Return TypeDescription

dict

A dictionary containing information from the fdl.DatasetInfo() object.

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()
{
    'name': 'Example Dataset',
    'columns': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'output_column',
            'data-type': 'float'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'files': []
}

fdl.ModelInfo

Input ParametersTypeDefaultDescription

display_name

str

A display name for the model.

input_type

fdl.ModelInputType

A ModelInputType object containing the input type of the model.

model_task

fdl.ModelTask

A ModelTask object containing the model task.

inputs

list

A list of Column objects corresponding to the inputs (features) of the model.

outputs

list

A list of Column objects corresponding to the outputs (predictions) of the model.

metadata

Optional [list]

None

A list of Column objects corresponding to any metadata fields.

decisions

Optional [list]

None

A list of Column objects corresponding to any decision fields (post-prediction business decisions).

targets

Optional [list]

None

A list of Column objects corresponding to the targets (ground truth) of the model.

framework

Optional [str]

None

A string providing information about the software library and version used to train and run this model.

description

Optional [str]

None

A description of the model.

datasets

Optional [list]

None

A list of the dataset IDs used by the model.

mlflow_params

Optional [fdl.MLFlowParams]

None

A MLFlowParams object containing information about MLFlow parameters.

model_deployment_params

Optional [fdl.ModelDeploymentParams]

None

A ModelDeploymentParams object containing information about model deployment.

artifact_status

Optional [fdl.ArtifactStatus]

None

An ArtifactStatus object containing information about the model artifact.

preferred_explanation_method

Optional [fdl.ExplanationMethod]

None

An ExplanationMethod object that specifies the default explanation algorithm to use for the model.

custom_explanation_names

Optional [list]

[ ]

A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py.

binary_classification_threshold

Optional [float]

0.5

The threshold used for classifying inferences for binary classifiers.

ranking_top_k

Optional [int]

50

Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG.

group_by

Optional [str]

None

Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG.

fall_back

Optional [dict]

None

A dictionary mapping a column name to custom missing value encodings for that column.

target_class_order

Optional [list]

None

A list denoting the order of classes in the target. This parameter is required in the following cases (a short sketch follows the constructor example below):

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is considered the positive class for your output column by providing a list with two elements. By convention, the 0th element is the negative class and the 1st element is the positive class. If the target is boolean, you don't need to specify this argument; by default Fiddler considers True the positive class. If the target is numerical, you don't need to specify this argument; by default Fiddler considers the higher of the two possible values the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should match the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance. The first element is considered the least relevant grade and the last element the most relevant grade. If the target is numerical, Fiddler considers the smallest value the least relevant grade and the largest value the most relevant grade.

**kwargs

Additional arguments to be passed.

inputs = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    )
]

outputs = [
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    )
]

targets = [
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets
)
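
For a binary classification task with a string-valued target, target_class_order tells Fiddler which label is the positive class. A sketch extending the example above, assuming target_column holds the hypothetical labels 'Rejected' and 'Approved':

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets,
    # By convention, element 0 is the negative class and element 1 the positive class
    target_class_order=['Rejected', 'Approved']
)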

fdl.ModelInfo.from_dataset_info

Input ParametersTypeDefaultDescription

dataset_info

The DatasetInfo object from which to construct the ModelInfo object.

target

str

The column to be used as the target (ground truth).

model_task

None

A ModelTask object containing the model task.

dataset_id

Optional [str]

None

The unique identifier for the dataset.

features

Optional [list]

None

A list of columns to be used as features.

custom_features

Optional[List[CustomFeature]]

None

A list of custom feature definitions for the model. Objects of type Multivariate, Vector, ImageEmbedding, or TextEmbedding (all derived from CustomFeature) can be provided.

metadata_cols

Optional [list]

None

A list of columns to be used as metadata fields.

decision_cols

Optional [list]

None

A list of columns to be used as decision fields.

display_name

Optional [str]

None

A display name for the model.

description

Optional [str]

None

A description of the model.

input_type

Optional [fdl.ModelInputType]

fdl.ModelInputType.TABULAR

A ModelInputType object containing the input type of the model.

outputs

Optional [list]

A list of Column objects corresponding to the outputs (predictions) of the model.

targets

Optional [list]

None

A list of Column objects corresponding to the targets (ground truth) of the model.

model_deployment_params

Optional [fdl.ModelDeploymentParams]

None

A ModelDeploymentParams object containing information about model deployment.

framework

Optional [str]

None

A string providing information about the software library and version used to train and run this model.

datasets

Optional [list]

None

A list of the dataset IDs used by the model.

mlflow_params

Optional [fdl.MLFlowParams]

None

A MLFlowParams object containing information about MLFlow parameters.

preferred_explanation_method

Optional [fdl.ExplanationMethod]

None

An ExplanationMethod object that specifies the default explanation algorithm to use for the model.

custom_explanation_names

Optional [list]

[ ]

A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py.

binary_classification_threshold

Optional [float]

0.5

The threshold used for classifying inferences for binary classifiers.

ranking_top_k

Optional [int]

50

Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG.

group_by

Optional [str]

None

Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG.

fall_back

Optional [dict]

None

A dictionary mapping a column name to custom missing value encodings for that column.

categorical_target_class_details

Optional [Union[list, int, str]]

None

A list denoting the order of classes in the target. This parameter is required in the following cases (a short sketch follows the example below):

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is considered the positive class for your output column. If you provide a single element, it is considered the positive class. Alternatively, you can provide a list with two elements; by convention, the 0th element is the negative class and the 1st element is the positive class. If the target is boolean, you don't need to specify this argument; by default Fiddler considers True the positive class. If the target is numerical, you don't need to specify this argument; by default Fiddler considers the higher of the two possible values the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should match the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance. The first element is considered the least relevant grade and the last element the most relevant grade. If the target is numerical, Fiddler considers the smallest value the least relevant grade and the largest value the most relevant grade.

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)
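
For binary classification with a string target, categorical_target_class_details also accepts a single value naming the positive class. A sketch extending the example above, assuming 'Approved' is one of the hypothetical values of target_column:

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    # A single value is treated as the positive class
    categorical_target_class_details='Approved'
)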
Return TypeDescription

fdl.ModelInfo

A fdl.ModelInfo() object constructed from the fdl.DatasetInfo() object provided.


fdl.ModelInfo.from_dict

Input ParametersTypeDefaultDescription

deserialized_json

dict

The dictionary object to be converted

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()

new_model_info = fdl.ModelInfo.from_dict(
    deserialized_json={
        'model': model_info_dict
    }
)
Return TypeDescription

fdl.ModelInfo

A fdl.ModelInfo() object constructed from the dictionary.


fdl.ModelInfo.to_dict

Return TypeDescription

dict

A dictionary containing information from the fdl.ModelInfo() object.

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()
{
    'name': 'Example Model',
    'input-type': 'structured',
    'model-task': 'binary_classification',
    'inputs': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'outputs': [
        {
            'column-name': 'output_column',
            'data-type': 'float'
        }
    ],
    'datasets': [],
    'targets': [
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'custom-explanation-names': []
}
