Connecting to Fiddler

fdl.FiddlerApi

The Client object is used to communicate with Fiddler. In order to use the client, you'll need to provide authentication details as shown below.

For more information, see Authorizing the Client.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | str | None | The URL used to connect to Fiddler. |
| org_id | str | None | The organization ID for a Fiddler instance. Can be found on the General tab of the Settings page. |
| auth_token | str | None | The authorization token used to authenticate with Fiddler. Can be found on the Credentials tab of the Settings page. |
| proxies | Optional[dict] | None | A dictionary containing proxy URLs. |
| verbose | Optional[bool] | False | If True, client calls will be logged verbosely. |
| verify | Optional[bool] | True | If False, the client will allow self-signed SSL certificates from the Fiddler server environment. If True, the SSL certificates must be signed by a certificate authority (CA). |

🚧

Warning

If verbose is set to True, all information required for debugging will be logged, including the authorization token.

📘

Info

To maximize compatibility, please ensure that your client version matches the server version for your Fiddler instance.

When you connect to Fiddler using the code below, you'll receive a notification if there is a version mismatch between the client and server.

You can install a specific version of fiddler-client using pip:
pip install fiddler-client==X.X.X
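
To confirm which client version you have installed before connecting, you can print the package version (a quick sanity check; fdl.__version__ follows standard Python packaging conventions):

import fiddler as fdl

# Print the installed client version to compare against your server version
print(fdl.__version__)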

import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN
)
import fiddler as fdl

URL = 'https://app.fiddler.ai'
ORG_ID = 'my_org'
AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    verify=False
)
proxies = {
    'http' : 'http://proxy.example.com:1234',
    'https': 'https://proxy.example.com:5678'
}

client = fdl.FiddlerApi(
    url=URL,
    org_id=ORG_ID,
    auth_token=AUTH_TOKEN,
    proxies=proxies
)

If you want to authenticate with Fiddler without passing this information directly into the function call, you can store it in a file named fiddler.ini, which should be stored in the same directory as your notebook or script.

%%writefile fiddler.ini

[FIDDLER]
url = https://app.fiddler.ai
org_id = my_org
auth_token = p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58

client = fdl.FiddlerApi()


Projects

Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).

A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).

For more information on projects, click here.


client.list_projects

response = client.list_projects()
| Return Type | Description |
| --- | --- |
| list | A list containing the project ID string for each project. |
[
  'project_a',
  'project_b',
  'project_c'
]

client.create_project

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
PROJECT_ID = 'example_project'

client.create_project(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary mapping project_name to the project ID string specified, once the project is successfully created. |
{
    'project_name': 'example_project'
}

client.delete_project

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
PROJECT_ID = 'example_project'

client.delete_project(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| bool | A boolean denoting whether deletion was successful. |
True

🚧

Caution

You cannot delete a project without deleting the datasets and the models associated with that project.



Datasets

Datasets (or baseline datasets) are used for making comparisons with production data.

A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.

For more information, see Uploading a Baseline Dataset.

For guidance on how to design a baseline dataset, see Designing a Baseline Dataset.
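
As a minimal sketch of that guidance (the file name and sample size here are illustrative, not part of the Fiddler API), you might sample the baseline from your training data before uploading it:

import pandas as pd

# Hypothetical training set; draw a representative sample to use as the baseline
df_train = pd.read_csv('training_data.csv')
baseline_df = df_train.sample(n=10_000, random_state=42)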


client.list_datasets

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
PROJECT_ID = "example_project"

client.list_datasets(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| list | A list containing the dataset ID string for each dataset in the project. |
[
    'dataset_a',
    'dataset_b',
    'dataset_c'
]

client.upload_dataset

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset | dict | None | A dictionary mapping dataset slice names to pandas DataFrames. |
| dataset_id | str | None | A unique identifier for the dataset. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| info | Optional[fdl.DatasetInfo] | None | The fdl.DatasetInfo() object used to describe the dataset. |
| size_check_enabled | Optional[bool] | True | If True, will issue a warning when a dataset has a large number of rows. |
import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

client.upload_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    dataset={
        'baseline': df
    },
    info=dataset_info
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information about the uploaded dataset. |
{'uuid': '7046dda1-2779-4987-97b4-120e6185cc0b',
 'name': 'Ingestion dataset Upload',
 'info': {'project_name': 'example_model',
  'resource_name': 'acme_data',
  'resource_type': 'DATASET'},
 'status': 'SUCCESS',
 'progress': 100.0,
 'error_message': None,
 'error_reason': None}

client.delete_dataset

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |
PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

client.delete_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
| Return Type | Description |
| --- | --- |
| str | A message confirming that the dataset was deleted. |
'Dataset deleted example_dataset'

🚧

Caution

You cannot delete a dataset without deleting the models associated with that dataset first.


client.get_dataset_info

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |
PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | The fdl.DatasetInfo() object associated with the specified dataset. |
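No example response is shown for this call; as a minimal sketch, you can print the returned object to inspect the schema Fiddler has stored:

# Inspect the schema Fiddler has stored for this dataset
print(dataset_info)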


Models

A model is a representation of your machine learning model. Each model must have an associated dataset to be used as a baseline for monitoring, explainability, and fairness capabilities.

You do not need to upload your model artifact in order to onboard your model, but doing so will significantly improve the quality of explanations generated by Fiddler.


client.add_model

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| dataset_id | str | None | The unique identifier for the dataset. |
| model_info | fdl.ModelInfo | None | A fdl.ModelInfo() object containing information about the model. |
PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

dataset_info = client.get_dataset_info(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID
)

model_task = fdl.ModelTask.BINARY_CLASSIFICATION
model_target = 'target_column'
model_output = 'output_column'
model_features = [
    'feature_1',
    'feature_2',
    'feature_3'
]

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    target=model_target,
    outputs=[model_output],
    model_task=model_task
)

client.add_model(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    model_info=model_info
)
| Return Type | Description |
| --- | --- |
| str | A message confirming that the model was added. |

client.add_model_artifact

📘

Note

Before calling this function, you must have already added a model using add_model.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | str | None | A path to the directory containing all of the model files needed to run the model. |
| deployment_params | Optional[fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. Supported from server version 23.1 and above with the Model Deployment feature enabled. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_artifact(  
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)
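
Since deployment_params is accepted here as well, you can tune the deployment spec when uploading the artifact; a sketch mirroring the add_model_surrogate example below (the resource values are illustrative):

# With deployment_params (server version 23.1+ with Model Deployment enabled)
client.add_model_artifact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)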

client.add_model_surrogate

📘

Note

Before calling this function, you must have already added a model using add_model.

🚧

Warning

Surrogate models are not supported for input_type = fdl.ModelInputType.TEXT

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| deployment_params | Optional[fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.add_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)
| Return Type | Description |
| --- | --- |
| None | Returns None. |

client.delete_model

For more information, see Uploading a Model Artifact.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.delete_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

client.get_model_info

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_info = client.get_model_info(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | The ModelInfo object associated with the specified model. |

client.list_models

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
PROJECT_ID = 'example_project'

client.list_models(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| list | A list containing the string ID of each model. |
[
    'model_a',
    'model_b',
    'model_c'
]

client.register_model

❗️

Not supported with client 2.0 and above

Please use client.add_model() going forward.


client.trigger_pre_computation

❗️

Not supported with client 2.0 and above

This method is called automatically now when calling client.add_model_surrogate() or client.add_model_artifact().


client.update_model

For more information, see Uploading a Model Artifact.

🚧

Warning

This function does not allow for changes in a model's schema. The inputs and outputs to the model must remain the same.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | pathlib.Path | None | A path to the directory containing all of the model files needed to run the model. |
| force_pre_compute | bool | True | If True, re-run precomputation steps for the model. This can also be done manually by calling client.trigger_pre_computation. |
import pathlib

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

model_dir = pathlib.Path('model_dir')

client.update_model(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir=model_dir
)
| Return Type | Description |
| --- | --- |
| bool | A boolean denoting whether the update was successful. |
True

client.update_model_artifact

📘

Note

Before calling this function, you must have already added a model using add_model_surrogate or add_model_artifact.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| model_dir | str | None | A path to the directory containing all of the model files needed to run the model. |
| deployment_params | Optional[fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. Supported from server version 23.1 and above with the Model Deployment feature enabled. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_artifact(  
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    model_dir='model_dir/',
)

client.update_model_package

❗️

Not supported with client 2.0 and above

Please use client.add_model_artifact() going forward.


client.update_model_surrogate

📘

Note

This method cannot replace a model artifact uploaded using add_model_artifact; it can only regenerate a surrogate model.

Use this method to regenerate the surrogate model for an existing model.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| deployment_params | Optional[fdl.DeploymentParams] | None | Deployment parameters object for tuning the model deployment spec. |
| wait | Optional[bool] | True | Whether to wait for the async job to finish (True) or return immediately (False). |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)

# with deployment_params
client.update_model_surrogate(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    deployment_params=fdl.DeploymentParams(cpu=250, memory=500)
)
| Return Type | Description |
| --- | --- |
| None | Returns None. |


Model Deployment

client.get_model_deployment

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

client.get_model_deployment(
    project_id=PROJECT_NAME,
    model_id=MODEL_NAME,
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing all fields related to the model deployment. |
{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/machine-learning:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "user@example.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "user@example.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}

client.update_model_deployment

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| active | Optional[bool] | None | Set False to scale down the model deployment and True to scale it up. |
| replicas | Optional[int] | None | The number of replicas running the model. |
| cpu | Optional[int] | None | The amount of CPU (milli CPUs) reserved per replica. |
| memory | Optional[int] | None | The amount of memory (mebibytes) reserved per replica. |
| wait | Optional[bool] | True | Whether to wait for the async job to finish (True) or not (False). |

Example use cases

  • Horizontal scaling: Model deployments support horizontal scaling via the replicas parameter. This will create multiple Kubernetes pods internally to handle requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    
    # Create 3 Kubernetes pods internally to handle requests
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        replicas=3,
    )
    
  • Vertical scaling: Model deployments support vertical scaling via cpu and memory parameters. Some models might need more memory to load the artifacts into memory or process the requests.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        cpu=500,
        memory=1024,
    )
    
  • Scale down: You may want to scale down model deployments to avoid allocating resources when the model is not in use. Use the active parameter to scale down the deployment.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=False,
    )
    
  • Scale up: This will re-create the model deployment's Kubernetes pods with the resource values available in the database.

    PROJECT_NAME = 'example_project'
    MODEL_NAME = 'example_model'
    
    client.update_model_deployment(
        project_id=PROJECT_NAME,
        model_id=MODEL_NAME,
        active=True,
    )
    
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing all fields related to the model deployment. |

Supported from server version 23.1 and above with Flexible Model Deployment feature enabled.

{
  id: 106548,
  uuid: UUID("123e4567-e89b-12d3-a456-426614174000"),
  model_id: "MODEL_NAME",
  project_id : "PROJECT_NAME",
  organization_id: "ORGANIZATION_NAME",
  artifact_type: "PYTHON_PACKAGE",
  deployment_type: "BASE_CONTAINER",
  active: True,
  image_uri: "md-base/python/machine-learning:1.0.0",
  replicas: 1,
  cpu: 250,
  memory: 512,
  created_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "user@example.com",
  },
  updated_by: {
    id: 4839,
    full_name: "first_name last_name",
    email: "user@example.com",
  },
  created_at: datetime(2023, 1, 27, 10, 9, 39, 793829),
  updated_at: datetime(2023, 1, 30, 17, 3, 17, 813865),
  job_uuid: UUID("539j9630-a69b-98d5-g496-326117174805")
}


Event Publication

Event publication is the process of sending your model's prediction logs, or events, to the Fiddler platform. Using the Fiddler Client, events can be published in batch or streaming mode. Using these events, Fiddler will calculate metrics around feature drift, prediction drift, and model performance. These events are also stored in Fiddler to allow for ad hoc segment analysis. Please read the sections that follow to learn more about how to use the Fiddler Client for event publication.


client.publish_event

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character. |
| event | dict | None | A dictionary mapping field names to field values. Any fields found that are not present in the model's ModelInfo object will be dropped from the event. |
| event_id | Optional[str] | None | A unique identifier for the event. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
| update_event | Optional[bool] | None | If True, will only modify an existing event, referenced by event_id. If no event is found, no change will take place. |
| event_timestamp | Optional[int] | None | The timestamp at which the event took place, in the format given by timestamp_format. If no timestamp is provided, the current time will be used. |
| timestamp_format | Optional[fdl.FiddlerTimestamp] | fdl.FiddlerTimestamp.INFER | The format of the timestamp passed in event_timestamp. Can be one of fdl.FiddlerTimestamp.INFER, fdl.FiddlerTimestamp.EPOCH_MILLISECONDS, fdl.FiddlerTimestamp.EPOCH_SECONDS, or fdl.FiddlerTimestamp.ISO_8601. |
| casting_type | Optional[bool] | False | If True, will try to cast the data in the event to be in line with the data types defined in the model's ModelInfo object. |
| dry_run | Optional[bool] | False | If True, the event will not be published, and instead a report will be generated with information about any problems with the event. Useful for debugging issues with event publishing. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

example_event = {
    'feature_1': 20.7,
    'feature_2': 45000,
    'feature_3': True,
    'output_column': 0.79,
    'target_column': 1
}

client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event=example_event,
    event_id='event_001',
    event_timestamp=1637344470000
)
| Return Type | Description |
| --- | --- |
| str | A string containing a UUID acknowledging that the event was successfully received. |
'66cfbeb6-5651-4e8b-893f-90286f435b8d'
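
Because event_id and update_event are supported, a common pattern is to publish a prediction first and attach the ground-truth label later by updating the same event. A minimal sketch (the label value is illustrative):

# Later, once the ground-truth label is known, update the same event
client.publish_event(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    event={'target_column': 1},
    event_id='event_001',
    update_event=True
)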

client.publish_events_batch

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are CSV (.csv), Parquet (.pq), and pickled DataFrame (.pkl). |
| id_field | Optional[str] | None | The field containing event IDs for events in the batch. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
| update_event | Optional[bool] | None | If True, will only modify existing events, referenced by id_field. If an ID is provided for which there is no event, no change will take place. |
| timestamp_field | Optional[str] | None | The field containing timestamps for events in the batch. The format of these timestamps is given by timestamp_format. If no timestamp is provided for a given row, the current time will be used. |
| timestamp_format | Optional[fdl.FiddlerTimestamp] | fdl.FiddlerTimestamp.INFER | The format of the timestamps passed in timestamp_field. Can be one of fdl.FiddlerTimestamp.INFER, fdl.FiddlerTimestamp.EPOCH_MILLISECONDS, fdl.FiddlerTimestamp.EPOCH_SECONDS, or fdl.FiddlerTimestamp.ISO_8601. |
| data_source | Optional[fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
| casting_type | Optional[bool] | False | If True, will try to cast the data in the events to be in line with the data types defined in the model's ModelInfo object. |
| credentials | Optional[dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
| group_by | Optional[str] | None | The field used to group events together when computing performance metrics (for ranking models only). |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_events = pd.read_csv('events.csv')

client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source=df_events,
    timestamp_field='inference_date'
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary object which reports the result of the batch publication. |
{'status': 202,
 'job_uuid': '4ae7bd3a-2b3f-4444-b288-d51e07b6736d',
 'files': ['ssoqj_tmpzmczjuob.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}
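
Since batch_source also accepts a file path, you can publish directly from a file on disk; a sketch using the documented parameters (the file name and 'event_id' column are illustrative):

client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source='events.csv',          # path instead of a DataFrame
    id_field='event_id',                # column holding per-event IDs
    timestamp_field='inference_date',
    data_source=fdl.BatchPublishType.LOCAL_DISK
)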

client.publish_events_batch_schema

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are CSV (.csv). |
| publish_schema | dict | None | A dictionary used for locating fields within complex or nested data structures. |
| data_source | Optional[fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
| credentials | Optional[dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
| group_by | Optional[str] | None | The field used to group events together when computing performance metrics (for ranking models only). |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

path_to_batch = 'events_batch.avro'

schema = {
    '__static': {
        '__project': PROJECT_ID,
        '__model': MODEL_ID
    },
    '__dynamic': {
        'feature_1': 'features/feature_1',
        'feature_2': 'features/feature_2',
        'feature_3': 'features/feature_3',
        'output_column': 'outputs/output_column',
        'target_column': 'targets/target_column'
    }
}

client.publish_events_batch_schema(
    batch_source=path_to_batch,
    publish_schema=schema
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary object which reports the result of the batch publication. |
{'status': 202,
 'job_uuid': '5ae7bd3a-2b3f-4444-b288-d51e098a01d',
 'files': ['rroqj_tmpzmczjttb.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}


Baselines

client.add_baseline

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project. |
| model_id | string | Yes | The unique identifier for the model. |
| baseline_id | string | Yes | The unique identifier for the baseline. |
| type | fdl.BaselineType | Yes | One of PRE_PRODUCTION, STATIC_PRODUCTION, or ROLLING_PRODUCTION. |
| dataset_id | string | No | Training or validation dataset uploaded to Fiddler, for a PRE_PRODUCTION baseline. |
| start_time | int | No | Seconds since epoch to be used as the start time, for a STATIC_PRODUCTION baseline. |
| end_time | int | No | Seconds since epoch to be used as the end time, for a STATIC_PRODUCTION baseline. |
| offset | fdl.WindowSize | No | Offset in seconds relative to the current time, for a ROLLING_PRODUCTION baseline. |
| window_size | fdl.WindowSize | No | Width of the window in seconds, for a ROLLING_PRODUCTION baseline. |

Add a pre-production baseline

from fiddler import BaselineType

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_pre'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.PRE_PRODUCTION, 
  dataset_id=DATASET_NAME, 
)

Add a static production baseline

from datetime import datetime
from fiddler import BaselineType, WindowSize

start = datetime(2023, 1, 1, 0, 0) # 12 am, 1st Jan 2023
end = datetime(2023, 1, 2, 0, 0) # 12 am, 2nd Jan 2023

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_static'
DATASET_NAME = 'example_dataset'
MODEL_NAME = 'example_model'
START_TIME = int(start.timestamp())
END_TIME = int(end.timestamp())


client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.STATIC_PRODUCTION,
  start_time=START_TIME,
  end_time=END_TIME,
)

Add a rolling time window baseline

from fiddler import BaselineType, WindowSize

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_rolling'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.ROLLING_PRODUCTION,
  offset=WindowSize.ONE_MONTH, # How far back to set our window
  window_size=WindowSize.ONE_WEEK, # Size of the sliding window
)
| Return Type | Description |
| --- | --- |
| fdl.Baseline | Baseline schema object with all the configuration parameters. |

client.get_baseline

get_baseline retrieves the configuration parameters of an existing baseline.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project. |
| model_id | string | Yes | The unique identifier for the model. |
| baseline_id | string | Yes | The unique identifier for the baseline. |
PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


baseline = client.get_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)
| Return Type | Description |
| --- | --- |
| fdl.Baseline | Baseline schema object with all the configuration parameters. |

client.list_baselines

Gets all the baselines in a project, or those attached to a single model within a project.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project. |
| model_id | string | No | The unique identifier for the model. |
PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'

# list baselines across all models within a project
client.list_baselines(
  project_id=PROJECT_NAME
)

# list baselines within a model
client.list_baselines(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
)
| Return Type | Description |
| --- | --- |
| List[fdl.Baseline] | List of baseline config objects. |

client.delete_baseline

Deletes an existing baseline from a project

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The unique identifier for the project. |
| model_id | string | Yes | The unique identifier for the model. |
| baseline_id | string | Yes | The unique identifier for the baseline. |
PROJECT_NAME = 'example_project'
MODEL_NAME = 'example_model'
BASELINE_NAME = 'example_preconfigured'


client.delete_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
)


Monitoring

client.add_monitoring_config

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| config_info | dict | None | Monitoring config info for an entire org, a project, or a model. |
| project_id | Optional[str] | None | The unique identifier for the project. |
| model_id | Optional[str] | None | The unique identifier for the model. |

📘

Info

add_monitoring_config can be applied at the model, project, or organization level.

  • If project_id and model_id are specified, the configuration will be applied at the model level.
  • If project_id is specified but model_id is not, the configuration will be applied at the project level.
  • If neither project_id nor model_id are specified, the configuration will be applied at the organization level.
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

monitoring_config = {
    'min_bin_value': 3600,
    'time_ranges': ['Day', 'Week', 'Month', 'Quarter', 'Year'],
    'default_time_range': 7200
}

client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID,
    model_id=MODEL_ID
)
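
Per the Info note above, omitting model_id applies the same configuration at the project level (and omitting both IDs applies it organization-wide):

# Apply the configuration at the project level instead
client.add_monitoring_config(
    config_info=monitoring_config,
    project_id=PROJECT_ID
)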

client.add_alert_rule

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A name for the alert rule. |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| alert_type | fdl.AlertType | None | One of AlertType.PERFORMANCE, AlertType.DATA_DRIFT, AlertType.DATA_INTEGRITY, AlertType.SERVICE_METRICS, or AlertType.STATISTIC. |
| metric | fdl.Metric | None | When alert_type is AlertType.SERVICE_METRICS, this should be Metric.TRAFFIC. When alert_type is AlertType.PERFORMANCE, choose one of the following based on the ML model task: for binary classification, Metric.ACCURACY, Metric.TPR, Metric.FPR, Metric.PRECISION, Metric.RECALL, Metric.F1_SCORE, Metric.ECE, or Metric.AUC; for regression, Metric.R2, Metric.MSE, Metric.MAE, Metric.MAPE, or Metric.WMAPE; for multi-class classification, Metric.ACCURACY or Metric.LOG_LOSS; for ranking, Metric.MAP or Metric.MEAN_NDCG. When alert_type is AlertType.DATA_DRIFT, choose Metric.PSI or Metric.JSD. When alert_type is AlertType.DATA_INTEGRITY, choose Metric.RANGE_VIOLATION, Metric.MISSING_VALUE, or Metric.TYPE_VIOLATION. When alert_type is AlertType.STATISTIC, choose Metric.AVERAGE, Metric.SUM, or Metric.FREQUENCY. |
| bin_size | fdl.BinSize | ONE_DAY | Duration for which the metric value is calculated. One of BinSize.ONE_HOUR, BinSize.ONE_DAY, or BinSize.SEVEN_DAYS. |
| compare_to | fdl.CompareTo | None | Whether the metric value is compared against a static value or the same bin from a previous time period. One of CompareTo.RAW_VALUE or CompareTo.TIME_PERIOD. |
| compare_period | fdl.ComparePeriod | None | Required only when compare_to is CompareTo.TIME_PERIOD. One of ComparePeriod.ONE_DAY, ComparePeriod.SEVEN_DAYS, ComparePeriod.ONE_MONTH, or ComparePeriod.THREE_MONTHS. |
| priority | fdl.Priority | None | One of Priority.LOW, Priority.MEDIUM, or Priority.HIGH. |
| warning_threshold | float | None | [Optional] Threshold value; crossing it triggers a warning-level severity alert. This should be a decimal which represents a percentage (e.g. 0.45). |
| critical_threshold | float | None | Threshold value; crossing it triggers a critical-level severity alert. This should be a decimal which represents a percentage (e.g. 0.45). |
| condition | fdl.AlertCondition | None | Specifies whether the rule should trigger when the metric is greater than or less than the thresholds. One of AlertCondition.LESSER or AlertCondition.GREATER. |
| notifications_config | Dict[str, Dict[str, Any]] | None | [Optional] Notifications config object created using the helper method build_notifications_config(). |
| columns | List[str] | None | Column names on which the alert rule is to be created. Applicable only when alert_type is AlertType.DATA_INTEGRITY or AlertType.DATA_DRIFT. When alert_type is AlertType.DATA_INTEGRITY, it can take [ANY] to check all columns. |
| baseline_id | str | None | Name of the baseline whose histogram is compared against the one derived from current data. When no baseline_id is specified, the default baseline is used. Used only when alert_type is AlertType.DATA_DRIFT. |
| segment | str | None | The segment to alert on. See Segments for more details. |

📘

Info

The Fiddler client can be used to create a variety of alert rules. Rules can be of Data Drift, Performance, Data Integrity, and Service Metrics types and they can be compared to absolute (compare_to = RAW_VALUE) or to relative values (compare_to = TIME_PERIOD).

# To add a Performance type alert rule which triggers an email notification
# when the precision metric is 5% higher than in the same 1-hour bin one day ago.

import fiddler as fdl

notifications_config = client.build_notifications_config(
    emails = "user1@example.com, user2@example.com",
)
client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE,
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)

# To add a Data Integrity alert rule which triggers an email notification when
# published events have more than 5 null values in any 1-hour bin for the _age_ column.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl

client.add_alert_rule(
    name = "age-null-1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = fdl.Metric.MISSING_VALUE,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    columns = ["age"],
    notifications_config = notifications_config
)
# To add a Data Drift type alert rule which triggers an email notification
# when the PSI metric for the 'age' column over a 1-hour bin crosses 0.05,
# computed against the 'baseline_name' baseline.

import fiddler as fdl

client.add_baseline(project_id='project-a', 
                    model_id='model-a', 
                    baseline_id='baseline_name', 
                    type=fdl.BaselineType.PRE_PRODUCTION, 
                    dataset_id='dataset-a')

notifications_config = client.build_notifications_config(
    emails = "user1@example.com, user2@example.com",
)

client.add_alert_rule(
    name = "psi-gt-5prec-age-baseline_name",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.PSI,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age"],
    baseline_id = 'baseline_name'
)
# To add a Data Drift type alert rule which triggers an email notification when
# the JSD metric value is more than 0.5 in a one-hour bin for the _age_ or _gender_ columns.
# Notice compare_to = fdl.CompareTo.RAW_VALUE.

import fiddler as fdl
notifications_config = client.build_notifications_config(
    emails = "user1@example.com, user2@example.com",
)

client.add_alert_rule(
    name = "jsd_multi_col_1hr",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_DRIFT,
    metric = fdl.Metric.JSD,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.RAW_VALUE,
    warning_threshold = 0.4,
    critical_threshold = 0.5,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config,
    columns = ["age", "gender"],
)
# To add a Data Integrity alert rule which triggers an email notification when
# published events have more than 5 percent null values in any 1-hour bin for the _age_ column.

import fiddler as fdl

client.add_alert_rule(
    name = "age_null_percentage_greater_than_10",
    project_id = 'project-a',
    model_id = 'model-a',
    alert_type = fdl.AlertType.DATA_INTEGRITY,
    metric = 'null_violation_percentage',
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.RAW_VALUE,
    priority = fdl.Priority.HIGH,
    warning_threshold = 5,
    critical_threshold = 10,
    condition = fdl.AlertCondition.GREATER,
    columns = ["age"],
    notifications_config = notifications_config
)
| Return Type | Description |
| --- | --- |
| AlertRule | The created AlertRule object. |

Example responses:

[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]
AlertRule(alert_rule_uuid='e1aefdd5-ef22-4e81-b869-3964eff8b5cd', 
organization_name='some_org_name', 
project_id='project-a', 
model_id='model-a', 
name='age-null-1hr', 
alert_type=AlertType.DATA_INTEGRITY, 
metric=Metric.MISSING_VALUE, 
column='age', 
priority=Priority.HIGH, 
compare_to=CompareTo.RAW_VALUE, 
compare_period=None, 
warning_threshold=5, 
critical_threshold=10, 
condition=AlertCondition.GREATER,
bin_size=BinSize.ONE_HOUR)

AlertRule(alert_rule_uuid='e1aefdd5-ef22-4e81-b869-3964eff8b5cd', 
organization_name='some_org_name', 
project_id='project-a', 
model_id='model-a', 
name='psi-gt-5prec-age-baseline_name', 
alert_type=AlertType.DATA_DRIFT, 
metric=Metric.PSI, 
priority=Priority.HIGH, 
compare_to=CompareTo.RAW_VALUE, 
compare_period=None, 
warning_threshold=0.05, 
critical_threshold=0.1, 
condition=AlertCondition.GREATER,
bin_size=BinSize.ONE_HOUR,
columns=['age'],
baseline_id='baseline_name')
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='jsd_multi_col_1hr',
           alert_type=AlertType.DATA_DRIFT,
           metric=Metric.JSD,
           priority=Priority.HIGH,
           compare_to=CompareTo.RAW_VALUE,
           compare_period=ComparePeriod.ONE_HOUR,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.4,
           critical_threshold=0.5,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR,
           columns=['age', 'gender'])]

client.get_alert_rules

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | Optional[str] | None | A unique identifier for the project. |
| model_id | Optional[str] | None | A unique identifier for the model. |
| alert_type | Optional[fdl.AlertType] | None | Alert type. One of AlertType.PERFORMANCE, AlertType.DATA_DRIFT, AlertType.DATA_INTEGRITY, or AlertType.SERVICE_METRICS. |
| metric | Optional[fdl.Metric] | None | When alert_type is SERVICE_METRICS: Metric.TRAFFIC. When alert_type is PERFORMANCE, choose one of the following based on the ML model task: for binary classification, one of Metric.ACCURACY, Metric.TPR, Metric.FPR, Metric.PRECISION, Metric.RECALL, Metric.F1_SCORE, Metric.ECE, or Metric.AUC; for regression, one of Metric.R2, Metric.MSE, Metric.MAE, Metric.MAPE, or Metric.WMAPE; for multi-class classification, Metric.ACCURACY or Metric.LOG_LOSS; for ranking, Metric.MAP or Metric.MEAN_NDCG. When alert_type is DATA_DRIFT: Metric.PSI or Metric.JSD. When alert_type is DATA_INTEGRITY: one of Metric.RANGE_VIOLATION, Metric.MISSING_VALUE, or Metric.TYPE_VIOLATION. |
| columns | Optional[List[str]] | None | [Optional] List of column names on which the alert rule was created. Note that alert rules matching any column from this list will be returned. |
| offset | Optional[int] | None | Pointer to the start of the page index. |
| limit | Optional[int] | None | Number of records to retrieve per page, also referred to as page_size. |
| ordering | Optional[List[str]] | None | List of Alert Rule fields to order by, e.g. ['critical_threshold'] or ['-critical_threshold'] for descending order. |

📘

Info

The Fiddler client can be used to get a list of alert rules with respect to the filtering parameters.


import fiddler as fdl

alert_rules = client.get_alert_rules(
    project_id = 'project-a',
    model_id = 'model-a', 
    alert_type = fdl.AlertType.DATA_INTEGRITY, 
    metric = fdl.Metric.MISSING_VALUE,
    columns = ["age", "gender"], 
    ordering = ['critical_threshold'], #['-critical_threshold'] for descending
    limit= 4, ## to set number of rules to show in one go
    offset = 0, # page offset (multiple of limit)
)
| Return Type | Description |
| --- | --- |
| List[AlertRule] | A list containing the AlertRule objects returned by the query. |

client.get_triggered_alerts

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| alert_rule_uuid | str | None | The unique system-generated identifier for the alert rule. |
| start_time | Optional[datetime] | 7 days ago | Start time to filter triggered alerts, in yyyy-MM-dd format, inclusive. |
| end_time | Optional[datetime] | today | End time to filter triggered alerts, in yyyy-MM-dd format, inclusive. |
| offset | Optional[int] | None | Pointer to the start of the page index. |
| limit | Optional[int] | None | Number of records to retrieve per page, also referred to as page_size. |
| ordering | Optional[List[str]] | None | List of fields to order by, e.g. ['alert_time_bucket'] or ['-alert_time_bucket'] for descending order. |

📘

Info

The Fiddler client can be used to get a list of triggered alerts for a given alert rule and time duration.


triggered_alerts = client.get_triggered_alerts(
    alert_rule_uuid = "588744b2-5757-4ae9-9849-1f4e076a58de",
    start_time = "2022-05-01",
    end_time = "2022-09-30",
    ordering = ['alert_time_bucket'], # ['-alert_time_bucket'] for descending
    limit = 4, # number of records to show per page
    offset = 0, # page offset
)
| Return Type | Description |
| --- | --- |
| List[TriggeredAlerts] | A list containing the TriggeredAlerts objects returned by the query. |

client.delete_alert_rule

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| alert_rule_uuid | str | None | The unique system-generated identifier for the alert rule. |

📘

Info

The Fiddler client can be used to delete an existing alert rule.


client.delete_alert_rule(
    alert_rule_uuid = "588744b2-5757-4ae9-9849-1f4e076a58de",
)
| Return Type | Description |
| --- | --- |
| None | Returns None. |

client.build_notifications_config

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| emails | Optional[str] | None | Comma-separated list of emails. |
| pagerduty_services | Optional[str] | None | Comma-separated list of PagerDuty services. |
| pagerduty_severity | Optional[str] | None | Severity for the alerts triggered by PagerDuty. |
| webhooks | Optional[List[str]] | None | List of valid UUIDs of available webhooks. |

📘

Info

The Fiddler client can be used to build notification configuration to be used while creating alert rules.


notifications_config = client.build_notifications_config(
    emails = "user@example.com",
)

notifications_config = client.build_notifications_config(
  emails = "user1@example.com,user2@example.com",
  pagerduty_services = 'pd_service_1',
  pagerduty_severity = 'critical'
)

notifications_config = client.build_notifications_config(
    webhooks = ["894d76e8-2268-4c2e-b1c7-5561da6f84ae", "3814b0ac-b8fe-4509-afc9-ae86c176ef13"]
)
| Return Type | Description |
| --- | --- |
| Dict[str, Dict[str, Any]] | A dict with emails and PagerDuty config dicts. Fields left unused will store empty strings. |

Example Response:

{'emails': {'email': 'user@example.com'}, 'pagerduty': {'service': '', 'severity': ''}, 'webhooks': []}

client.add_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A unique name for the webhook. |
| url | str | None | The webhook URL used for sending notification messages. |
| provider | str | None | The platform that provides the webhook functionality. Only 'SLACK' is supported. |

client.add_webhook(
    name='range_violation_channel',
    url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
    provider='SLACK'
)
| Return Type | Description |
| --- | --- |
| fdl.Webhook | Details of the webhook created. |

Example responses:

Webhook(uuid='df2397d3-23a8-4eb3-987a-2fe43b758b08',
        name='range_violation_channel', organization_name='some_org_name',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

📘

Add Slack webhook

Use the Slack API reference to generate a webhook for your Slack App


client.delete_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| uuid | str | None | The unique system-generated identifier for the webhook. |

client.delete_webhook(
    uuid = "ffcc2ddf-f896-41f0-bc50-4e7b76bb9ace",
)
| Return Type | Description |
| --- | --- |
| None | Returns None. |

client.get_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| uuid | str | None | The unique system-generated identifier for the webhook. |

client.get_webhook(
    uuid = "a5f085bc-6772-4eff-813a-bfc20ff71002",
)
| Return Type | Description |
| --- | --- |
| fdl.Webhook | Details of the webhook. |

Example responses:

Webhook(uuid='a5f085bc-6772-4eff-813a-bfc20ff71002',
        name='binary_classification_alerts_channel',
        organization_name='some_org',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

client.get_webhooks

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | Optional[int] | 300 | Number of records to retrieve per page. |
| offset | Optional[int] | 0 | Pointer to the start of the page index. |
response = client.get_webhooks()
| Return Type | Description |
| --- | --- |
| List[fdl.Webhook] | A list containing webhooks. |

Example Response

[
  Webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b', name='model_1_alerts', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK'),
  Webhook(uuid='bd4d02d7-d1da-44d7-b194-272b4351cff7', name='drift_alerts_channel', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK'),
  Webhook(uuid='761da93b-bde2-4c1f-bb17-bae501abd511', name='project_1_alerts', organization_name='some_org', url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d', provider='SLACK')
]

client.update_webhook

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | A unique name for the webhook. |
| url | str | None | The webhook URL used for sending notification messages. |
| provider | str | None | The platform that provides the webhook functionality. Only 'SLACK' is supported. |
| uuid | str | None | The unique system-generated identifier for the webhook. |
client.update_webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b',
                      name='drift_violation',
                      url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
                      provider='SLACK')
| Return Type | Description |
| --- | --- |
| fdl.Webhook | Details of the webhook after modification. |

Example Response:

Webhook(uuid='e20bf4cc-d2cf-4540-baef-d96913b14f1b',
        name='drift_violation', organization_name='some_org_name',
        url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
        provider='SLACK')

client.update_alert_notification_status

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| notification_status | bool | None | The notification status to set for the alerts. |
| alert_config_ids | Optional[List[str]] | None | List of alert IDs to update. |
| model_id | Optional[str] | None | The model ID for which to update all alerts. |

📘

Info

The Fiddler client can be used to update the notification status of multiple alerts at once.


updated_alert_configs = client.update_alert_notification_status(
    notification_status = True,
    model_id = "9f8180d3-3fa0-40c4-8656-b9b1d2de1b69",
)
updated_alert_configs = client.update_alert_notification_status(
    notification_status = True,
    alert_config_ids = ["9b8711fa-735e-4a72-977c-c4c8b16543ae"],
)
| Return Type | Description |
| --- | --- |
| List[AlertRule] | List of alert rules updated by this method. |

Example responses:

[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


Custom Metrics

client.get_custom_metric

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| metric_id | string | Yes | The unique identifier for the custom metric. |
METRIC_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

custom_metric = client.get_custom_metric(
  metric_id=METRIC_ID
)
| Return Type | Description |
| --- | --- |
| fiddler.schema.custom_metric.CustomMetric | Custom metric object with details about the metric. |

client.get_custom_metrics

| Input Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| project_id | string | | Yes | The unique identifier for the project. |
| model_id | string | | Yes | The unique identifier for the model. |
| limit | Optional[int] | 300 | No | Maximum number of items to return. |
| offset | Optional[int] | 0 | No | Number of items to skip before returning. |
PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

custom_metrics = client.get_custom_metrics(
  project_id=PROJECT_ID,
  model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| List[fiddler.schema.custom_metric.CustomMetric] | List of custom metric objects for the given model. |

client.add_custom_metric

For details on supported constants, operators, and functions, see Fiddler Query Language.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the custom metric. |
| project_id | string | Yes | The unique identifier for the project. |
| model_id | string | Yes | The unique identifier for the model. |
| definition | string | Yes | The FQL metric definition for the custom metric. |
| description | string | No | A description of the custom metric. |
PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

definition = """
    average(if(Prediction < 0.5 and Target == 1, -40, if(Prediction >= 0.5 and Target == 0, -400, 250)))
"""

client.add_custom_metric(
    name='Loan Value',
    description='A custom value score assigned to a loan',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)

client.delete_custom_metric

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| metric_id | string | Yes | The unique identifier for the custom metric. |
METRIC_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

client.delete_custom_metric(
  metric_id=METRIC_ID
)


Segments

client.get_segment

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| segment_id | string | Yes | The unique identifier for the segment. |
SEGMENT_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

segment = client.get_segment(
  segment_id=SEGMENT_ID
)
| Return Type | Description |
| --- | --- |
| fdl.Segment | Segment object with details about the segment. |

client.get_segments

| Input Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| project_id | string | | Yes | The unique identifier for the project. |
| model_id | string | | Yes | The unique identifier for the model. |
| limit | Optional[int] | 300 | No | Maximum number of items to return. |
| offset | Optional[int] | 0 | No | Number of items to skip before returning. |
PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

segments = client.get_segments(
  project_id=PROJECT_ID,
  model_id=MODEL_ID
)
| Return Type | Description |
| --- | --- |
| List[fdl.Segment] | List of segment objects for the given model. |

client.add_segment

For details on supported constants, operators, and functions, see Fiddler Query Language.

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the segment. |
| project_id | string | Yes | The unique identifier for the project. |
| model_id | string | Yes | The unique identifier for the model. |
| definition | string | Yes | The FQL definition for the segment. |
| description | string | No | A description of the segment. |
PROJECT_ID = 'my_project'
MODEL_ID = 'my_model'

definition = """
    age > 50
"""

client.add_segment(
    name='Over 50',
    description='All people over the age of 50',
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    definition=definition
)
Segment(
  id='50a1c32d-c2b4-4faf-9006-f4aeadd7a859',
  name='Over 50',
  project_name='my_project',
  organization_name='mainbuild',
  definition='age > 50',
  description='All people over the age of 50',
  created_at=None,
  created_by=None
)

client.delete_segment

| Input Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| segment_id | string | Yes | The unique identifier for the segment. |
SEGMENT_ID = '7d06f905-80b1-4a41-9711-a153cbdda16c'

client.delete_segment(
  segment_id=SEGMENT_ID
)


Explainability

client.get_predictions

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| input_df | pd.DataFrame | None | A pandas DataFrame containing model input vectors as rows. |
| chunk_size | Optional[int] | 10000 | The chunk size for fetching predictions. Defaults to 10,000 rows per chunk. |
import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

input_df = pd.read_csv('example_data.csv')

# Example without chunk size specified:
predictions = client.get_predictions(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_df=input_df,
)


# Example with chunk size specified:
predictions = client.get_predictions(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_df=input_df,
    chunk_size=1000,
)
| Return Type | Description |
| --- | --- |
| pd.DataFrame | A pandas DataFrame containing model predictions for the given input vectors. |

client.get_explanation

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| input_data_source | Union[fdl.RowDataSource, fdl.EventIdDataSource] | None | Type of data source for the input dataset to compute the explanation on (RowDataSource or EventIdDataSource). Only single-row explanations are currently supported. |
| ref_data_source | Optional[Union[fdl.DatasetDataSource, fdl.SqlSliceQueryDataSource]] | None | Type of data source for the reference data to compute the explanation on (DatasetDataSource or SqlSliceQueryDataSource). Only used for non-text models and the following methods: 'SHAP', 'FIDDLER_SHAP', 'PERMUTE', 'MEAN_RESET'. |
| explanation_type | Optional[str] | 'FIDDLER_SHAP' | Explanation method name. Can be your custom explanation method or one of the following methods: 'SHAP', 'FIDDLER_SHAP', 'IG', 'PERMUTE', 'MEAN_RESET', 'ZERO_RESET'. |
| num_permutations | Optional[int] | 300 | For Fiddler SHAP, the number of coalitions to sample to estimate the Shapley values of each single-reference game. For the permutation algorithms, the number of permutations from the dataset to use for the computation. |
| ci_level | Optional[float] | 0.95 | The confidence level (between 0 and 1). |
| top_n_class | Optional[int] | None | For multi-class classification models only, specifies whether only the top n classes are computed or all classes (when the parameter is None). |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# FIDDLER SHAP - Dataset reference data source
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=300),
    explanation_type='FIDDLER_SHAP',
    num_permutations=200,
    ci_level=0.95,
)

# FIDDLER SHAP - Slice ref data source
row = df.to_dict(orient='records')[0]
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE sulphates >= 0.8'
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=100),
    explanation_type='FIDDLER_SHAP',
    num_permutations=200,
    ci_level=0.95,
)

# FIDDLER SHAP - Multi-class classification (top classes)
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID),
    explanation_type='FIDDLER_SHAP',
    top_n_class=2
)

# IG (not available by default; needs to be enabled via package.py)
row = df.to_dict(orient='records')[0]
client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    explanation_type='IG',
)
| Return Type | Description |
| --- | --- |
| tuple | A named tuple with the explanation results. |
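The fields of the named tuple depend on the explanation method used. Assuming the result follows the standard collections.namedtuple interface (an assumption, not documented behavior), you can inspect it generically:

explanation = client.get_explanation(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    input_data_source=fdl.RowDataSource(row=row),
    ref_data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID),
)

# Standard named tuple introspection (assumes a namedtuple-style result).
print(explanation._fields)    # names of the result fields
print(explanation._asdict())  # the full result as a dictionary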

client.get_feature_impact

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| data_source | Union[fdl.DatasetDataSource, fdl.SqlSliceQueryDataSource] | None | Type of data source for the input dataset to compute feature impact on (DatasetDataSource or SqlSliceQueryDataSource). |
| num_iterations | Optional[int] | 10000 | The maximum number of ablated model inferences per feature. Used for TABULAR data only. |
| num_refs | Optional[int] | 10000 | Number of reference points used in the explanation. Used for TABULAR data only. |
| ci_level | Optional[float] | 0.95 | The confidence level (between 0 and 1). Used for TABULAR data only. |
| output_columns | Optional[List[str]] | None | Only used for NLP (TEXT input) models. Output column names to compute feature impact on. Useful for multi-class classification models. If None, computes for all output columns. |
| min_support | Optional[int] | 15 | Only used for NLP (TEXT input) models. The minimum support (number of times a specific word appears in the sample data) required for a word to be included in the top words. |
| overwrite_cache | Optional[bool] | False | Whether to overwrite cached feature impact values. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Feature Impact for TABULAR data - Dataset Data Source
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TABULAR data - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TEXT data
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=50),
    output_columns=['probability_A', 'probability_B'],
    min_support=30
)
| Return Type | Description |
| --- | --- |
| tuple | A named tuple with the feature impact results. |

client.get_feature_importance

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| data_source | Union[fdl.DatasetDataSource, fdl.SqlSliceQueryDataSource] | None | Type of data source for the input dataset to compute feature importance on (DatasetDataSource or SqlSliceQueryDataSource). |
| num_iterations | Optional[int] | 10000 | The maximum number of ablated model inferences per feature. |
| num_refs | Optional[int] | 10000 | Number of reference points used in the explanation. |
| ci_level | Optional[float] | 0.95 | The confidence level (between 0 and 1). |
| overwrite_cache | Optional[bool] | False | Whether to overwrite cached feature importance values. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'


# Feature Importance - Dataset data source
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Importance - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_importance = client.get_feature_importance(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)
| Return Type | Description |
| --- | --- |
| tuple | A named tuple with the feature importance results. |

client.get_mutual_information

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| dataset_id | str | None | A unique identifier for the dataset. |
| query | str | None | Slice query to compute mutual information on. |
| column_name | str | None | The column to compute mutual information against all other columns in the dataset. |
| normalized | Optional[bool] | False | If True, computes Normalized Mutual Information. |
| num_samples | Optional[int] | 10000 | Number of samples to select for computation. |
PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
mutual_info = client.get_mutual_information(
  project_id=PROJECT_ID,
  dataset_id=DATASET_ID,
  query=query,
  column_name='Geography',
  normalized=True,
  num_samples=20000,
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary with the mutual information results. |
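To rank columns by their score, a minimal sketch assuming the returned dictionary maps column names to numeric mutual information values:

# Print columns in descending order of mutual information with 'Geography'
# (assumes the dict maps column names to numeric scores).
for column, score in sorted(mutual_info.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{column}: {score:.4f}')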


Analytics

client.get_slice

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sql_query | str | None | The SQL query used to retrieve the slice. |
| project_id | str | None | The unique identifier for the project. The model and/or the dataset to be queried within the project are designated in the sql_query itself. |
| columns_override | Optional[list] | None | A list of columns to include in the slice, even if they aren't specified in the query. |
import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)
import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "production.{MODEL_ID}" """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)
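Filters can be applied in the query itself. A sketch of a filtered slice, assuming the dataset has a CreditScore column:

import pandas as pd

PROJECT_ID = 'example_project'
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" WHERE CreditScore > 700 """

slice_df = client.get_slice(
    sql_query=query,
    project_id=PROJECT_ID
)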
| Return Type | Description |
| --- | --- |
| pd.DataFrame | A pandas DataFrame containing the slice returned by the query. |

πŸ“˜

Info

Only read-only SQL operations are supported. Certain SQL operations like aggregations and joins might not result in a valid slice.



Fairness

client.get_fairness

🚧

Warning

Only binary classification models with categorical protected attributes are currently supported.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
| model_id | str | None | The unique identifier for the model. |
| data_source | Union[fdl.DatasetDataSource, fdl.SqlSliceQueryDataSource] | None | Data source for the input dataset to compute fairness on (DatasetDataSource or SqlSliceQueryDataSource). |
| protected_features | list[str] | None | A list of protected features. |
| positive_outcome | Union[str, int, float, bool] | None | Value of the positive outcome (from the target column) for fairness analysis. |
| score_threshold | Optional[float] | 0.5 | The score threshold used to calculate model outcomes. |
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Fairness - Dataset data source
fairness_metrics = client.get_fairness(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    protected_features=['feature_1', 'feature_2'],
    positive_outcome='Approved',
    score_threshold=0.6
)

# Fairness - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
fairness_metrics = client.get_fairness(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=200),
    protected_features=['feature_1', 'feature_2'],
    positive_outcome='Approved',
    score_threshold=0.6
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing fairness metric results. |
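The exact metrics returned depend on your model and protected features; to see everything in the response, you can simply iterate over the dictionary:

# List every fairness metric in the response.
for metric_name, value in fairness_metrics.items():
    print(metric_name, value)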


Access Control

client.list_org_roles

🚧

Warning

Only administrators can use client.list_org_roles().

client.list_org_roles()
| Return Type | Description |
| --- | --- |
| dict | A dictionary of users and their roles in the organization. |
{
    'members': [
        {
            'id': 1,
            'user': '[email protected]',
            'email': '[email protected]',
            'isLoggedIn': True,
            'firstName': 'Example',
            'lastName': 'Administrator',
            'imageUrl': None,
            'settings': {'notifyNews': True,
                'notifyAccount': True,
                'sliceTutorialCompleted': True},
            'role': 'ADMINISTRATOR'
        },
        {
            'id': 2,
            'user': '[email protected]',
            'email': '[email protected]',
            'isLoggedIn': True,
            'firstName': 'Example',
            'lastName': 'User',
            'imageUrl': None,
            'settings': {'notifyNews': True,
                'notifyAccount': True,
                'sliceTutorialCompleted': True},
            'role': 'MEMBER'
        }
    ],
    'invitations': [
        {
            'id': 3,
            'user': '[email protected]',
            'role': 'MEMBER',
            'invited': True,
            'link': 'http://app.fiddler.ai/signup/vSQWZkt3FP--pgzmuYe_-3-NNVuR58OLZalZOlvR0GY'
        }
    ]
}
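Based on the response structure shown above, a sketch that collects the email addresses of all administrators:

org_roles = client.list_org_roles()

# Collect administrator emails from the 'members' list shown above.
admin_emails = [
    member['email']
    for member in org_roles['members']
    if member['role'] == 'ADMINISTRATOR'
]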

client.list_project_roles

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | The unique identifier for the project. |
PROJECT_ID = 'example_project'

client.list_project_roles(
    project_id=PROJECT_ID
)
| Return Type | Description |
| --- | --- |
| dict | A dictionary of users and their roles for the specified project. |
{
    'roles': [
        {
            'user': {
                'email': '[email protected]'
            },
            'team': None,
            'role': {
                'name': 'OWNER'
            }
        },
        {
            'user': {
                'email': '[email protected]'
            },
            'team': None,
            'role': {
                'name': 'READ'
            }
        }
    ]
}

client.list_teams

client.list_teams()
| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information about teams and users. |
{
    'example_team': {
        'members': [
            {
                'user': '[email protected]',
                'role': 'MEMBER'
            },
            {
                'user': '[email protected]',
                'role': 'MEMBER'
            }
        ]
    }
}

client.share_project

πŸ“˜

Info

Administrators can share any project with any user. If you lack the required permissions to share a project, contact your organization administrator.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_name | str | None | The unique identifier for the project. |
| role | str | None | The permission role being shared. Must be one of 'READ', 'WRITE', or 'OWNER'. |
| user_name | Optional[str] | None | The username with which the project will be shared. Typically an email address. |
| team_name | Optional[str] | None | The team with which the project will be shared. |
PROJECT_ID = 'example_project'

client.share_project(
    project_name=PROJECT_ID,
    role='READ',
    user_name='[email protected]'
)
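To share a project with a team rather than a single user, pass team_name instead of user_name. A sketch assuming a team named 'example_team' (as in the list_teams response above):

client.share_project(
    project_name=PROJECT_ID,
    role='READ',
    team_name='example_team'
)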

client.unshare_project

πŸ“˜

Info

Administrators and project owners can unshare any project with any user. If you lack the required permissions to unshare a project, contact your organization administrator.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_name | str | None | The unique identifier for the project. |
| role | str | None | The permission role being revoked. Must be one of 'READ', 'WRITE', or 'OWNER'. |
| user_name | Optional[str] | None | The username whose access to the project is being revoked. Typically an email address. |
| team_name | Optional[str] | None | The team whose access to the project is being revoked. |
PROJECT_ID = 'example_project'

client.unshare_project(
    project_name=PROJECT_ID,
    role='READ',
    user_name='[email protected]'
)


Fiddler Objects

fdl.DatasetInfo

For information on how to customize these objects, see Customizing Your Dataset Schema.

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| display_name | str | None | A display name for the dataset. |
| columns | list | None | A list of fdl.Column objects containing information about the columns. |
| files | Optional[list] | None | A list of strings pointing to CSV files to use. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
| **kwargs | - | - | Additional arguments to be passed. |
columns = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    ),
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

dataset_info = fdl.DatasetInfo(
    display_name='Example Dataset',
    columns=columns
)

fdl.DatasetInfo.from_dataframe

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| df | Union[pd.DataFrame, list] | - | Either a single pandas DataFrame or a list of DataFrames. If a list is given, all DataFrames must have the same columns. |
| display_name | str | ' ' | A display name for the dataset. |
| max_inferred_cardinality | Optional[int] | 100 | If specified, any string column containing fewer than max_inferred_cardinality unique values will be converted to a categorical data type. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | A fdl.DatasetInfo() object constructed from the pandas DataFrame provided. |

fdl.DatasetInfo.from_dict

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| deserialized_json | dict | - | The dictionary object to be converted. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()

new_dataset_info = fdl.DatasetInfo.from_dict(
    deserialized_json={
        'dataset': dataset_info_dict
    }
)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | A fdl.DatasetInfo() object constructed from the dictionary. |

fdl.DatasetInfo.to_dict

| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information from the fdl.DatasetInfo() object. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()
{
    'name': 'Example Dataset',
    'columns': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'output_column',
            'data-type': 'float'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'files': []
}

fdl.ModelInfo

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| display_name | str | - | A display name for the model. |
| input_type | fdl.ModelInputType | - | A ModelInputType object containing the input type of the model. |
| model_task | fdl.ModelTask | - | A ModelTask object containing the model task. |
| inputs | list | - | A list of Column objects corresponding to the inputs (features) of the model. |
| outputs | list | - | A list of Column objects corresponding to the outputs (predictions) of the model. |
| metadata | Optional[list] | None | A list of Column objects corresponding to any metadata fields. |
| decisions | Optional[list] | None | A list of Column objects corresponding to any decision fields (post-prediction business decisions). |
| targets | Optional[list] | None | A list of Column objects corresponding to the targets (ground truth) of the model. |
| framework | Optional[str] | None | A string providing information about the software library and version used to train and run this model. |
| description | Optional[str] | None | A description of the model. |
| datasets | Optional[list] | None | A list of the dataset IDs used by the model. |
| mlflow_params | Optional[fdl.MLFlowParams] | None | An MLFlowParams object containing information about MLFlow parameters. |
| model_deployment_params | Optional[fdl.ModelDeploymentParams] | None | A ModelDeploymentParams object containing information about model deployment. |
| artifact_status | Optional[fdl.ArtifactStatus] | None | An ArtifactStatus object containing information about the model artifact. |
| preferred_explanation_method | Optional[fdl.ExplanationMethod] | None | An ExplanationMethod object that specifies the default explanation algorithm to use for the model. |
| custom_explanation_names | Optional[list] | [ ] | A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py. |
| binary_classification_threshold | Optional[float] | 0.5 | The threshold used for classifying inferences for binary classifiers. |
| ranking_top_k | Optional[int] | 50 | Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG. |
| group_by | Optional[str] | None | Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG. |
| fall_back | Optional[dict] | None | A dictionary mapping a column name to custom missing value encodings for that column. |
| target_class_order | Optional[list] | None | A list denoting the order of classes in the target. Required in the cases listed below; see also the example after the code block. |
| **kwargs | - | - | Additional arguments to be passed. |

target_class_order is required in the following cases:

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is considered the positive class for your output column. Provide a list with two elements: by convention, the 0th element is the negative class and the 1st element is the positive class. When your target is boolean, you don't need to specify this argument; by default, Fiddler considers True to be the positive class. When your target is numerical, you don't need to specify this argument; by default, Fiddler considers the higher of the two possible values to be the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should match the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance. The first element is considered the least relevant grade and the last element the most relevant grade. When your target is numerical, Fiddler considers the smallest value to be the least relevant grade and the largest value the most relevant grade.
inputs = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    )
]

outputs = [
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    )
]

targets = [
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets
)
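As referenced in the table above, target_class_order declares the class order when the target is a string. A sketch for a binary task whose target column holds the illustrative labels 'not_fraud' and 'fraud':

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets,
    target_class_order=['not_fraud', 'fraud']  # [negative class, positive class]; illustrative labels
)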

fdl.ModelInfo.from_dataset_info

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| dataset_info | fdl.DatasetInfo | - | The DatasetInfo object from which to construct the ModelInfo object. |
| target | str | - | The column to be used as the target (ground truth). |
| model_task | fdl.ModelTask | None | A ModelTask object containing the model task. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
| features | Optional[list] | None | A list of columns to be used as features. |
| custom_features | Optional[List[CustomFeature]] | None | A list of CustomFeature definitions for the model. Objects of type Multivariate, Vector, ImageEmbedding, or TextEmbedding derived from CustomFeature can be provided. |
| metadata_cols | Optional[list] | None | A list of columns to be used as metadata fields. |
| decision_cols | Optional[list] | None | A list of columns to be used as decision fields. |
| display_name | Optional[str] | None | A display name for the model. |
| description | Optional[str] | None | A description of the model. |
| input_type | Optional[fdl.ModelInputType] | fdl.ModelInputType.TABULAR | A ModelInputType object containing the input type of the model. |
| outputs | Optional[list] | - | A list of Column objects corresponding to the outputs (predictions) of the model. |
| targets | Optional[list] | None | A list of Column objects corresponding to the targets (ground truth) of the model. |
| model_deployment_params | Optional[fdl.ModelDeploymentParams] | None | A ModelDeploymentParams object containing information about model deployment. |
| framework | Optional[str] | None | A string providing information about the software library and version used to train and run this model. |
| datasets | Optional[list] | None | A list of the dataset IDs used by the model. |
| mlflow_params | Optional[fdl.MLFlowParams] | None | An MLFlowParams object containing information about MLFlow parameters. |
| preferred_explanation_method | Optional[fdl.ExplanationMethod] | None | An ExplanationMethod object that specifies the default explanation algorithm to use for the model. |
| custom_explanation_names | Optional[list] | [ ] | A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py. |
| binary_classification_threshold | Optional[float] | 0.5 | The threshold used for classifying inferences for binary classifiers. |
| ranking_top_k | Optional[int] | 50 | Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG. |
| group_by | Optional[str] | None | Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG. |
| fall_back | Optional[dict] | None | A dictionary mapping a column name to custom missing value encodings for that column. |
| categorical_target_class_details | Optional[Union[list, int, str]] | None | Specifies the order of classes in the target. Required in the cases listed below; see also the example after the code block. |

categorical_target_class_details is required in the following cases:

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is considered the positive class for your output column. If you provide a single element, it is considered the positive class. Alternatively, you can provide a list with two elements: by convention, the 0th element is the negative class and the 1st element is the positive class. When your target is boolean, you don't need to specify this argument; by default, Fiddler considers True to be the positive class. When your target is numerical, you don't need to specify this argument; by default, Fiddler considers the higher of the two possible values to be the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should match the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance. The first element is considered the least relevant grade and the last element the most relevant grade. When your target is numerical, Fiddler considers the smallest value to be the least relevant grade and the largest value the most relevant grade.
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)
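As referenced in the table above, categorical_target_class_details identifies the positive class when the target is a string; for binary tasks a single value suffices. A sketch using the illustrative label 'Approved':

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=['feature_1', 'feature_2', 'feature_3'],
    outputs=['output_column'],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    categorical_target_class_details='Approved'  # positive class (illustrative label)
)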
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | A fdl.ModelInfo() object constructed from the fdl.DatasetInfo() object provided. |

fdl.ModelInfo.from_dict

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| deserialized_json | dict | - | The dictionary object to be converted. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()

new_model_info = fdl.ModelInfo.from_dict(
    deserialized_json={
        'model': model_info_dict
    }
)
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | A fdl.ModelInfo() object constructed from the dictionary. |

fdl.ModelInfo.to_dict

| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information from the fdl.ModelInfo() object. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()
{
    'name': 'Example Model',
    'input-type': 'structured',
    'model-task': 'binary_classification',
    'inputs': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'outputs': [
        {
            'column-name': 'output_column',
            'data-type': 'float'
        }
    ],
    'datasets': [],
    'targets': [
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'custom-explanation-names': []
}