fdl.DatasetInfo

For information on how to customize these objects, see Customizing Your Dataset Schema.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| display_name | str | None | A display name for the dataset. |
| columns | list | None | A list of fdl.Column objects containing information about the columns. |
| files | Optional[list] | None | A list of strings pointing to CSV files to use. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
| **kwargs | | | Additional arguments to be passed. |
columns = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    ),
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

dataset_info = fdl.DatasetInfo(
    display_name='Example Dataset',
    columns=columns
)

fdl.DatasetInfo.from_dataframe

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| df | Union[pd.DataFrame, list] | | Either a single pandas DataFrame or a list of DataFrames. If a list is given, all DataFrames must have the same columns. |
| display_name | str | ' ' | A display name for the dataset. |
| max_inferred_cardinality | Optional[int] | 100 | If specified, any string column containing fewer than max_inferred_cardinality unique values will be converted to a categorical data type. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | A fdl.DatasetInfo() object constructed from the pandas DataFrame provided. |
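Because df also accepts a list of DataFrames with identical columns, the schema can be built from several frames at once. A minimal sketch, assuming hypothetical files part_1.csv and part_2.csv that share the same columns:

import pandas as pd

df_list = [
    pd.read_csv('part_1.csv'),  # hypothetical file
    pd.read_csv('part_2.csv'),  # must have the same columns as part_1.csv
]

dataset_info = fdl.DatasetInfo.from_dataframe(df=df_list, max_inferred_cardinality=100)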

fdl.DatasetInfo.from_dict

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| deserialized_json | dict | | The dictionary object to be converted. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()

new_dataset_info = fdl.DatasetInfo.from_dict(
    deserialized_json={
        'dataset': dataset_info_dict
    }
)
| Return Type | Description |
| --- | --- |
| fdl.DatasetInfo | A fdl.DatasetInfo() object constructed from the dictionary. |

fdl.DatasetInfo.to_dict

| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information from the fdl.DatasetInfo() object. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=100)

dataset_info_dict = dataset_info.to_dict()
{
    'name': 'Example Dataset',
    'columns': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'output_column',
            'data-type': 'float'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'files': []
}


fdl.ModelInfo

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| display_name | str | | A display name for the model. |
| input_type | fdl.ModelInputType | | A ModelInputType object containing the input type of the model. |
| model_task | fdl.ModelTask | | A ModelTask object containing the model task. |
| inputs | list | | A list of Column objects corresponding to the inputs (features) of the model. |
| outputs | list | | A list of Column objects corresponding to the outputs (predictions) of the model. |
| metadata | Optional[list] | None | A list of Column objects corresponding to any metadata fields. |
| decisions | Optional[list] | None | A list of Column objects corresponding to any decision fields (post-prediction business decisions). |
| targets | Optional[list] | None | A list of Column objects corresponding to the targets (ground truth) of the model. |
| framework | Optional[str] | None | A string providing information about the software library and version used to train and run this model. |
| description | Optional[str] | None | A description of the model. |
| datasets | Optional[list] | None | A list of the dataset IDs used by the model. |
| mlflow_params | Optional[fdl.MLFlowParams] | None | An MLFlowParams object containing information about MLflow parameters. |
| model_deployment_params | Optional[fdl.ModelDeploymentParams] | None | A ModelDeploymentParams object containing information about model deployment. |
| artifact_status | Optional[fdl.ArtifactStatus] | None | An ArtifactStatus object containing information about the model artifact. |
| preferred_explanation_method | Optional[fdl.ExplanationMethod] | None | An ExplanationMethod object that specifies the default explanation algorithm to use for the model. |
| custom_explanation_names | Optional[list] | [] | A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py. |
| binary_classification_threshold | Optional[float] | 0.5 | The threshold used for classifying inferences for binary classifiers. |
| ranking_top_k | Optional[int] | 50 | Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG. |
| group_by | Optional[str] | None | Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG. |
| fall_back | Optional[dict] | None | A dictionary mapping a column name to custom missing value encodings for that column. |
| target_class_order | Optional[list] | None | A list denoting the order of classes in the target. Required in the cases listed below. |
| **kwargs | | | Additional arguments to be passed. |

target_class_order is required in the following cases (see the sketch after the example below):

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is considered the positive class for your output column by providing a list with two elements. By convention, the 0th element is the negative class and the 1st element is the positive class. If your target is boolean, you don't need to specify this argument; by default, Fiddler considers True the positive class. If your target is numerical, you don't need to specify this argument; by default, Fiddler considers the higher of the two possible values the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should match the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance, from the least relevant grade (first element) to the most relevant grade (last element). If your target is numerical, Fiddler considers the smallest value the least relevant grade and the largest value the most relevant grade.

inputs = [
    fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT
    ),
    fdl.Column(
        name='feature_2',
        data_type=fdl.DataType.INTEGER
    ),
    fdl.Column(
        name='feature_3',
        data_type=fdl.DataType.BOOLEAN
    )
]

outputs = [
    fdl.Column(
        name='output_column',
        data_type=fdl.DataType.FLOAT
    )
]

targets = [
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.INTEGER
    )
]

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets
)
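
If the target column held string labels rather than integers, target_class_order would be required to identify the positive class. A minimal sketch, reusing the inputs and outputs above and assuming hypothetical labels 'no' and 'yes':

targets = [
    fdl.Column(
        name='target_column',
        data_type=fdl.DataType.STRING
    )
]

model_info = fdl.ModelInfo(
    display_name='Example Model',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    inputs=inputs,
    outputs=outputs,
    targets=targets,
    target_class_order=['no', 'yes']  # 0th element: negative class, 1st element: positive class
)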

fdl.ModelInfo.from_dataset_info

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dataset_info | fdl.DatasetInfo | | The DatasetInfo object from which to construct the ModelInfo object. |
| target | str | | The column to be used as the target (ground truth). |
| model_task | fdl.ModelTask | None | A ModelTask object containing the model task. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
| features | Optional[list] | None | A list of columns to be used as features. |
| custom_features | Optional[List[CustomFeature]] | None | A list of CustomFeature definitions for the model. Objects of type Multivariate, VectorFeature, ImageEmbedding, or TextEmbedding derived from CustomFeature can be provided. |
| metadata_cols | Optional[list] | None | A list of columns to be used as metadata fields. |
| decision_cols | Optional[list] | None | A list of columns to be used as decision fields. |
| display_name | Optional[str] | None | A display name for the model. |
| description | Optional[str] | None | A description of the model. |
| input_type | Optional[fdl.ModelInputType] | fdl.ModelInputType.TABULAR | A ModelInputType object containing the input type of the model. |
| outputs | Optional[list] | None | A list of Column objects corresponding to the outputs (predictions) of the model. |
| targets | Optional[list] | None | A list of Column objects corresponding to the targets (ground truth) of the model. |
| model_deployment_params | Optional[fdl.ModelDeploymentParams] | None | A ModelDeploymentParams object containing information about model deployment. |
| framework | Optional[str] | None | A string providing information about the software library and version used to train and run this model. |
| datasets | Optional[list] | None | A list of the dataset IDs used by the model. |
| mlflow_params | Optional[fdl.MLFlowParams] | None | An MLFlowParams object containing information about MLflow parameters. |
| preferred_explanation_method | Optional[fdl.ExplanationMethod] | None | An ExplanationMethod object that specifies the default explanation algorithm to use for the model. |
| custom_explanation_names | Optional[list] | [] | A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py. |
| binary_classification_threshold | Optional[float] | 0.5 | The threshold used for classifying inferences for binary classifiers. |
| ranking_top_k | Optional[int] | 50 | Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG. |
| group_by | Optional[str] | None | Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG. |
| fall_back | Optional[dict] | None | A dictionary mapping a column name to custom missing value encodings for that column. |
| categorical_target_class_details | Optional[Union[list, int, str]] | None | A list denoting the order of classes in the target. Required in the cases listed below. |

categorical_target_class_details is required in the following cases (see the sketch after the return table below):

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is considered the positive class for your output column. If you provide a single element, it is considered the positive class. Alternatively, you can provide a list with two elements: by convention, the 0th element is the negative class and the 1st element is the positive class. If your target is boolean, you don't need to specify this argument; by default, Fiddler considers True the positive class. If your target is numerical, you don't need to specify this argument; by default, Fiddler considers the higher of the two possible values the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should match the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance, from the least relevant grade (first element) to the most relevant grade (last element). If your target is numerical, Fiddler considers the smallest value the least relevant grade and the largest value the most relevant grade.

import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | A fdl.ModelInfo() object constructed from the fdl.DatasetInfo() object provided. |
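
For a string-valued binary target, categorical_target_class_details identifies the positive class, as noted above. A minimal sketch, reusing dataset_info from the example and assuming hypothetical target labels 'no' and 'yes':

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=['output_column'],
    target='target_column',
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
    categorical_target_class_details=['no', 'yes']  # or just 'yes' to mark the positive class
)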

fdl.ModelInfo.from_dict

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| deserialized_json | dict | | The dictionary object to be converted. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()

new_model_info = fdl.ModelInfo.from_dict(
    deserialized_json={
        'model': model_info_dict
    }
)
| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | A fdl.ModelInfo() object constructed from the dictionary. |

fdl.ModelInfo.to_dict

| Return Type | Description |
| --- | --- |
| dict | A dictionary containing information from the fdl.ModelInfo() object. |
import pandas as pd

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

model_info_dict = model_info.to_dict()
{
    'name': 'Example Model',
    'input-type': 'structured',
    'model-task': 'binary_classification',
    'inputs': [
        {
            'column-name': 'feature_1',
            'data-type': 'float'
        },
        {
            'column-name': 'feature_2',
            'data-type': 'int'
        },
        {
            'column-name': 'feature_3',
            'data-type': 'bool'
        },
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'outputs': [
        {
            'column-name': 'output_column',
            'data-type': 'float'
        }
    ],
    'datasets': [],
    'targets': [
        {
            'column-name': 'target_column',
            'data-type': 'int'
        }
    ],
    'custom-explanation-names': []
}


fdl.WeightingParams

Holds weighting information for class-imbalanced models, which can then be passed into a fdl.ModelInfo object. Please note that the use of weighting parameters requires the presence of model outputs in the baseline dataset.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| class_weight | List[float] | None | List of floats representing weights for each of the classes. The length must equal the number of classes. |
| weighted_reference_histograms | bool | True | Flag indicating whether baseline histograms should be weighted when calculating drift metrics. |
| weighted_surrogate_training | bool | True | Flag indicating whether the weighting scheme should be used when training the surrogate model. |
import numpy as np
import pandas as pd
import sklearn.utils
import fiddler as fdl

TARGET_COLUMN = 'target_column'  # assumed name of the target column in the dataset

df = pd.read_csv('example_dataset.csv')
computed_weight = sklearn.utils.class_weight.compute_class_weight(
    class_weight='balanced',
    classes=np.unique(df[TARGET_COLUMN]),
    y=df[TARGET_COLUMN]
).tolist()
weighting_params = fdl.WeightingParams(class_weight=computed_weight)
dataset_info = fdl.DatasetInfo.from_dataframe(df=df)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=['output_column'],
    target='target_column',
    weighting_params=weighting_params,
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)


fdl.ModelInputType

| Enum Value | Description |
| --- | --- |
| fdl.ModelInputType.TABULAR | For tabular models. |
| fdl.ModelInputType.TEXT | For text models. |
model_input_type = fdl.ModelInputType.TABULAR


fdl.ModelTask

Represents supported model tasks

| Enum Value | Description |
| --- | --- |
| fdl.ModelTask.REGRESSION | For regression models. |
| fdl.ModelTask.BINARY_CLASSIFICATION | For binary classification models. |
| fdl.ModelTask.MULTICLASS_CLASSIFICATION | For multiclass classification models. |
| fdl.ModelTask.RANKING | For ranking models. |
| fdl.ModelTask.LLM | For LLM models. |
| fdl.ModelTask.NOT_SET | For other model tasks, or when no model task is specified. |
model_task = fdl.ModelTask.BINARY_CLASSIFICATION


fdl.DataType

Represents supported data types.

| Enum Value | Description |
| --- | --- |
| fdl.DataType.FLOAT | For floats. |
| fdl.DataType.INTEGER | For integers. |
| fdl.DataType.BOOLEAN | For booleans. |
| fdl.DataType.STRING | For strings. |
| fdl.DataType.CATEGORY | For categorical types. |
| fdl.DataType.VECTOR | For vector types. |
data_type = fdl.DataType.FLOAT


fdl.Column

Represents a column of a dataset.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | The name of the column. |
| data_type | fdl.DataType | None | The fdl.DataType object corresponding to the data type of the column. |
| possible_values | Optional[list] | None | A list of unique values used for categorical columns. |
| is_nullable | Optional[bool] | None | If True, missing values are expected in the column. |
| value_range_min | Optional[float] | None | The minimum value used for numeric columns. |
| value_range_max | Optional[float] | None | The maximum value used for numeric columns. |
column = fdl.Column(
    name='feature_1',
    data_type=fdl.DataType.FLOAT,
    value_range_min=0.0,
    value_range_max=80.0
)
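
For a categorical column, possible_values enumerates the allowed categories. A minimal sketch, assuming a hypothetical color column:

categorical_column = fdl.Column(
    name='color',
    data_type=fdl.DataType.CATEGORY,
    possible_values=['red', 'green', 'blue'],
    is_nullable=True  # missing values are expected in this column
)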


fdl.DeploymentParams

Supported from server version 23.1 and above with the Model Deployment feature enabled.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image_uri | Optional[str] | md-base/python/machine-learning:1.0.1 | Reference to the Docker image used to create a new runtime to serve the model. Check the available images on the Model Deployment page. |
| replicas | Optional[int] | 1 | The number of replicas running the model. Minimum value: 1. Maximum value: 10. |
| memory | Optional[int] | 256 | The amount of memory (mebibytes) reserved per replica. Minimum value: 150. Maximum value: 16384 (16 GiB). |
| cpu | Optional[int] | 100 | The amount of CPU (milli CPUs) reserved per replica. Minimum value: 10. Maximum value: 4000 (4 vCPUs). |
deployment_params = fdl.DeploymentParams(
    image_uri="md-base/python/machine-learning:1.1.0",
    cpu=250,
    memory=512,
    replicas=1,
)

📘

What parameters should I set for my model?

Setting the right parameters might not be straightforward, and Fiddler is here to help you.

The parameters may vary depending on the number of input features used, the pre-processing steps used, and the model itself.

The following tables help you define the right parameters:

  1. Surrogate Models guide

| Number of input features | Memory (mebibytes) | CPU (milli CPUs) |
| --- | --- | --- |
| < 10 | 250 (default) | 100 (default) |
| < 20 | 400 | 300 |
| < 50 | 600 | 400 |
| < 100 | 850 | 900 |
| < 200 | 1600 | 1200 |
| < 300 | 2000 | 1200 |
| < 400 | 2800 | 1300 |
| < 500 | 2900 | 1500 |
  2. User Uploaded guide

For uploading your model artifact, refer to the table above and increase the memory, depending on your model framework and complexity. Surrogate models use the LightGBM framework.

For example, an NLP model for a TEXT input might need memory set at 1024 or higher and CPU at 1000.
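
Following that guidance, a deployment for such a TEXT-input NLP model might look like the sketch below; the image and numbers are assumptions that depend on your framework and model complexity:

deployment_params = fdl.DeploymentParams(
    image_uri='md-base/python/machine-learning:1.1.0',  # pick an image from the Model Deployment page
    cpu=1000,     # milli CPUs
    memory=1024,  # mebibytes
    replicas=1,
)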

📘

Usage Reference

Learn more about the Model Deployment feature set.



fdl.ComparePeriod

Required when compare_to = fdl.CompareTo.TIME_PERIOD, this field sets how far back to look when comparing against the same bin from a previous time period. Choose from the following:

| Enum | Value |
| --- | --- |
| fdl.ComparePeriod.ONE_DAY | 86400000 milliseconds (1 day) |
| fdl.ComparePeriod.SEVEN_DAYS | 604800000 milliseconds (7 days) |
| fdl.ComparePeriod.ONE_MONTH | 2629743000 milliseconds (30 days) |
| fdl.ComparePeriod.THREE_MONTHS | 7776000000 milliseconds (90 days) |
import fiddler as fdl

client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE, 
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY, <----
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE, 
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY, <----
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


fdl.AlertCondition

If condition = fdl.AlertCondition.GREATER/LESSER is specified, an alert is triggered every time the metric value is greater/lesser than the specified threshold.

| Enum | Value |
| --- | --- |
| fdl.AlertCondition.GREATER | greater |
| fdl.AlertCondition.LESSER | lesser |
import fiddler as fdl

client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE, 
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER, <-----
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE, <---
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER, <-----
           bin_size=BinSize.ONE_HOUR)]


fdl.CompareTo

Whether the metric value is to be compared against a static value or the same time bin from a previous time period (set using compare_period [fdl.ComparePeriod]).

| Enum | Value |
| --- | --- |
| fdl.CompareTo.RAW_VALUE | When comparing to an absolute value |
| fdl.CompareTo.TIME_PERIOD | When comparing to the same bin size from a previous time period |
import fiddler as fdl

client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'binary_classification_model-a',
    alert_type = fdl.AlertType.PERFORMANCE,
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD, <----
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='binary_classification_model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD, <---
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


fdl.BinSize

This field signifies the duration for which Fiddler monitoring calculates the metric values.

| Enum | Value |
| --- | --- |
| fdl.BinSize.ONE_HOUR | 3600 * 1000 milliseconds (one hour) |
| fdl.BinSize.ONE_DAY | 86400 * 1000 milliseconds (one day) |
| fdl.BinSize.SEVEN_DAYS | 604800 * 1000 milliseconds (seven days) |
import fiddler as fdl

client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE, 
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, <----
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE, 
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)] <-----


fdl.Priority

This field can be used to prioritize alert rules by adding an identifier (low, medium, or high) that helps users categorize them on the basis of their importance. The following are the Priority enums:

| Enum | Value |
| --- | --- |
| fdl.Priority.HIGH | HIGH |
| fdl.Priority.MEDIUM | MEDIUM |
| fdl.Priority.LOW | LOW |
import fiddler as fdl

client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE, 
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH, <---
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE, 
           metric=Metric.PRECISION,
           priority=Priority.HIGH, <----
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


fdl.Metric

Following is the list of metrics, with corresponding alert type and model task, for which an alert rule can be created.

| Enum Value | Supported for Alert Types (ModelTask restriction, if any) | Description |
| --- | --- | --- |
| fdl.Metric.SUM | fdl.AlertType.STATISTIC | Sum of all values of a column across all events |
| fdl.Metric.AVERAGE | fdl.AlertType.STATISTIC | Average value of a column across all events |
| fdl.Metric.FREQUENCY | fdl.AlertType.STATISTIC | Frequency count of a specific value in a categorical column |
| fdl.Metric.PSI | fdl.AlertType.DATA_DRIFT | Population Stability Index |
| fdl.Metric.JSD | fdl.AlertType.DATA_DRIFT | Jensen–Shannon divergence |
| fdl.Metric.MISSING_VALUE | fdl.AlertType.DATA_INTEGRITY | Missing value |
| fdl.Metric.TYPE_VIOLATION | fdl.AlertType.DATA_INTEGRITY | Type violation |
| fdl.Metric.RANGE_VIOLATION | fdl.AlertType.DATA_INTEGRITY | Range violation |
| fdl.Metric.TRAFFIC | fdl.AlertType.SERVICE_METRICS | Traffic count |
| fdl.Metric.ACCURACY | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION, fdl.ModelTask.MULTICLASS_CLASSIFICATION) | Accuracy |
| fdl.Metric.RECALL | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | Recall |
| fdl.Metric.FPR | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | False positive rate |
| fdl.Metric.PRECISION | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | Precision |
| fdl.Metric.TPR | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | True positive rate |
| fdl.Metric.AUC | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | Area under the ROC curve |
| fdl.Metric.F1_SCORE | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | F1 score |
| fdl.Metric.ECE | fdl.AlertType.PERFORMANCE (fdl.ModelTask.BINARY_CLASSIFICATION) | Expected Calibration Error |
| fdl.Metric.R2 | fdl.AlertType.PERFORMANCE (fdl.ModelTask.REGRESSION) | R squared |
| fdl.Metric.MSE | fdl.AlertType.PERFORMANCE (fdl.ModelTask.REGRESSION) | Mean squared error |
| fdl.Metric.MAPE | fdl.AlertType.PERFORMANCE (fdl.ModelTask.REGRESSION) | Mean absolute percentage error |
| fdl.Metric.WMAPE | fdl.AlertType.PERFORMANCE (fdl.ModelTask.REGRESSION) | Weighted mean absolute percentage error |
| fdl.Metric.MAE | fdl.AlertType.PERFORMANCE (fdl.ModelTask.REGRESSION) | Mean absolute error |
| fdl.Metric.LOG_LOSS | fdl.AlertType.PERFORMANCE (fdl.ModelTask.MULTICLASS_CLASSIFICATION) | Log loss |
| fdl.Metric.MAP | fdl.AlertType.PERFORMANCE (fdl.ModelTask.RANKING) | Mean average precision |
| fdl.Metric.MEAN_NDCG | fdl.AlertType.PERFORMANCE (fdl.ModelTask.RANKING) | Normalized discounted cumulative gain |
import fiddler as fdl

client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'binary_classification_model-a',
    alert_type = fdl.AlertType.PERFORMANCE,
    metric = fdl.Metric.PRECISION, <----
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='binary_classification_model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE,
           metric=Metric.PRECISION, <---
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


fdl.AlertType

| Enum Value | Description |
| --- | --- |
| fdl.AlertType.DATA_DRIFT | For the drift alert type |
| fdl.AlertType.PERFORMANCE | For the performance alert type |
| fdl.AlertType.DATA_INTEGRITY | For the data integrity alert type |
| fdl.AlertType.SERVICE_METRICS | For the service metrics alert type |
| fdl.AlertType.STATISTIC | For statistics of a feature |
client.add_alert_rule(
    name = "perf-gt-5prec-1hr-1d-ago",
    project_name = 'project-a',
    model_name = 'model-a',
    alert_type = fdl.AlertType.PERFORMANCE, <---
    metric = fdl.Metric.PRECISION,
    bin_size = fdl.BinSize.ONE_HOUR, 
    compare_to = fdl.CompareTo.TIME_PERIOD,
    compare_period = fdl.ComparePeriod.ONE_DAY,
    warning_threshold = 0.05,
    critical_threshold = 0.1,
    condition = fdl.AlertCondition.GREATER,
    priority = fdl.Priority.HIGH,
    notifications_config = notifications_config
)
[AlertRule(alert_rule_uuid='9b8711fa-735e-4a72-977c-c4c8b16543ae',
           organization_name='some_org_name',
           project_id='project-a',
           model_id='model-a',
           name='perf-gt-5prec-1hr-1d-ago',
           alert_type=AlertType.PERFORMANCE, <---
           metric=Metric.PRECISION,
           priority=Priority.HIGH,
           compare_to=CompareTo.TIME_PERIOD,
           compare_period=ComparePeriod.ONE_DAY,
           compare_threshold=None,
           raw_threshold=None,
           warning_threshold=0.05,
           critical_threshold=0.1,
           condition=AlertCondition.GREATER,
           bin_size=BinSize.ONE_HOUR)]


fdl.WindowSize

| Enum | Value |
| --- | --- |
| fdl.WindowSize.ONE_HOUR | 3600 |
| fdl.WindowSize.ONE_DAY | 86400 |
| fdl.WindowSize.ONE_WEEK | 604800 |
| fdl.WindowSize.ONE_MONTH | 2592000 |
from fiddler import BaselineType, WindowSize

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_rolling'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.ROLLING_PRODUCTION,
  offset=WindowSize.ONE_MONTH, # How far back to set our window
  window_size=WindowSize.ONE_WEEK, # Size of the sliding window
)


fdl.CustomFeatureType

| Enum | Description |
| --- | --- |
| FROM_COLUMNS | Represents custom features derived directly from columns. |
| FROM_VECTOR | Represents custom features derived from a vector column. |
| FROM_TEXT_EMBEDDING | Represents custom features derived from text embeddings. |
| FROM_IMAGE_EMBEDDING | Represents custom features derived from image embeddings. |
| ENRICHMENT | Represents an enrichment custom feature. |
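
Like the other enums above, a value is referenced directly, for example:

custom_feature_type = fdl.CustomFeatureType.FROM_COLUMNS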


fdl.CustomFeature

This is the base class that all other custom features inherit from. It's flexible enough to accommodate different types of derived features. Note: All of the derived feature classes (e.g., Multivariate, VectorFeature, etc.) inherit from CustomFeature and thus have its properties, in addition to their specific ones.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | None | The name of the custom feature. |
| type | CustomFeatureType | None | The type of custom feature. Must be one of the CustomFeatureType enum values. |
| n_clusters | Optional[int] | 5 | The number of clusters. |
| centroids | Optional[List] | None | Centroids of the clusters in the embedded space. The number of centroids equals n_clusters. |
| columns | Optional[List[str]] | None | For the FROM_COLUMNS type, the original columns from which the feature is derived. |
| column | Optional[str] | None | Used for vector-derived features; the original vector column name. |
| source_column | Optional[str] | None | The original column name for embedding-derived features. |
| n_tags | Optional[int] | 5 | For the FROM_TEXT_EMBEDDING type, the number of tags used for each cluster in the TF-IDF summarization during drift computation. |
# use from_columns helper function to generate a custom feature combining multiple numeric columns

feature = fdl.CustomFeature.from_columns(
    name='my_feature',
    columns=['column_1', 'column_2'],
    n_clusters=5
)

fdl.Multivariate

Represents custom features derived from multiple columns.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| columns | List[str] | None | List of original columns from which this feature is derived. |
| n_clusters | Optional[int] | 5 | The number of clusters. |
| centroids | Optional[List] | None | Centroids of the clusters in the embedded space. The number of centroids equals n_clusters. |
| monitor_components | bool | False | Whether to monitor each column in columns as an individual feature. If set to True, components are monitored and drift will be available. |
multivariate_feature = fdl.Multivariate(
    name='multi_feature',
    columns=['column_1', 'column_2']
)
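
To also monitor each component column individually (making per-column drift available), set monitor_components; a sketch using the same hypothetical columns:

multivariate_feature = fdl.Multivariate(
    name='multi_feature',
    columns=['column_1', 'column_2'],
    monitor_components=True  # also monitor column_1 and column_2 as individual features
)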

fdl.VectorFeature

Represents custom features derived from a single vector column.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source_column | Optional[str] | None | Specifies the original column if this feature is derived from an embedding. |
| column | str | None | The vector column name. |
| n_clusters | Optional[int] | 5 | The number of clusters. |
| centroids | Optional[List[List[float]]] | None | Centroids of the clusters in the embedded space. The number of centroids equals n_clusters. |
vector_feature = fdl.VectorFeature(
    name='vector_feature',
    column='vector_column'
)

fdl.TextEmbedding

Represents custom features derived from text embeddings.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source_column | str | Required | The column where the text data (e.g. LLM prompts) is stored. |
| column | str | Required | The column where the embeddings corresponding to source_column are stored. |
| n_tags | Optional[int] | 5 | The number of tags (tokens) per cluster used in the TF-IDF summarization during drift computation. |
| n_clusters | Optional[int] | 5 | The number of clusters. |
| centroids | Optional[List] | None | Centroids of the clusters in the embedded space. The number of centroids equals n_clusters. |
text_embedding_feature = fdl.TextEmbedding(
    name='text_custom_feature',
    source_column='text_column',
    column='text_embedding',
    n_tags=10
)

fdl.ImageEmbedding

Represents custom features derived from image embeddings.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source_column | str | Required | The column containing the URLs where the image data is stored. |
| column | str | Required | The column where the embeddings corresponding to source_column are stored. |
| n_clusters | Optional[int] | 5 | The number of clusters. |
| centroids | Optional[List] | None | Centroids of the clusters in the embedded space. The number of centroids equals n_clusters. |
image_embedding_feature = fdl.ImageEmbedding(
    name='image_feature',
    source_column='image_url',
    column='image_embedding',
)


fdl.Enrichment (beta)

  • Enrichments are custom features designed to augment data provided in events.
  • They add new computed columns to your published data automatically whenever defined.
  • The new columns generated are available for querying in analyze, charting, and alerting, similar to any other column.
| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | | The name of the custom feature to generate. |
| enrichment | str | | The enrichment operation to be applied. |
| columns | List[str] | | The column names on which the enrichment depends. |
| config | Optional[dict] | {} | Configuration specific to the enrichment operation, controlling its behavior. |
# Automatically generating embedding for a column named “question”

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features=[
        fdl.Enrichment(
            name='question_embedding',
            enrichment='embedding',
            columns=['question'],
        ),
        fdl.TextEmbedding(
            name='question_cf',
            source_column='question',
            column='question_embedding',
        ),
    ]
)

Note

Enrichments are disabled by default. To enable them, contact your administrator. Failing to do so will result in an error during the add_model call.


Embedding (beta)

  • Create an embedding for a string column using an embedding model.
  • Supports Sentence transformers and Encoder/Decoder NLP transformers from Hugging Face.
  • To enable, set the enrichment parameter to embedding.
  • For each embedding enrichment, if you want to monitor the embedding vector in Fiddler, you MUST create a corresponding TextEmbedding using the enrichment's output column.

Requirements:

  • Access to the Hugging Face inference endpoint - https://api-inference.huggingface.co
  • A Hugging Face API token

Supported Models:

| model_name | size | Type | pooling_method | Notes |
| --- | --- | --- | --- | --- |
| BAAI/bge-small-en-v1.5 | small | Sentence Transformer | | |
| sentence-transformers/all-MiniLM-L6-v2 | med | Sentence Transformer | | |
| thenlper/gte-base | med | Sentence Transformer | | (default) |
| gpt2 | med | Decoder NLP Transformer | last_token | |
| distilgpt2 | small | Decoder NLP Transformer | last_token | |
| EleutherAI/gpt-neo-125m | med | Decoder NLP Transformer | last_token | |
| google/bert_uncased_L-4_H-256_A-4 | small | Encoder NLP Transformer | first_token | Smallest BERT |
| bert-base-cased | med | Encoder NLP Transformer | first_token | |
| distilroberta-base | med | Encoder NLP Transformer | first_token | |
| xlm-roberta-large | large | Encoder NLP Transformer | first_token | Multilingual |
| roberta-large | large | Encoder NLP Transformer | first_token | |
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name='Question Embedding', # name of the enrichment, will be the vector col
          enrichment='embedding', 
          columns=['question'], # only one allowed per embedding enrichment, must be a text column in dataframe
          config={ # optional
            'model_name': ..., # default: 'thenlper/gte-base'
            'pooling_method': ..., # choose from '{first/last/mean}_token'. Only required if NOT using a sentence transformer
          }
      ),
      fdl.TextEmbedding(
        name='question_cf', # name of the text embedding custom feature
        source_column='question', # source - raw text
        column='Question Embedding', # the name of the vector - output of the embedding enrichment
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Question Embedding (vector): embeddings corresponding to the string column question

Note

In the context of Hugging Face models, particularly transformer-based models used for generating embeddings, the pooling_method determines how the model processes the output of its layers to produce a single vector representation for input sequences (like sentences or documents). This is crucial when using these models for tasks like sentence or document embedding, where you need a fixed-size vector representation regardless of the input length.


Centroid Distance (beta)

  • Fiddler uses a KMeans-based system to determine which cluster a particular CustomFeature belongs to.
  • The Centroid Distance enrichment calculates the distance from the closest centroid calculated by model monitoring.
  • A new numeric column with the distance to the closest centroid is added to the events table.
  • To enable, set the enrichment parameter to centroid_distance.
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
        name='question_embedding',
        enrichment='embedding',
        columns=['question'],
      ),
      fdl.TextEmbedding(
          name='question_cf',
          source_column='question',
          column='question_embedding',
      ),
      fdl.Enrichment(
        name='Centroid Distance',
        enrichment='centroid_distance',
        columns=['question_cf'],
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Centroid Distance (question_embedding) (float): distance from the nearest K-Means centroid present in question_embedding

Note

Does not calculate membership for pre-production data, so you cannot calculate drift.


Personally Identifiable Information (beta)

The PII (Personally Identifiable Information) enrichment is a critical tool designed to detect and flag the presence of sensitive information within textual data. Whether user-entered or system-generated, this enrichment aims to identify instances where PII might be exposed, helping to prevent privacy breaches and the potential misuse of personal data. In an era where digital privacy concerns are paramount, mishandling or unintentionally leaking PII can have serious repercussions, including privacy violations, identity theft, and significant legal and reputational damage.

Regulatory frameworks such as the General Data Protection Regulation (GDPR) in the European Union and the Health Insurance Portability and Accountability Act (HIPAA) in the United States underscore the necessity of safeguarding PII. These laws enforce strict guidelines on the collection, storage, and processing of personal data, emphasizing the need for robust measures to protect sensitive information.

The inadvertent inclusion of PII in datasets used for training or interacting with large language models (LLMs) can exacerbate the risks associated with data privacy. Once exposed to an LLM, sensitive information can be inadvertently learned by the model, potentially leading to wider dissemination of this data beyond intended confines. This scenario underscores the importance of preemptively identifying and removing PII from data before it is processed or shared, particularly in contexts involving AI and machine learning.

To mitigate the risks associated with PII exposure, organizations and developers can integrate the PII enrichment into their data processing workflows. This enrichment operates by scanning text for patterns and indicators of personal information, flagging potentially sensitive data for review or anonymization. By proactively identifying PII, stakeholders can take necessary actions to comply with privacy laws, safeguard individuals' data, and prevent the unintended spread of personal information through AI models and other digital platforms. Implementing PII detection and management practices is not just a legal obligation but a critical component of responsible data stewardship in the digital age.

  • To enable, set the enrichment parameter to pii.

Requirements

  • Reachability to https://github.com/explosion/spacy-models/releases/download/ to download spaCy models as required

List of PII entities

| Entity Type | Description | Detection Method | Example |
| --- | --- | --- | --- |
| CREDIT_CARD | A credit card number is between 12 and 19 digits. https://en.wikipedia.org/wiki/Payment_card_number | Pattern match and checksum | 4111111111111111, 378282246310005 (American Express) |
| CRYPTO | A crypto wallet number. Currently only Bitcoin addresses are supported. | Pattern match, context and checksum | 1BoatSLRHtKNngkdXEeobR76b53LETtpyT |
| DATE_TIME | Absolute or relative dates or periods or times smaller than a day. | Pattern match and context | 01/01/2024 |
| EMAIL_ADDRESS | An email address identifies an email box to which email messages are delivered. | Pattern match, context and RFC-822 validation | [email protected] |
| IBAN_CODE | The International Bank Account Number (IBAN) is an internationally agreed system of identifying bank accounts across national borders to facilitate the communication and processing of cross-border transactions with a reduced risk of transcription errors. | Pattern match, context and checksum | DE89 3704 0044 0532 0130 00 |
| IP_ADDRESS | An Internet Protocol (IP) address (either IPv4 or IPv6). | Pattern match, context and checksum | 1.2.3.4, 127.0.0.12/16, 1234:BEEF:3333:4444:5555:6666:7777:8888 |
| LOCATION | Name of a politically or geographically defined location (cities, provinces, countries, international regions, bodies of water, mountains). | Custom logic and context | PALO ALTO, Japan |
| PERSON | A full person name, which can include first names, middle names or initials, and last names. | Custom logic and context | Joanna Doe |
| PHONE_NUMBER | A telephone number. | Custom logic, pattern match and context | 5556667890 |
| URL | A URL (Uniform Resource Locator), a unique identifier used to locate a resource on the Internet. | Pattern match, context and top-level URL validation | www.fiddler.ai |
| US_DRIVER_LICENSE | A US driver license according to https://ntsi.com/drivers-license-format/ | Pattern match and context | |
| US_ITIN | US Individual Taxpayer Identification Number (ITIN). Nine digits that start with a "9" and contain a "7" or "8" as the 4th digit. | Pattern match and context | 912-34-1234 |
| US_PASSPORT | A US passport number: a letter followed by eight digits. | Pattern match and context | L12345678 |
| US_SSN | A US Social Security Number (SSN) with 9 digits. | Pattern match and context | 001-12-1234 |
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
        name='Rag PII',
        enrichment='pii',
        columns=['question'], # one or more columns
        allow_list=['fiddler'], # Optional: list of strings that are white listed
        score_threshold=0.85, # Optional: float value for minimum possible confidence 
      ),
    ]
)

The above example will lead to generation of new columns:

  1. FDL Rag PII (question) (bool): whether any PII was detected
  2. FDL Rag PII (question) Matches (str): the matches in the raw text that were flagged as potential PII (e.g. 'Douglas MacArthur,Korean')
  3. FDL Rag PII (question) Entities (str): the entities these matches were tagged as (e.g. 'PERSON')

Note

PII enrichment is integrated with Presidio.


Evaluate (beta)

  • Calculates classic metrics for evaluating QA results, like BLEU, ROUGE, and METEOR.
  • To enable, set the enrichment parameter to evaluate.
  • Make sure reference_col and prediction_col are set in the config of the Enrichment.

Here is a summary of the three evaluation metrics for natural language generation:

| Metric | Description | Strengths | Limitations |
| --- | --- | --- | --- |
| bleu | Measures precision of word n-grams between generated and reference texts | Simple, fast, widely used | Ignores recall, meaning, and word order |
| rouge | Measures recall of word n-grams and longest common sequences | Captures more information than BLEU | Still relies on word matching, not semantic similarity |
| meteor | Incorporates recall, precision, and additional semantic matching based on stems and paraphrasing | More robust and flexible than BLEU and ROUGE | Requires linguistic resources and alignment algorithms |
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
        name='QA Evaluate',
        enrichment='evaluate',
        columns=['correct_answer', 'generated_answer'],
        config={
            'reference_col': 'correct_answer', # required
            'prediction_col': 'generated_answer', # required
            'metrics': ..., # optional, default - ['bleu', 'rouge' , 'meteor']
        }
      ),
    ]
)

The above example generates 6 new columns:

  • FDL QA Evaluate (bleu)(float)
  • FDL QA Evaluate (rouge1)(float)
  • FDL QA Evaluate (rouge2)(float)
  • FDL QA Evaluate (rougel)(float)
  • FDL QA Evaluate (rougelsum)(float)
  • FDL QA Evaluate (meteor)(float)

Textstat (beta)

Generates statistics on string columns.

  • To enable, set the enrichment parameter to textstat.

Supported Statistics

| Statistic | Description | Usage |
| --- | --- | --- |
| char_count | Total number of characters in the text, including everything. | Assessing text length; useful for platforms with character limits. |
| letter_count | Total number of letters only, excluding numbers, punctuation, and spaces. | Gauging text complexity; used in readability formulas. |
| miniword_count | Count of small words (usually 1-3 letters). | Specific readability analyses, especially for simplistic texts. |
| words_per_sentence | Average number of words in each sentence. | Understanding sentence complexity and structure. |
| polysyllabcount | Number of words with more than three syllables. | Analyzing text complexity; used in some readability scores. |
| lexicon_count | Total number of words in the text. | General text analysis; assessing overall word count. |
| syllable_count | Total number of syllables in the text. | Used in readability formulas; measures text complexity. |
| sentence_count | Total number of sentences in the text. | Analyzing text structure; used in readability scores. |
| flesch_reading_ease | Readability score indicating how easy a text is to read (higher scores = easier). | Assessing readability for a general audience. |
| smog_index | Measures the years of education needed to understand a text. | Evaluating text complexity, especially for higher-education texts. |
| flesch_kincaid_grade | Grade level associated with the complexity of the text. | Educational settings; determining the appropriate grade level for texts. |
| coleman_liau_index | Grade level needed to understand the text based on sentence length and letter count. | Assessing readability for educational purposes. |
| automated_readability_index | Estimates the grade level needed to comprehend the text. | Evaluating text difficulty for educational materials. |
| dale_chall_readability_score | Assesses text difficulty based on a list of words familiar to average American readers. | Determining text suitability for average readers. |
| difficult_words | Number of words not on a list of commonly understood words. | Analyzing text difficulty, especially for non-native speakers. |
| linsear_write_formula | Readability formula estimating the grade level of the text based on sentence length and easy word count. | Simplifying texts, especially for lower reading levels. |
| gunning_fog | Estimates the years of formal education needed to understand the text. | Assessing text complexity, often for business or professional texts. |
| long_word_count | Number of words longer than a certain length (often 6 or 7 letters). | Evaluating the complexity and sophistication of the language used. |
| monosyllabcount | Count of words with only one syllable. | Readability assessments, particularly for simpler texts. |
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name='Text Statistics',
          enrichment='textstat',
          columns=['question'],
          config={
          'statistics' : [
              'char_count',
              'dale_chall_readability_score',
            ]
          },
      ),
    ]
)

The above example leads to the creation of two additional columns:

  • FDL Text Statistics (question) char_count (int): character count of the string in the question column
  • FDL Text Statistics (question) dale_chall_readability_score (float): readability score of the string in the question column

Sentiment (beta)

Sentiment Analysis enrichment employs advanced natural language processing (NLP) techniques to gauge the emotional tone behind a body of text. This enrichment is designed to determine whether the sentiment of textual content is positive, negative, or neutral, providing valuable insights into the emotions and opinions expressed within. By analyzing the sentiment, this tool offers a powerful means to understand user feedback, market research responses, social media commentary, and any textual data where opinion and mood are significant.

Implementing Sentiment Analysis into your data processing allows for a nuanced understanding of how your audience feels about a product, service, or topic, enabling informed decision-making and strategy development. It's particularly useful in customer service and brand management, where gauging customer sentiment is crucial for addressing concerns, improving user experience, and building brand reputation.

The Sentiment enrichment uses NLTK's VADER lexicon to generate a score and corresponding sentiment for all specified columns. For each string column on which sentiment enrichment is enabled, two additional columns are added. To enable, set the enrichment parameter to sentiment.

Requirements

  • Reachability to www.nltk.org/nltk_data to download the latest VADER lexicon
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name='Question Sentiment',
          enrichment='sentiment',
          columns=['question'],
      ),
    ]
)

The above example leads to the creation of two columns:

  • FDL Question Sentiment (question) compound (float): raw sentiment score
  • FDL Question Sentiment (question) sentiment (string): one of positive, negative, or neutral

Profanity (beta)

The Profanity enrichment is designed to detect and flag the use of offensive or inappropriate language within textual content. This enrichment is essential for maintaining the integrity and professionalism of digital platforms, forums, social media, and any user-generated content areas. It helps ensure that conversations and interactions remain respectful and free from language that could be considered harmful or offensive to users.

In the digital space, where diverse audiences come together, the presence of profanity can lead to negative user experiences, damage brand reputation, and create an unwelcoming environment. Implementing a profanity filter is a proactive measure to prevent such outcomes, promoting a positive and inclusive online community.

Beyond maintaining community standards, the Profanity enrichment has practical implications for compliance with platform guidelines and legal regulations concerning hate speech and online conduct. Many digital platforms have strict policies against the use of profane or offensive language, making it crucial for content creators and moderators to actively monitor and manage such language.

By integrating the Profanity enrichment into their content moderation workflow, businesses and content managers can automate the detection of inappropriate language, significantly reducing manual review efforts. This enrichment not only helps in upholding community guidelines and legal standards but also supports the creation of safer and more respectful online spaces for all users.

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
            name='Profanity',
            enrichment='profanity',
            columns=['prompt', 'response'],
            config={'output_column_name': 'contains_profanity'},
        ),
    ]
)

The above example leads to the creation of two columns:

  • FDL Profanity (prompt) contains_profanity (bool): indicates whether the value of the prompt column contains profanity
  • FDL Profanity (response) contains_profanity (bool): indicates whether the value of the response column contains profanity

Answer Relevance (beta)

Answer Relevance is a specialized enrichment designed to evaluate the pertinence of AI-generated responses to their corresponding prompts. This enrichment operates by assessing whether the content of a response accurately addresses the question or topic posed by the initial prompt, providing a simple yet effective binary outcome: relevant or not relevant. Its primary function is to ensure that the output of AI systems, such as chatbots, virtual assistants, and content generation models, remains aligned with the user's informational needs and intentions.

In the context of AI-generated content, ensuring relevance is crucial for maintaining user engagement and trust. Irrelevant or tangentially related responses can lead to user frustration, decreased satisfaction, and diminished trust in the AI's capabilities. The Answer Relevance metric serves as a critical checkpoint, verifying that interactions and content deliveries meet the expected standards of accuracy and pertinence.

This enrichment finds its application across a wide range of AI-driven platforms and services where the quality of the response directly impacts the user experience. From customer service bots answering inquiries to educational tools providing study assistance, the ability to automatically gauge the relevance of responses enhances the effectiveness and reliability of these services.

Incorporating the Answer Relevance enrichment into the development and refinement of AI models enables creators to iteratively improve their systems based on relevant feedback. By identifying instances where the model generates non-relevant responses, developers can adjust and fine-tune their models to better meet user expectations. This continuous improvement cycle is essential for advancing the quality and utility of AI-generated content, ensuring that it remains focused, accurate, and highly relevant to users' needs.

  • To enable, set the enrichment parameter to answer_relevance.

Requirements:

  • This enrichment requires access to the OpenAI API, which may introduce latency due to network communication and processing time. Learn more about LLM based enrichments.
  • An OpenAI API access token must be provided by the user.
answer_relevance_config = {
  'prompt' : 'prompt_col',
  'response' : 'response_col',
}

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name = 'Answer Relevance',
          enrichment = 'answer_relevance', 
          columns = ['prompt_col', 'response_col'],
          config = answer_relevance_config,
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Answer Relevance: Binary metric, which is True if the response is relevant to the prompt

Faithfulness (beta)

The Faithfulness (Groundedness) enrichment is a binary indicator designed to evaluate the accuracy and reliability of facts presented in AI-generated text responses. It specifically assesses whether the information used in the response correctly aligns with and is grounded in the provided context, often in the form of referenced documents or data. This enrichment plays a critical role in ensuring that the AI's outputs are not only relevant but also factually accurate, based on the context it was given.

In practical applications, such as automated content creation, customer support, and informational queries, the Faithfulness (Groundedness) metric serves as a safeguard against the dissemination of misinformation. It verifies that the AI system's responses are not only generated with a high level of linguistic fluency but also reflect a true and correct use of the available information (retrieved documents).

This enrichment is particularly important in fields where accuracy is paramount, such as in educational content, medical advice, or factual reporting. By implementing the Faithfulness (Groundedness) metric, developers and researchers can enhance the trustworthiness of AI-generated content, ensuring that users receive responses that are not only contextually relevant but also factually sound. The effectiveness of this enrichment hinges on its ability to critically analyze the alignment between the generated content and the context provided, promoting a higher standard of reliability in AI-generated outputs.

  • To enable, set the enrichment parameter to faithfulness.

Requirements:

  • This enrichment requires access to the OpenAI API, which may introduce latency due to network communication and processing time. Learn more about LLM based enrichments.
  • An OpenAI API access token must be provided by the user.
faithfulness_config = {
  'context' : ['doc_0', 'doc_1', 'doc_2'],
  'response' : 'response_col',
}

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name = 'Faithfulness',
          enrichment = 'faithfulness', 
          columns = ['doc_0', 'doc_1', 'doc_2', 'response_col'],
          config = faithfulness_config,
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Faithfulness: Binary metric, which is True if the facts used in the response are correctly drawn from the context columns.

Coherence (beta)

The Coherence enrichment assesses the logical flow and clarity of AI-generated text responses, ensuring they are structured in a way that makes sense from start to finish. This enrichment is crucial for evaluating whether the content produced by AI maintains a consistent theme, argument, or narrative, without disjointed thoughts or abrupt shifts in topic. Coherence is key to making AI-generated content not only understandable but also engaging and informative for the reader.

In applications ranging from storytelling and article generation to customer service interactions, coherence determines the effectiveness of communication. A coherent response builds trust in the AI's capabilities, as it demonstrates an understanding of not just language, but also context and the natural progression of ideas. This enrichment encourages AI systems to produce content that flows naturally, mimicking the way a knowledgeable human would convey information or tell a story.

For developers, integrating the Coherence enrichment into AI evaluation processes is essential for achieving outputs that resonate with human readers. It helps in fine-tuning AI models to produce content that not only answers questions or provides information but does so in a way that is logical and easy to follow. By prioritizing coherence, AI-generated texts can better serve their intended purpose, whether to inform, persuade, or entertain, enhancing the overall quality and impact of AI communications.

  • To enable, set the enrichment parameter to coherence.

Requirements:

  • This enrichment requires access to the OpenAI API, which may introduce latency due to network communication and processing time. Learn more about LLM based enrichments.
  • An OpenAI API access token must be provided by the user.
coherence_config = {
  'response' : 'response_col',
}

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name = 'Coherence',
          enrichment = 'coherence', 
          columns = ['response_col'],
          config = coherence_config,
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Coherence: Binary metric, which is True if the response makes coherent arguments that flow well.

Conciseness (beta)

The Conciseness enrichment evaluates the brevity and clarity of AI-generated text responses, ensuring that the information is presented in a straightforward and efficient manner. This enrichment identifies and rewards responses that effectively communicate their message without unnecessary elaboration or redundancy. In the realm of AI-generated content, where verbosity can dilute the message's impact or confuse the audience, maintaining conciseness is crucial for enhancing readability and user engagement.

Implementing the Conciseness metric can significantly improve the user experience across various applications, from chatbots and virtual assistants to summarization tools and content generation platforms. It encourages the AI to distill information down to its essence, providing users with clear, to-the-point answers that satisfy their queries or needs without overwhelming them with superfluous details.

For developers and content creators, the Conciseness enrichment serves as a valuable tool for refining the output of AI systems, aligning them more closely with human preferences for communication that is both efficient and effective. By prioritizing conciseness, AI-generated content can become more accessible and useful, meeting the high standards of users who value quick and accurate information delivery. This enrichment, therefore, plays a pivotal role in the ongoing effort to enhance the quality and utility of AI-generated text, making it an indispensable component of AI evaluation frameworks.

  • To enable, set the enrichment parameter to conciseness.

Requirements:

  • This enrichment requires access to the OpenAI API, which may introduce latency due to network communication and processing time. Learn more about LLM based enrichments.
  • An OpenAI API access token must be provided by the user.
conciseness_config = {
  'response' : 'response_col',
}

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name = 'Conciseness',
          enrichment = 'conciseness', 
          columns = ['response_col'],
          config = conciseness_config,
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Conciseness: Binary metric, which is True if the response is concise and not overly verbose.

Toxicity (beta)

The Toxicity enrichment classifies whether a piece of text is toxic. A RoBERTa-based model is fine-tuned on a mix of toxic and non-toxic data. The model predicts a score between 0 and 1, where scores closer to 1 indicate toxicity.

The following table shows the model's performance on Toxic-Chat, a real-world dataset that contains user prompts and responses in a chatbot setting.

Dataset | PR-AUC | Precision | Recall
Toxic-Chat | 0.4 | 0.64 | 0.24
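
These numbers can be recomputed for any labeled dataset using standard metrics. Below is a minimal sketch with scikit-learn; the y_true labels and y_prob probabilities are hypothetical placeholders, and the 0.5 decision threshold is an assumption rather than the model's documented cutoff.

import numpy as np
from sklearn.metrics import average_precision_score, precision_score, recall_score

# Hypothetical ground-truth labels and model probabilities, for illustration only
y_true = np.array([0, 1, 0, 1, 1])
y_prob = np.array([0.1, 0.8, 0.4, 0.3, 0.9])

pr_auc = average_precision_score(y_true, y_prob)  # PR-AUC
y_pred = (y_prob >= 0.5).astype(int)              # assumed 0.5 decision threshold
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)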

Usage

The code snippet shows how to enable toxicity scoring on the prompt and response columns for each event published to Fiddler.

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
            name='Toxicity',
            enrichment='toxicity',
            columns=['prompt', 'response'],
        ),
    ]
)

The above example leads to the creation of two columns each for prompt and response, containing the prediction probability and the model decision.

For example, for the prompt column, the following two columns will be generated:

  • FDL Toxicity (prompt) toxicity_prob (float): model prediction probability between 0 and 1
  • FDL Toxicity (prompt) contains_toxicity (bool): model decision, either 0 or 1

Regex Match (beta)

The Regex Match enrichment is designed to evaluate text responses or content based on their adherence to specific patterns defined by regular expressions (regex). By accepting a regex as input, this metric offers a highly customizable way to check if a string column in the dataset matches the given pattern. This functionality is essential for scenarios requiring precise formatting, specific keyword inclusion, or adherence to particular linguistic structures.

In practical applications, the Regex Match enrichment can be instrumental in validating data entries, ensuring compliance with formatting standards, or verifying the presence of required terms or codes within AI-generated content. Whether it's checking for email addresses, phone numbers, specific terminologies, or coding patterns, this metric provides a straightforward and efficient method for assessing the conformance of text to predefined patterns.

For developers and data analysts, the Regex Match enrichment is a powerful tool for automating the quality control process of textual data. It enables the swift identification of entries that fail to meet the necessary criteria, thereby streamlining the process of refining and improving the dataset or AI-generated content. This enrichment not only saves time but also enhances the reliability of data-driven applications by ensuring that the text outputs adhere closely to the desired specifications or standards.

Implementing the Regex Match enrichment into the evaluation and production framework of AI systems allows for a level of precision in text analysis that is crucial for applications demanding high accuracy and specificity. This metric is invaluable for maintaining the integrity and usefulness of textual data, making it a key component in the toolkit of anyone working with AI-generated content or large text datasets.

  • To enable, set the enrichment parameter to regex_match.
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name='Regex - only digits',
          enrichment='regex_match',
          columns=['question', 'response'],
          config = {
              'regex' : r'^\d+$',
          }
      ),
    ]
)

The above example will lead to the generation of a new column:

  • FDL Regex - only digits (category): Match or No Match, depending on whether the regex specified in the config matches the string.
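
The categorization can be reproduced with a standard regular-expression match. A minimal illustration with Python's re module follows; regex_category is a hypothetical helper, not part of the Fiddler client.

import re

pattern = re.compile(r'^\d+$')  # the regex from the config above

def regex_category(text: str) -> str:
    # Mirrors the Match / No Match categories of the enrichment output
    return 'Match' if pattern.match(text) else 'No Match'

print(regex_category('12345'))  # Match
print(regex_category('12a45'))  # No Match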

Topic (beta)

The Topic enrichment leverages the capabilities of zero-shot classifier models to categorize textual inputs into a predefined list of topics, even without having been explicitly trained on those topics. This approach to text classification is known as zero-shot learning, a groundbreaking method in natural language processing (NLP) that allows models to intelligently classify text into categories they haven't encountered during training. It's particularly useful for applications requiring the ability to understand and organize content dynamically across a broad range of subjects or themes.

By utilizing zero-shot classification, the Topic enrichment provides a flexible and powerful tool for automatically sorting and labeling text according to relevant topics. This is invaluable for content management systems, recommendation engines, and any application needing to quickly and accurately understand the thematic content of large volumes of text.

The enrichment works by evaluating the semantic similarity between the textual input and potential topic labels, assigning the most appropriate topic based on the content. This process enables the handling of diverse and evolving content types without the need for continual retraining or manual classification, significantly reducing the effort and complexity involved in content categorization.

Implementing the Topic enrichment into your data processing workflow can dramatically enhance the organization and accessibility of textual content, making it easier to deliver relevant, targeted information to users or to analyze content themes at scale. This enrichment taps into the advanced capabilities of zero-shot classification to provide a nuanced, efficient, and adaptable tool for text categorization, essential for anyone working with diverse and dynamic textual datasets.

Requirements

fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
          name='Topics',
          enrichment='topic_model',
          columns=['response'],
          config={'topics': ['politics', 'economy', 'astronomy']},
      ),
    ]
)

The above example leads to the creation of two columns:

  • FDL Topics (response) topic_model_scores (list of float): the probability of the given column's text for each of the topics specified in the Enrichment config, in the same order as topics. Each value is between 0 and 1. The values do not sum to 1, because each classification is performed independently of the other topics.
  • FDL Topics (response) max_score_topic (string): the topic with the maximum score from the list of topic names specified in the Enrichment config.
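
Fiddler's internal classifier is not specified here, but the scoring behavior can be illustrated with a public zero-shot model; the model name below is an assumption for illustration, and multi_label=True mirrors the independent, non-normalized scores described above.

from transformers import pipeline

# A public zero-shot classifier, used purely for illustration
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')

result = classifier(
    'The central bank raised interest rates again this quarter.',
    candidate_labels=['politics', 'economy', 'astronomy'],
    multi_label=True,  # each label is scored independently; scores need not sum to 1
)
print(result['scores'])     # analogous to topic_model_scores (sorted by score here)
print(result['labels'][0])  # analogous to max_score_topic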

Banned Keyword Detector (beta)

The Banned Keyword Detector enrichment is designed to scrutinize textual inputs for the presence of specified terms, particularly focusing on identifying content that includes potentially undesirable or restricted keywords. This enrichment operates based on a list of terms defined in its configuration, making it highly adaptable to various content moderation, compliance, and content filtering needs.

By specifying a list of terms to be flagged, the Banned Keyword Detector provides a straightforward yet powerful mechanism for automatically scanning and flagging content that contains certain keywords. This capability is crucial for platforms seeking to maintain high standards of content quality, adhere to regulatory requirements, or ensure community guidelines are followed. It's particularly valuable in environments where content is user-generated.

  • To enable, set the enrichment parameter to banned_keywords and specify a list of terms in the banned_keywords config parameter.
fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    display_name='llm_model',
    model_task=fdl.core_objects.ModelTask.LLM,
    custom_features = [
      fdl.Enrichment(
            name='Banned KW',
            enrichment='banned_keywords',
            columns=['prompt', 'response'],
            config={'output_column_name': 'contains_banned_kw', 'banned_keywords':['nike', 'adidas', 'puma'],},
        ),
    ]
)

The above example leads to the creation of two columns:

  • FDL Banned KW (prompt) contains_banned_kw (bool): indicates whether the value of the prompt column contains one of the specified banned keywords
  • FDL Banned KW (response) contains_banned_kw (bool): indicates whether the value of the response column contains one of the specified banned keywords
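
The flag can be thought of as a simple keyword scan. A minimal sketch of such a check is shown below; the exact matching semantics (case handling, word boundaries) are assumptions rather than documented behavior.

banned_keywords = ['nike', 'adidas', 'puma']

def contains_banned_kw(text: str) -> bool:
    # Assumed semantics: case-insensitive substring match
    lowered = text.lower()
    return any(keyword in lowered for keyword in banned_keywords)

print(contains_banned_kw('I just bought new Nike shoes'))  # True
print(contains_banned_kw('I prefer running barefoot'))     # False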


fdl.BaselineType

Enum | Description
fdl.BaselineType.PRE_PRODUCTION | Used for baselines on uploaded datasets. They can be training or validation datasets.
fdl.BaselineType.STATIC_PRODUCTION | Used to describe a baseline on production events of a model between a specific time range.
fdl.BaselineType.ROLLING_PRODUCTION | Used to describe a baseline on production events of a model relative to the current time.
from fiddler import BaselineType

PROJECT_NAME = 'example_project'
BASELINE_NAME = 'example_preproduction'
DATASET_NAME = 'example_validation'
MODEL_NAME = 'example_model'

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id=BASELINE_NAME,
  type=BaselineType.PRE_PRODUCTION,
  dataset_id=DATASET_NAME,
)
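
The other baseline types take time-related arguments in addition to the fields shown above. The sketch below adds a rolling-production baseline; the offset and window_size parameters and the fdl.WindowSize enum reflect the author's understanding of the Fiddler client and should be verified against your client version.

from fiddler import BaselineType, WindowSize

client.add_baseline(
  project_id=PROJECT_NAME,
  model_id=MODEL_NAME,
  baseline_id='example_rolling',
  type=BaselineType.ROLLING_PRODUCTION,
  offset=WindowSize.MONTH,      # how far back the window sits (assumed parameter)
  window_size=WindowSize.WEEK,  # width of the rolling window (assumed parameter)
)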


DataSource

fdl.DatasetDataSource

Input Parameter | Type | Default | Description
dataset_id | str | None | The unique identifier for the dataset.
source | Optional[str] | None | The source file name. If not specified, all sources from the dataset are used.
num_samples | Optional[int] | None | Number of samples to select for computation.
DATASET_ID = 'example_dataset'

data_source = fdl.DatasetDataSource(
    dataset_id=DATASET_ID,
    source='baseline.csv',
    num_samples=500,
)

fdl.SqlSliceQueryDataSource

Input Parameter | Type | Default | Description
query | str | None | Slice query defining the data to use for computation.
num_samples | Optional[int] | None | Number of samples to select for computation.
DATASET_ID = 'example_dataset'
MODEL_ID = 'example_model'

query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
data_source = fdl.SqlSliceQueryDataSource(
    query=query,
    num_samples=500,
)

fdl.RowDataSource

Input Parameter | Type | Default | Description
row | dict | None | Single row to explain, as a dictionary.
import pandas as pd

df = pd.read_csv('example_dataset.csv')
row = df.to_dict(orient='records')[0]

data_source = fdl.RowDataSource(
    row=row,
)

fdl.EventIdDataSource

Input Parameter | Type | Default | Description
event_id | str | None | Single event id corresponding to the row to explain.
dataset_name | str | None | The dataset name if the event is located in the dataset table, or 'production' if the event is part of the production data.
DATASET_ID = 'example_dataset'

# In Dataset table
data_source = fdl.EventIdDataSource(
    event_id='xGhys7-83HgdtsoiuYTa872',
    dataset_name=DATASET_ID,
)

# In Production table
data_source = fdl.EventIdDataSource(
    event_id='xGhys7-83HgdtsoiuYTa872',
    dataset_name='production',
)


fdl.Webhook

Input Parameter | Type | Default | Description
name | str | None | A unique name for the webhook.
url | str | None | The webhook url used for sending notification messages.
provider | str | None | The platform that provides webhooks functionality. Only 'SLACK' is supported.
uuid | str | None | A unique identifier for the webhook.
organization_name | str | None | The name of the organization in which the webhook is created.
webhook = fdl.Webhook(
    name='data_integrity_violations_channel',
    url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
    provider='SLACK',
    uuid='74a4fdcf-34eb-4dc3-9a79-e48e14cca686',
    organization_name='some_org',
)

Example Response:

Webhook(name='data_integrity_violations_channel',
    url='https://hooks.slack.com/services/T9EAVLUQ5/P982J/G8ISUczk37hxQ15C28d',
    provider='SLACK',
    uuid='74a4fdcf-34eb-4dc3-9a79-e48e14cca686',
    organization_name='some_org',
)