Alerts
Alert Rule
Alert rule object contains the below fields.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the alert rule. |
name | str | - | Unique name of the alert rule. |
model | Model | - | Details of the model. |
project | Project | - | Details of the project to which the dataset belongs. |
baseline | Optional[Baseline] | None | Details of the baseline for the alert. |
segment | Optional[Segment] | None | Details of segment for the alert. |
priority | Union[str, Priority] | - | To set the priority for the alert rule. Select from: 1. Priority.LOW 2. Priority.MEDIUM 3. Priority.HIGH. |
compare_to | Union[str, CompareTo] | - | Select from the two: 1. CompareTo.RAW_VALUE 2. CompareTo.TIME_PERIOD |
metric_id | Union[str, UUID] | - | UUID or string name of the metric to alert on. |
critical_threshold | float | - | Threshold value; crossing it triggers a critical-severity alert. |
condition | Union[str, AlertCondition] | - | Select from: 1. AlertCondition.LESSER 2. AlertCondition.GREATER |
bin_size | Union[str, BinSize] | - | Size of the bin for alert rule. |
columns | Optional[List[str]] | None | List of column names on which alert rule is to be created. It can take ['ANY'] to check for all columns. |
baseline_id | Optional[UUID] | None | UUID of the baseline for the alert. |
segment_id | Optional[UUID] | None | UUID of segment for the alert |
compare_bin_delta | Optional[int] | None | Number of bins (in units of bin_size) back in time to compare the metric against. |
warning_threshold | Optional[float] | None | Threshold value; crossing it triggers a warning-severity alert. |
created_at | datetime | - | Time at which alert rule was created. |
updated_at | datetime | - | Latest time at which alert rule was updated. |
constructor()
Initialise a new alert rule instance.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Unique name of the alert rule. |
model_id | UUID | - | Unique identifier for the model. |
metric_id | Union[str, UUID] | - | UUID or string name of the metric to alert on. |
columns | Optional[List[str]] | None | List of column names on which alert rule is to be created. It can take ['ANY'] to check for all columns. |
baseline_id | Optional[UUID] | None | UUID of the baseline for the alert. |
segment_id | Optional[UUID] | None | UUID of the segment for the alert. |
priority | Union[str, Priority] | - | To set the priority for the alert rule. Select from: 1. Priority.LOW 2. Priority.MEDIUM 3. Priority.HIGH. |
compare_to | Union[str, CompareTo] | - | Select from the two: 1. CompareTo.RAW_VALUE (absolute alert) 2. CompareTo.TIME_PERIOD (relative alert) |
compare_bin_delta | Optional[int] | None | Compare the metric to a previous time period in units of bin_size. |
warning_threshold | Optional[float] | None | Threshold value; crossing it triggers a warning-severity alert. |
critical_threshold | float | - | Threshold value; crossing it triggers a critical-severity alert. |
condition | Union[str, AlertCondition] | - | Select from: 1. AlertCondition.LESSER 2. AlertCondition.GREATER |
bin_size | Union[str, BinSize] | - | Size of the bin for alert rule. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
SEGMENT_NAME = 'test_segment'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segment = fdl.Segment.from_name(name=SEGMENT_NAME, model_id=model.id)
alert_rule = fdl.AlertRule(
name='alert_name',
model_id=model.id,
metric_id='drift',
priority=fdl.Priority.HIGH,
compare_to=fdl.CompareTo.TIME_PERIOD,
compare_bin_delta=1,
condition=fdl.AlertCondition.GREATER,
bin_size=fdl.BinSize.HOUR,
critical_threshold=0.5,
warning_threshold=0.1,
columns=['gender', 'creditscore'],
segment_id=segment.id,
)
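The example above defines a relative alert (CompareTo.TIME_PERIOD). For contrast, below is a minimal sketch of an absolute alert (CompareTo.RAW_VALUE) on the same model; the thresholds shown are illustrative.
# Absolute alert: the metric value is compared directly against the thresholds.
absolute_alert_rule = fdl.AlertRule(
    name='absolute_alert_name',
    model_id=model.id,
    metric_id='drift',
    priority=fdl.Priority.MEDIUM,
    compare_to=fdl.CompareTo.RAW_VALUE,
    condition=fdl.AlertCondition.GREATER,
    bin_size=fdl.BinSize.HOUR,
    critical_threshold=0.7,
    warning_threshold=0.5,
    columns=['ANY'],  # documented sentinel: evaluate the rule across all columns
)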
create()
Create a new alert rule on Fiddler Platform.
Parameters
No
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
SEGMENT_NAME = 'test_segment'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segment = fdl.Segment.from_name(name=SEGMENT_NAME, model_id=model.id)
alert_rule = fdl.AlertRule(
name='alert_name',
model_id=model.id,
metric_id='drift',
priority=fdl.Priority.HIGH,
compare_to=fdl.CompareTo.TIME_PERIOD,
compare_bin_delta=1,
condition=fdl.AlertCondition.GREATER,
bin_size=fdl.BinSize.HOUR,
critical_threshold=0.5,
warning_threshold=0.1,
columns=['gender', 'creditscore'],
segment_id=segment.id,
).create()
Returns
Return Type | Description |
---|---|
AlertRule | Alert rule instance. |
get()
Get a single alert rule.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique identifier for the alert rule. |
Usage
ALERT_RULE_ID='ed8f18e6-c319-4374-8884-71126a6bab85'
alert = fdl.AlertRule.get(id_=ALERT_RULE_ID)
Returns
Return Type | Description |
---|---|
AlertRule | Alert rule instance. |
Raises
Error code | Issue |
---|---|
NotFound | Alert rule with given identifier not found. |
Forbidden | Current user may not have permission to view details of alert rule. |
list()
Get a list of all alert rules in the organization.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_id | Optional[UUID] | None | Unique identifier for the model to which alert rule belongs. |
project_id | Optional[UUID] | None | Unique identifier for the project to which alert rule belongs |
metric_id | Union[str, UUID] | - | UUID or string name of the alert metric. |
columns | Optional[List[str]] | None | List of column names on which alert rule is to be created. It can take ['ANY'] to check for all columns. |
baseline_id | Optional[UUID] | None | UUID of the baseline for the alert. |
ordering | Optional[List[str]] | None | List of Alert Rule fields to order by, e.g. ['alert_time_bucket'], or ['-alert_time_bucket'] for descending order. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts = fdl.AlertRule.list(model_id=model.id)
Returns
Return Type | Description |
---|---|
Iterator[AlertRule] | Iterable of alert rule instances. |
delete()
Delete an existing alert rule.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the alert rule. |
Usage
ALERT_NAME = "YOUR_ALERT_NAME"
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts_list = fdl.AlertRule.list(model_id=model.id)
alert_rule = None
for alert in alerts_list:
if ALERT_NAME == alert.name:
alert_rule = alert
alert_rule.delete()
Returns
No
Raises
Error code | Issue |
---|---|
NotFound | Alert rule with given identifier not found. |
Forbidden | Current user may not have permission to view details of alert rule. |
enable_notifications()
Enable notifications for an alert rule.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the alert rule. |
Usage
ALERT_NAME = "YOUR_ALERT_NAME"
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts_list = fdl.AlertRule.list(model_id=model.id)
alert_rule = None
for alert in alerts_list:
if ALERT_NAME == alert.name:
alert_rule = alert
alert_rule.enable_notifications()
Returns
None
Raises
Error code | Issue |
---|---|
NotFound | Alert rule with given identifier not found. |
Forbidden | Current user may not have permission to view details of alert rule. |
disable_notifications()
Disable notifications for an alert rule.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the alert rule. |
Usage
ALERT_NAME = "YOUR_ALERT_NAME"
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts_list = fdl.AlertRule.list(model_id=model.id)
alert_rule = None
for alert in alerts_list:
if ALERT_NAME == alert.name:
alert_rule = alert
alert_rule.disable_notifications()
Returns
None
Raises
Error code | Issue |
---|---|
NotFound | Alert rule with given identifier not found. |
Forbidden | Current user may not have permission to view details of alert rule. |
AlertNotifications
Alert notifications for an alert rule.
Parameter | Type | Default | Description |
---|---|---|---|
emails | Optional[List[str]] | None | List of emails to send notification to. |
pagerduty_services | Optional[List[str]] | None | List of pagerduty services to trigger the alert to. |
pagerduty_severity | Optional[str] | None | Severity of pagerduty. |
webhooks | Optional[List[UUID]] | None | List of webhook UUIDs. |
set_notification_config()
Set notifications for an alert rule.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
emails | Optional[List[str]] | None | List of emails to send notification to. |
pagerduty_services | Optional[List[str]] | None | List of pagerduty services to trigger the alert to. |
pagerduty_severity | Optional[str] | None | Severity of pagerduty. |
webhooks | Optional[List[UUID]] | None | List of webhook UUIDs. |
Usage
ALERT_NAME = "YOUR_ALERT_NAME"
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts_list = fdl.AlertRule.list(model_id=model.id)
alert_rule = None
for alert in alerts_list:
if ALERT_NAME == alert.name:
alert_rule = alert
notifications = alert_rule.set_notification_config(
emails=['[email protected]', '[email protected]'],
webhooks=[
'e20bf4cc-d2cf-4540-baef-d96913b14f1b',
'6e796fda-0111-4a72-82cd-f0f219e903e1',
],
)
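PagerDuty routing can be configured the same way; the service name and severity value below are illustrative placeholders (pagerduty_severity is assumed to be a standard PagerDuty severity string, and it must be accompanied by pagerduty_services, as noted below).
notifications = alert_rule.set_notification_config(
    pagerduty_services=['YOUR_PAGERDUTY_SERVICE_NAME'],  # hypothetical service name
    pagerduty_severity='critical',  # assumed standard PagerDuty severity string
)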
Returns
Return Type | Description |
---|---|
AlertNotifications | Alert notifications for an alert rule. |
If pagerduty_severity is passed without pagerduty_services, then pagerduty_severity is ignored.
Raises
Error code | Issue |
---|---|
BadRequest | All 4 input parameters are empty. |
ValueError | Webhook ID is incorrect. |
get_notification_config()
Get notification configuration for an alert rule.
Parameters
None
Usage
ALERT_NAME = "YOUR_ALERT_NAME"
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts_list = fdl.AlertRule.list(model_id=model.id)
alert_rule = None
for alert in alerts_list:
if ALERT_NAME == alert.name:
alert_rule = alert
notification_config = alert_rule.get_notification_config()
Returns
Return Type | Description |
---|---|
Notification Config | Alert notifications for an alert rule. |
Raises
Error code | Issue |
---|---|
BadRequest | All 4 input parameters are empty. |
ValueError | Webhook ID is incorrect. |
TriggeredAlert
Alert records triggered for an alert rule.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the triggered alert rule. |
alert_rule_id | UUID | - | Unique identifier of the alert rule that triggered this alert. |
alert_run_start_time | int | - | Timestamp of alert rule evaluation in epoch. |
alert_time_bucket | int | - | Timestamp pointing to the start of the time bucket in epoch. |
alert_value | float | - | Value of the metric for alert_time_bucket. |
baseline_time_bucket | Optional[int] | None | Timestamp pointing to the start of the baseline time bucket in epoch, only if alert rule is of 'time period' based comparison. |
baseline_value | Optional[float] | None | Value of the metric for baseline_time_bucket. |
is_alert | bool | - | Boolean to indicate if alert was supposed to be triggered. |
severity | str | - | Severity of alert represented by SeverityEnum, calculated based on value of metric and alert rule thresholds. |
failure_reason | str | - | String message if there was a failure sending notification. |
message | str | - | String message sent as a part of email notification. |
feature_name | Optional[str] | None | Name of feature for which alert was triggered. |
alert_record_main_version | int | - | Main version of triggered alert record in int, incremented when the value of severity changes. |
alert_record_sub_version | int | - | Sub version of triggered alert record in int, incremented when another alert with same severity as before is triggered. |
created_at | datetime | - | Time at which trigger alert rule was created. |
updated_at | datetime | - | Latest time at which trigger alert rule was updated. |
list()
List alert records triggered for an alert rule.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
alert_rule_id | UUID | - | Unique identifier of the alert rule whose triggered alerts are to be listed. |
start_time | Optional[datetime] | None | Start time to filter trigger alerts in yyyy-MM-dd format, inclusive. |
end_time | Optional[datetime] | None | End time to filter trigger alerts in yyyy-MM-dd format, inclusive. |
ordering | Optional[List[str]] | None | List of Alert Rule fields to order by, e.g. ['alert_time_bucket'], or ['-alert_time_bucket'] for descending order. |
Usage
ALERT_NAME = "YOUR_ALERT_NAME"
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
alerts_list = fdl.AlertRule.list(model_id=model.id)
alert_rule = None
for alert in alerts_list:
if ALERT_NAME == alert.name:
alert_rule = alert
alert_records = fdl.AlertRecord.list(
alert_rule_id=alert_rule.id,
start_time=datetime(2023, 12, 18),
end_time=datetime(2023, 12, 25),
)
Returns
Return Type | Description |
---|---|
Iterator[AlertRecord] | Iterable of triggered alert rule instances for an alert rule. |
Baselines
Baseline datasets are used for making comparisons with production data.
A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.
Baseline
Baseline object contains the below fields.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the baseline. |
name | str | - | Baseline name. |
type_ | str | - | Type of baseline. Type can be static (pre-production or production) or rolling (production). |
start_time | Optional[int] | None | Epoch to be used as start time for STATIC baseline. |
end_time | Optional[int] | None | Epoch to be used as end time for STATIC baseline. |
offset | Optional[int] | None | Offset in seconds relative to current time to be used for ROLLING baseline. |
window_size | Optional[int] | None | Span of window in seconds to be used for ROLLING baseline. |
row_count | Optional[int] | None | Number of rows in baseline. |
model | Model | - | Details of the model. |
project | Project | - | Details of the project to which the baseline belongs. |
dataset | Dataset | - | Details of the dataset from which baseline is derived. |
created_at | datetime | - | Time at which baseline was created. |
updated_at | datetime | - | Latest time at which baseline was updated. |
constructor()
Initialize a new baseline instance.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Unique name of the baseline. |
model_id | UUID | - | Unique identifier for the model to add baseline to. |
environment | EnvType | - | Type of environment. Can either be PRE_PRODUCTION or PRODUCTION. |
type_ | str | - | Type of baseline. Type can be static (pre-production or production) or rolling (production). |
dataset_id | Optional[UUID] | None | Unique identifier for the dataset on which the baseline is created. |
start_time | Optional[int] | None | Epoch to be used as start time for STATIC baseline. |
end_time | Optional[int] | None | Epoch to be used as end time for STATIC baseline. |
offset_delta | Optional[int] | None | Number of times of WindowBinSize to be used for ROLLING baseline. offset = offset_delta * window_bin_size |
window_bin_size | Optional[str] | None | Span of window in seconds to be used for ROLLING baseline using WindowBinSize |
Usage
BASELINE_NAME = 'YOUR_BASELINE_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
baseline = fdl.Baseline(
name=BASELINE_NAME,
model_id=model.id,
environment=fdl.EnvType.PRE_PRODUCTION,
dataset_id=dataset.id,
type_=fdl.BaselineType.STATIC,
)
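The example above defines a STATIC baseline from a pre-production dataset. A ROLLING baseline over production data is sketched below; the enum members used for type_ and window_bin_size are illustrative assumptions.
rolling_baseline = fdl.Baseline(
    name='YOUR_ROLLING_BASELINE_NAME',
    model_id=model.id,
    environment=fdl.EnvType.PRODUCTION,
    type_=fdl.BaselineType.ROLLING,        # assumed enum member for a rolling baseline
    window_bin_size=fdl.WindowBinSize.DAY, # span of the rolling window (assumed enum member)
    offset_delta=1,                        # offset = offset_delta * window_bin_size
)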
create()
Adds a baseline to Fiddler.
Parameters
No
Usage
BASELINE_NAME = 'YOUR_BASELINE_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
baseline = fdl.Baseline(
name=BASELINE_NAME,
model_id=model.id,
environment=fdl.EnvType.PRE_PRODUCTION,
dataset_id=dataset.id,
type_=fdl.BaselineType.STATIC,
).create()
Returns
Return Type | Description |
---|---|
Baseline | Baseline instance. |
Raises
Error code | Issue |
---|---|
Conflict | Baseline with same name may exist in project. |
NotFound | Given dataset may not exist for the input model. |
ValueError | Validation failures like wrong window size, start_time, end_time, etc. |
get()
Get baseline from Fiddler Platform based on UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique identifier for the baseline. |
Usage
BASELINE_ID = 'af05646f-0cef-4638-84c9-0d195df2575d'
baseline = fdl.Baseline.get(id_=BASELINE_ID)
Returns
Return Type | Description |
---|---|
Baseline | Baseline instance. |
Raises
Error code | Issue |
---|---|
NotFound | Baseline with given identifier not found. |
Forbidden | Current user may not have permission to view details of baseline. |
from_name()
Get baseline from Fiddler Platform based on name.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Name of the baseline. |
model_id | UUID | str | - | Unique identifier for the model. |
Usage
BASELINE_NAME = 'YOUR_BASELINE_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
baseline = fdl.Baseline.from_name(
name=BASELINE_NAME,
model_id=model.id
)
Returns
Return Type | Description |
---|---|
Baseline | Baseline instance. |
Raises
Error code | Issue |
---|---|
NotFound | Baseline with given identifier not found. |
Forbidden | Current user may not have permission to view details of baseline. |
list()
List all baselines accessible to user.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_id | UUID | - | UUID of the model associated with baseline. |
Usage
MODEL_ID = '4531bfd9-2ca2-4a7b-bb5a-136c8da09ca2'
baselines = fdl.Baseline.list(model_id=MODEL_ID)
Returns
Return Type | Description |
---|---|
Iterable[Baseline] | Iterable of all baseline objects. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of baseline. |
delete()
Deletes a baseline.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the baseline. |
Usage
BASELINE_NAME = 'YOUR_BASELINE_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
baseline = fdl.Baseline.from_name(name=BASELINE_NAME,model_id=model.id)
baseline.delete()
Returns
None
Raises
Error code | Issue |
---|---|
NotFound | Baseline with given identifier not found. |
Forbidden | Current user may not have permission to delete baseline. |
CustomMetrics
Customized metrics for your specific use case.
CustomMetric
CustomMetric object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the custom metric. |
name | str | - | Custom metric name. |
model_id | UUID | - | UUID of the model in which the custom metric is being added. |
definition | str | - | Definition of the custom metric. |
description | Optional[str] | None | Description of the custom metric. |
created_at | datetime | - | Time of creation of custom metric. |
constructor()
Initialise a new custom metric.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Custom metric name. |
model_id | UUID | - | UUID of the model in which the custom metric is being added. |
definition | str | - | Definition of the custom metric. |
description | Optional[str] | None | Description of the custom metric. |
Usage
METRIC_NAME = 'YOUR_CUSTOM_METRIC_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
metric = fdl.CustomMetric(
name=METRIC_NAME,
model_id=model.id,
    definition="average(if(\"spend_amount\">1000, \"spend_amount\", 0))", # Use Fiddler Query Language (FQL) to define your custom metrics
description='Get average spend for users spending over $1000',
)
list()
List all custom metrics of a model on Fiddler Platform, based on the model UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_id | UUID | - | UUID of the model associated with the custom metrics. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
METRICS = fdl.CustomMetric.list(model_id=model.id)
Returns
Return Type | Description |
---|---|
Iterable[CustomMetric] | Iterable of all custom metric objects. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of custom metric. |
from_name()
Get CustomMetric from Fiddler Platform based on name and model UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Name of the custom metric. |
model_id | UUID | str | - | Unique identifier for the model. |
Usage
METRIC_NAME = 'YOUR_CUSTOM_METRIC_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
metric = fdl.CustomMetric.from_name(
name=METRIC_NAME,
model_id=model.id
)
Returns
Return Type | Description |
---|---|
CustomMetric | Custom Metric instance. |
Raises
Error code | Issue |
---|---|
NotFound | Custom metric with given identifier not found. |
Forbidden | Current user may not have permission to view details of custom metric. |
create()
Creates a custom metric for a model on Fiddler Platform.
Parameters
None
Usage
METRIC_NAME = 'YOUR_CUSTOM_METRIC_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
METRIC = fdl.CustomMetric(
name=METRIC_NAME,
model_id=model.id,
    definition="average(if(\"spend_amount\">1000, \"spend_amount\", 0))", # Use Fiddler Query Language (FQL) to define your custom metrics
description='Get average spend for users spending over $1000',
).create()
Returns
Return Type | Description |
---|---|
CustomMetric | Custom Metric instance. |
Raises
Error code | Issue |
---|---|
Conflict | Custom metric with same name may exist in project. |
BadRequest | Invalid definition. |
NotFound | Given model may not exist. |
delete()
Delete a custom metric.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the custom metric. |
Usage
METRIC_NAME = 'YOUR_CUSTOM_METRIC_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
metric = fdl.CustomMetric.from_name(name=METRIC_NAME, model_id=model.id)
metric.delete()
Returns
No
Raises
Error code | Issue |
---|---|
NotFound | Custom metric with given identifier not found. |
Forbidden | Current user may not have permission to delete custom metric. |
Datasets
Datasets (or baseline datasets) are used for making comparisons with production data.
A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.
For more information, see Uploading a Baseline Dataset.
For guidance on how to design a baseline dataset, see Designing a Baseline Dataset.
Dataset
Dataset object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the dataset. |
name | str | - | Dataset name. |
row_count | Optional[int] | None | Number of rows in dataset. |
model | Model | - | Details of the model. |
project | Project | - | Details of the project to which the dataset belongs. |
organization | Organization | - | Details of the organization to which the dataset belongs. |
get()
Get dataset from Fiddler Platform based on UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique identifier for the dataset. |
Usage
DATASET_ID = 'ba6ec4e4-7188-44c5-ba84-c2cb22b4bb00'
dataset = fdl.Dataset.get(id_=DATASET_ID)
Returns
Return Type | Description |
---|---|
Dataset | Dataset instance. |
Raises
Error code | Issue |
---|---|
NotFound | Dataset with given identifier not found. |
Forbidden | Current user may not have permission to view details of dataset. |
from_name()
Get dataset from Fiddler Platform based on name and model UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Name of the dataset. |
model_id | UUID | str | - | Unique identifier for the model. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
Returns
Return Type | Description |
---|---|
Dataset | Dataset instance. |
Raises
Error code | Issue |
---|---|
NotFound | Dataset with the given name not found for the model. |
Forbidden | Current user may not have permission to view details of dataset. |
list()
Get a list of all datasets associated to a model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_id | UUID | - | UUID of the model associated with baseline. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
datasets = fdl.Dataset.list(model_id=model.id)
Returns
Return Type | Description |
---|---|
Iterable[Dataset] | Iterable of all dataset objects. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of dataset. |
Jobs
Get job details.
Job
Job object contains the below fields.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the job. |
name | str | - | Name of the job. |
status | str | - | Current status of job. |
progress | float | - | Progress of job completion. |
info | dict | - | Dictionary containing resource_type, resource_name, project_name. |
error_message | Optional[str] | None | Message for job failure, if any. |
error_reason | Optional[str] | None | Reason for job failure, if any. |
extras | Optional[dict] | None | Metadata regarding the job. |
get()
Get the job instance using job UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the job. |
verbose | bool | False | Flag to get extras metadata about the tasks executed. |
Usage
JOB_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'
job = fdl.Job.get(id_=JOB_ID)
Returns
Return Type | Description |
---|---|
Job | Single job object for the input params. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of job. |
wait()
Wait for job to complete either with success or failure status.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
interval | Optional[int] | 3 | Interval in seconds between polling for job status. |
timeout | Optional[int] | 1800 | Timeout in seconds for iterator to stop. |
Usage
JOB_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'
job = fdl.Job.get(id_=JOB_ID)
job.wait()
Returns
Return Type | Description |
---|---|
Job | Single job object for the input params. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of job. |
TimeoutError | When the job does not finish within the timeout (1800 seconds by default). |
watch()
Watch job status at given interval and yield job object.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
interval | Optional[int] | 3 | Interval in seconds between polling for job status. |
timeout | Optional[int] | 1800 | Timeout in seconds for iterator to stop. |
Usage
JOB_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'
job = fdl.Job.get(id_=JOB_ID)
job.watch()
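Since watch() yields a job object on every poll, it can be consumed in a loop to report progress; the attribute names used below come from the Job fields listed above, and the interval and timeout values are illustrative.
for job_update in job.watch(interval=30, timeout=3600):
    print(f'Status: {job_update.status}, progress: {job_update.progress}')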
Returns
Return Type | Description |
---|---|
Iterator[Job] | Iterator of job objects. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of job. |
TimeoutError | When the job does not finish within the timeout (1800 seconds by default). |
Models
A model is a representation of your machine learning model. Each model can be used for monitoring, explainability,
and fairness capabilities.
You do not need to upload your model artifact in order to onboard your model, but doing so will significantly
improve the quality of explanations generated by Fiddler.
Model
Model object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the model. |
name | str | - | Unique name of the model (only alphanumeric and underscores are allowed). |
input_type | ModelInputType | ModelInputType.TABULAR | Input data type used by the model. |
task | ModelTask | ModelTask.NOT_SET | Task the model is designed to address. |
task_params | ModelTaskParams | - | Task parameters given to a particular model. |
schema | ModelSchema | - | Model schema defines the details of each column. |
spec | ModelSpec | - | Model spec defines how model columns are used along with model task. |
description | str | - | Description of the model. |
event_id_col | str | - | Column containing event id. |
event_ts_col | str | - | Column containing event timestamp. |
event_ts_format | str | - | Event time stamp format. |
xai_params | XaiParams | - | Explainability parameters of the model. |
artifact_status | str | - | Artifact Status of the model. |
artifact_files | list[dict] | - | Dictionary containing file details of model artifact. |
is_binary_ranking_model | bool | - | True if model is ModelTask.RANKING and has only 2 target classes. |
created_at | datetime | - | Time at which model was created. |
updated_at | datetime | - | Latest time at which model was updated. |
created_by | User | - | Details of the user who created the model. |
updated_by | User | - | Details of the user who last updated the model. |
project | Project | - | Details of the project to which the model belongs. |
organization | Organization | - | Details of the organization to which the model belongs. |
constructor()
Initialize a new model instance.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Unique name of the model |
project_id | UUID | - | Unique identifier for the project to which model belongs. |
input_type | ModelInputType | ModelInputType.TABULAR | Input data type used by the model. |
task | ModelTask | ModelTask.NOT_SET | Task the model is designed to address. |
schema | ModelSchema | - | Model schema defines the details of each column. |
spec | ModelSpec | - | Model spec defines how model columns are used along with model task. |
task_params | ModelTaskParams | - | Task parameters given to a particular model. |
description | str | - | Description of the model. |
event_id_col | str | - | Column containing event id. |
event_ts_col | str | - | Column containing event timestamp. |
event_ts_format | str | - | Event time stamp format. |
xai_params | XaiParams | - | Explainability parameters of the model. |
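Usage
A minimal sketch of direct construction, assuming a ModelSchema and ModelSpec are already available (here reused from an existing onboarded model); the task value and the new model name are illustrative.
PROJECT_NAME = 'YOUR_PROJECT_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
existing_model = fdl.Model.from_name(name='YOUR_MODEL_NAME', project_id=project.id)
model = fdl.Model(
    name='new_model_name',
    project_id=project.id,
    schema=existing_model.schema,  # reuse an existing model's schema
    spec=existing_model.spec,      # and its spec
    input_type=fdl.ModelInputType.TABULAR,
    task=fdl.ModelTask.BINARY_CLASSIFICATION,
)
model.create()
In most cases, building the instance with from_data() (below) is simpler, since it infers the schema from a sample of your data.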
from_data()
Build a model instance from the given dataframe or file (csv/parquet).
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
source | pd.DataFrame | Path | str | - | Pandas dataframe or path to csv/parquet file |
name | str | - | Unique name of the model |
project_id | UUID | str | - | Unique identifier for the project to which model belongs. |
input_type | ModelInputType | ModelInputType.TABULAR | Input data type used by the model. |
task | ModelTask | ModelTask.NOT_SET | Task the model is designed to address. |
spec | ModelSpec | - | Model spec defines how model columns are used along with model task. |
task_params | ModelTaskParams | - | Task parameters given to a particular model. |
description | Optional[str] | - | Description of the model. |
event_id_col | Optional[str] | - | Column containing event id. |
event_ts_col | Optional[str] | - | Column containing event timestamp. |
event_ts_format | Optional[str] | - | Event time stamp format. |
xai_params | XaiParams | - | Explainability parameters of the model. |
max_cardinality | Optional[int] | None | Max cardinality to detect categorical columns. |
sample_size | Optional[int] | - | No. of samples to use for generating schema. |
Usage
MODEL_NAME = 'example_model'
PROJECT_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'
MODEL_SPEC = {
'custom_features': [],
'decisions': ['Decisions'],
'inputs': [
'CreditScore',
'Geography',
],
'metadata': [],
'outputs': ['probability_churned'],
'schema_version': 1,
'targets': ['Churned'],
}
model = fdl.Model.from_data(
source=<file_path>,
name=MODEL_NAME,
project_id=PROJECT_ID,
spec=fdl.ModelSpec(**MODEL_SPEC),
)
Returns
Return Type | Description |
---|---|
Model | Model instance. |
Notes
- from_data will not create a model entry on Fiddler Platform. Instead, this method only returns a model instance which can be edited; call .create() to onboard the model to Fiddler Platform.
- spec is optional to the from_data method. However, a spec with at least inputs is required for model onboarding.
- Make sure spec is passed to the from_data method if the model requires custom features. This method generates centroids, which are needed for custom feature drift computation.
create()
Onboard a new model to Fiddler Platform
Parameters
No
Usage
model = fdl.Model.from_data(...)
model.create()
Returns
Return Type | Description |
---|---|
Model | Model instance. |
Raises
Error code | Issue |
---|---|
Conflict | Model with same name may exist in project. |
get()
Get model from Fiddler Platform based on UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | str | - | Unique identifier for the model. |
Returns
Return Type | Description |
---|---|
Model | Model instance. |
Raises
Error code | Issue |
---|---|
NotFound | Model with given identifier not found. |
Forbidden | Current user may not have permission to view details of model. |
Usage
MODEL_ID = '4531bfd9-2ca2-4a7b-bb5a-136c8da09ca2'
model = fdl.Model.get(id_=MODEL_ID)
from_name()
Get model from Fiddler Platform based on name and project UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Name of the model. |
project_id | UUID | str | - | Unique identifier for the project. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
Returns
Return Type | Description |
---|---|
Model | Model instance. |
Raises
Error code | Issue |
---|---|
NotFound | Model with the given name not found in the given project. |
Forbidden | Current user may not have permission to view details of model. |
list()
Gets all models of a project.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
project_id | Optional[UUID] | None | Unique UUID of the project to which model is associated. |
Returns
Return Type | Description |
---|---|
Iterable[ModelCompact] | Iterable of ModelCompact objects. |
Errors
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to the given project. |
Usage example
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
models = fdl.Model.list(project_id=project.id)
Notes
Since Model contains a lot of information, the list operation does not return all the fields of a model. Instead, this method returns ModelCompact objects on which .fetch() can be called to get the complete Model instance. For most use-cases, ModelCompact objects are sufficient.
update()
Update an existing model. Only the following fields can be updated; the backend ignores changes to any other field on the instance.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
xai_params | Optional[XaiParams] | None | Explainability parameters of the model. |
description | Optional[str] | None | Description of the model. |
event_id_col | Optional[str] | None | Column containing event id. |
event_ts_col | Optional[str] | None | Column containing event timestamp. |
event_ts_format | Optional[str] | None | Event time stamp format. |
Usage
model.description = 'YOUR_MODEL_DESCRIPTION'
model.update()
Returns
No
Raises
Error code | Issue |
---|---|
BadRequest | If field is not updatable. |
delete()
Delete a model.
Parameters
No
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
job = model.delete()
job.wait()
Returns
Return Type | Description |
---|---|
Job | Async job details for the delete job. |
Notes
Model deletion is an async process, hence a job object is returned on the delete() call. Call job.wait() to wait for the job to complete. If you are planning to create a model with the same name, please wait for the job to complete; otherwise the backend will not allow a new model with the same name.
add_surrogate()
Add a surrogate model to an existing model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
dataset_id | UUID | str | - | Dataset identifier |
deployment_params | Optional[DeploymentParams] | - | Model deployment parameters. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
DEPLOYMENT_PARAMS = {'memory': 1024, 'cpu': 1000}
model.add_surrogate(
dataset_id=dataset.id,
deployment_params=fdl.DeploymentParams(**DEPLOYMENT_PARAMS)
)
Returns
Return Type | Description |
---|---|
Job | Async job details for the add surrogate job. |
Raises
Error code | Issue |
---|---|
BadRequest | Invalid deployment parameter is passed |
update_surrogate()
Update the surrogate model of an existing model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
dataset_id | UUID | str | - | Dataset identifier |
deployment_params | Optional[DeploymentParams] | None | Model deployment parameters. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
DEPLOYMENT_PARAMS = {'memory': 1024, 'cpu': 1000}
model.update_surrogate(
dataset_id=dataset.id,
deployment_params=fdl.DeploymentParams(**DEPLOYMENT_PARAMS)
)
Returns
Return Type | Description |
---|---|
Job | Async job details for the update surrogate job. |
add_artifact()
Add artifact files to existing model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_dir | str | - | Path to directory containing artifacts for upload. |
deployment_params | Optional[DeploymentParams] | None | Model deployment parameters. |
Usage
MODEL_DIR = 'PATH_TO_MODEL_DIRECTORY'
DEPLOYMENT_PARAMS = {'memory': 1024, 'cpu': 1000}
model.add_artifact(
model_dir=MODEL_DIR,
deployment_params=fdl.DeploymentParams(**DEPLOYMENT_PARAMS)
)
Returns
Return Type | Description |
---|---|
Job | Async job details for the add artifact job. |
update_artifact()
Update existing artifact files in a model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_dir | str | - | Path to directory containing artifacts for upload. |
deployment_params | Optional[DeploymentParams] | None | Model deployment parameters. |
Usage
MODEL_DIR = 'PATH_TO_MODEL_DIRECTORY'
DEPLOYMENT_PARAMS = {'memory': 1024, 'cpu': 1000}
model.update_artifact(
model_dir=MODEL_DIR,
deployment_params=fdl.DeploymentParams(**DEPLOYMENT_PARAMS)
)
Returns
Return Type | Description |
---|---|
Job | Async job details for the update artifact job. |
download_artifact()
Download existing artifact files in a model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
output_dir | str | - | Path to directory to download the artifacts. |
Usage
OUTPUT_DIR = 'PATH_TO_TARGET_DIRECTORY'
model.download_artifact(output_dir=OUTPUT_DIR)
Returns
No
Properties
datasets
List all datasets associated with a model.
Parameters
No
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
model.datasets
Returns
Return Type | Description |
---|---|
Iterable[Dataset] | Iterable of dataset instances. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of model. |
model_deployment
Get the model deployment object associated with the model.
Parameters
No
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
model.model_deployment
Returns
Return Type | Description |
---|---|
Model deployment | Model deployment instance. |
Raises
Error code | Issue |
---|---|
NotFound | Model with given identifier not found. |
Forbidden | Current user may not have permission to view details of model. |
publish()
Publish Pre-production or production events.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
source | Union[list[dict[str, Any]], str, Path, pd.DataFrame] | - | Source can be: 1. Path or str path: path for data file. 2. list[dict]: list of event dicts. max_len=1000. EnvType.PRE_PRODUCTION not supported. 3. dataframe: events dataframe. EnvType.PRE_PRODUCTION not supported. |
environment | EnvType | - | Either EnvType.PRE_PRODUCTION or EnvType.PRODUCTION |
dataset_name | Optional[str] | None | Name of the dataset. Not supported for EnvType.PRODUCTION |
Usage
Publish dataset (pre-production data) from file
# Publish File
FILE_PATH = 'PATH_TO_DATASET_FILE'
job = model.publish(
source=FILE_PATH,
environment=fdl.EnvType.PRE_PRODUCTION,
dataset_name='training_dataset'
)
job.wait()
Publish dataset (pre-production data) from dataframe
df = pd.DataFrame(np.random.randint(0, 100, size=(10, 4)), columns=list('ABCD'))
job = model.publish(
source=df,
environment=fdl.EnvType.PRE_PRODUCTION,
dataset_name='training_dataset'
)
job.wait()
Publish production events from list
# Publish list of events
events = [
{'A': 56, 'B': 68, 'C': 67, 'D': 27},
{'A': 43, 'B': 59, 'C': 64, 'D': 18},
]
event_ids = model.publish(
source=events,
environment=fdl.EnvType.PRODUCTION
)
List of events is internally published as a stream.
This method sends 1000 events at once to backend and gets the event_id for each of the events.
List of events is not a valid source for publishing pre-production data.
Publish production events from file
# Publish File
FILE_PATH = 'PATH_TO_DATASET_FILE'
job = model.publish(
source=FILE_PATH,
environment=fdl.EnvType.PRODUCTION,
)
job.wait()
Publish production events from dataframe
df = pd.DataFrame(np.random.randint(0, 100, size=(10, 4)), columns=list('ABCD'))
event_ids = model.publish(
source=df,
environment=fdl.EnvType.PRODUCTION,
)
Returns
In case of streaming publish
Return Type | Source | Description |
---|---|---|
list[UUID] | Union[list[dict], pd.DataFrame] | List of UUIDs. |
In case of batch publish
Return Type | Source | Description |
---|---|---|
Job | Union[str, Path] | Job object for file uploaded. |
Model Compact
ModelCompact object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the model. |
name | str | - | Unique name of the model |
fetch()
Fetch the model instance from Fiddler Platform.
Parameters
No
Returns
Return Type | Description |
---|---|
Model | Model instance. |
Raises
Error code | Issue |
---|---|
NotFound | Model not found for the given identifier |
Forbidden | Current user may not have permission to view details of model. |
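Usage
A short sketch of fetching full model details from the compact objects returned by Model.list(); the project UUID here is a placeholder.
PROJECT_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'
model_compacts = fdl.Model.list(project_id=PROJECT_ID)
for model_compact in model_compacts:
    model = model_compact.fetch()  # complete Model instance with schema, spec, etc.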
Model deployment
Get model deployment object of a particular model.
Model deployment:
Model deployment object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the model deployment. |
model | Model | - | Details of the model. |
project | Project | - | Details of the project to which the model belongs. |
organization | Organization | - | Details of the organization to which the model belongs. |
artifact_type | ArtifactType | - | Type of the model artifact. |
deployment_type | DeploymentType | - | Type of deployment of the model. |
image_uri | Optional[str] | md-base/python/machine-learning:1.0.1 | Reference to the docker image to create a new runtime to serve the model. Check the available images on the Model Deployment page. |
active | bool | True | Status of the deployment. |
replicas | Optional[str] | 1 | The number of replicas running the model. Minimum value: 1. Maximum value: 10. Default value: 1. |
cpu | Optional[str] | 100 | The amount of CPU (milli cpus) reserved per replica. Minimum value: 10. Maximum value: 4000 (4 vCPUs). Default value: 100. |
memory | Optional[str] | 256 | The amount of memory (mebibytes) reserved per replica. Minimum value: 150. Maximum value: 16384 (16 GiB). Default value: 256. |
created_at | datetime | - | Time at which model deployment was created. |
updated_at | datetime | - | Latest time at which model deployment was updated. |
created_by | User | - | Details of the user who created the model deployment. |
updated_by | User | - | Details of the user who last updated the model deployment. |
Update model deployment
Update an existing model deployment.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
active | Optional[bool] | True | Status of the deployment. |
replicas | Optional[str] | 1 | The number of replicas running the model. Minimum value: 1. Maximum value: 10. Default value: 1. |
cpu | Optional[str] | 100 | The amount of CPU (milli cpus) reserved per replica. Minimum value: 10. Maximum value: 4000 (4 vCPUs). Default value: 100. |
memory | Optional[str] | 256 | The amount of memory (mebibytes) reserved per replica. Minimum value: 150. Maximum value: 16384 (16 GiB). Default value: 256. |
Usage
model_deployment.cpu = 300
model_deployment.active = True
model_deployment.update()
Returns
No
Raises
Error code | Issue |
---|---|
BadRequest | If field is not updatable. |
Organizations
The organization in which all the projects and models are present.
Organization:
Organization object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the organization. |
name | str | - | Unique name of the organization. |
created_at | datetime | - | Time at which organization was created. |
updated_at | datetime | - | Latest time at which organization was updated. |
Projects
Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).
A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).
Project
Project object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | None | Unique identifier for the project. |
name | str | None | Unique name of the project. |
created_at | datetime | None | Time at which project was created. |
updated_at | datetime | None | Latest time at which project was updated. |
created_by | User | None | Details of the user who created the project. |
updated_by | User | None | Details of the user who last updated the project. |
organization | Organization | None | Details of the organization to which the project belongs. |
create()
Creates a project using the specified name.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | None | Unique name of the project. |
Usage
PROJECT_NAME = 'bank_churn'
project = fdl.Project(name=PROJECT_NAME)
project.create()
Returns
Return Type | Description |
---|---|
Project | Project instance. |
Raises
Error code | Issue |
---|---|
Conflict | Project with same name may exist. |
get()
Get project from Fiddler Platform based on UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | None | Unique identifier for the project. |
Usage
PROJECT_ID = '1531bfd9-2ca2-4a7b-bb5a-136c8da09ca1'
project = fdl.Project.get(id_=PROJECT_ID)
Returns
Return Type | Description |
---|---|
Project | Project instance. |
Raises
Error code | Issue |
---|---|
NotFound | Project with given identifier not found. |
Forbidden | Current user may not have permission to view details of project. |
from_name()
Get project from Fiddler Platform based on name.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | None | Name of the project. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
Returns
Return Type | Description |
---|---|
Project | Project instance. |
Raises
Error code | Issue |
---|---|
NotFound | Project with the given name not found. |
Forbidden | Current user may not have permission to view details of project. |
list()
Gets all projects in an organization.
Parameters
No
Returns
Return Type | Description |
---|---|
Iterable[Project] | Iterable of project objects. |
Errors
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to the given project. |
Usage example
projects = fdl.Project.list()
delete()
Delete a project.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | None | Unique UUID of the project. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
project.delete()
Returns
None
Properties
models
List all models associated with a project.
Parameters
No
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
project.models
Returns
Return Type | Description |
---|---|
Iterable[Model] | Iterable of model objects. |
Raises
Error code | Issue |
---|---|
NotFound | Project with given identifier not found. |
Forbidden | Current user may not have permission to view details of project. |
Segments
Fiddler offers the ability to segment your data based on a custom condition.
Segment
Segment object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the segment. |
name | str | - | Segment name. |
model_id | UUID | - | UUID of the model to which segment belongs. |
definition | str | - | Definition of the segment. |
description | Optional[str] | None | Description of the segment. |
created_at | datetime | - | Time of creation of segment. |
constructor()
Initialise a new segment.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Segment name. |
model_id | UUID | - | UUID of the model to which segment belongs. |
definition | str | - | Definition of the segment. |
description | Optional[str] | None | Description of the segment. |
Usage
SEGMENT_NAME = 'YOUR_SEGMENT_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segment = fdl.Segment(
name=SEGMENT_NAME,
model_id=model.id,
    definition="Age < 60", # Use Fiddler Query Language (FQL) to define your custom segments
description='Users with Age under 60',
)
get()
Get segment from Fiddler Platform based on UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique identifier for the segment. |
Usage
SEGMENT_ID = 'ba6ec4e4-7188-44c5-ba84-c2cb22b4bb00'
segment = fdl.Segment.get(id_= SEGMENT_ID)
Returns
Return Type | Description |
---|---|
Segment | Segment instance. |
Raises
Error code | Issue |
---|---|
NotFound | Segment with given identifier not found. |
Forbidden | Current user may not have permission to view details of segment. |
from_name()
Get segment from Fiddler Platform based on name and model UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Name of the segment. |
model_id | UUID | str | - | Unique identifier for the model. |
Usage
SEGMENT_NAME = 'YOUR_SEGMENT_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segment = fdl.Segment.from_name(
name=SEGMENT_NAME,
model_id=model.id
)
Returns
Return Type | Description |
---|---|
Segment | Segment instance. |
Raises
Error code | Issue |
---|---|
NotFound | Segment with given identifier not found. |
Forbidden | Current user may not have permission to view details of segment. |
list()
List all segments in the given model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model_id | UUID | - | UUID of the model associated with the segment. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segments = fdl.Segment.list(model_id=model.id)
Returns
Return Type | Description |
---|---|
Iterable[Segment] | Iterable of all segment objects. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of segment. |
create()
Adds a segment to a model.
Parameters
No
Usage
SEGMENT_NAME = 'YOUR_SEGMENT_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segment = fdl.Segment(
name=SEGMENT_NAME,
model_id=model.id,
    definition="Age < 60", # Use Fiddler Query Language (FQL) to define your custom segments
description='Users with Age under 60',
).create()
Returns
Return Type | Description |
---|---|
Segment | Segment instance. |
Raises
Error code | Issue |
---|---|
Conflict | Segment with same name may exist for the model. |
BadRequest | Invalid definition. |
NotFound | Given model may not exist. |
delete()
Delete a segment.
Parameters
No
Usage
SEGMENT_NAME = 'YOUR_SEGMENT_NAME'
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
segment = fdl.Segment.from_name(name=SEGMENT_NAME,model_id=model.id)
segment.delete()
Returns
No
Raises
Error code | Issue |
---|---|
NotFound | Segment with given identifier not found. |
Forbidden | Current user may not have permission to delete segment. |
Webhooks
Webhook integrations for posting alerts to Slack or other apps.
Webhook
Webhook object contains the below parameters.
Parameter | Type | Default | Description |
---|---|---|---|
id | UUID | - | Unique identifier for the webhook. |
name | str | - | Unique name of the webhook. |
url | str | - | Webhook integration URL. |
provider | WebhookProvider | - | App in which the webhook needs to be integrated. Either 'SLACK' or 'OTHER' |
created_at | datetime | - | Time at which webhook was created. |
updated_at | datetime | - | Latest time at which webhook was updated. |
constructor()
Initialise a new webhook.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Unique name of the webhook. |
url | str | - | Webhook integration URL. |
provider | WebhookProvider | - | App in which the webhook needs to be integrated. |
Usage
WEBHOOK_NAME = 'test_webhook_config_name'
WEBHOOK_URL = 'https://www.slack.com'
WEBHOOK_PROVIDER = 'SLACK'
webhook = fdl.Webhook(
name=WEBHOOK_NAME, url=WEBHOOK_URL, provider=WEBHOOK_PROVIDER
)
get()
Gets all details of a particular webhook from UUID.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique identifier for the webhook. |
Usage
WEBHOOK_ID = 'a5b654eb-15c8-43c8-9d50-9ba6eea9a0ff'
webhook = fdl.Webhook.get(id_=WEBHOOK_ID)
Returns
Return Type | Description |
---|---|
Webhook | Webhook instance. |
Raises
Error code | Issue |
---|---|
NotFound | Webhook with given identifier not found. |
Forbidden | Current user may not have permission to view details of webhook. |
from_name()
Get Webhook from Fiddler Platform based on name.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Name of the webhook. |
Usage
WEBHOOK_NAME = 'YOUR_WEBHOOK_NAME'
webhook = fdl.Webhook.from_name(
name=WEBHOOK_NAME
)
Returns
Return Type | Description |
---|---|
Webhook | Webhook instance. |
Raises
Error code | Issue |
---|---|
NotFound | Webhook with given name not found. |
Forbidden | Current user may not have permission to view details of webhook. |
list()
Gets all webhooks accessible to a user.
Parameters
None
Usage
WEBHOOKS = fdl.Webhook.list()
Returns
Return Type | Description |
---|---|
Iterable[Webhook] | Iterable of webhook objects. |
Raises
Error code | Issue |
---|---|
Forbidden | Current user may not have permission to view details of webhook. |
create()
Create a new webhook.
Parameters
None
Usage
WEBHOOK_NAME = 'YOUR_WEBHOOK_NAME'
WEBHOOK_URL = 'https://www.slack.com'
WEBHOOK_PROVIDER = 'SLACK'
webhook = fdl.Webhook(
name=WEBHOOK_NAME, url=WEBHOOK_URL, provider=WEBHOOK_PROVIDER
)
webhook.create()
Returns
Return Type | Description |
---|---|
Webhook | Webhook object. |
update()
Update an existing webhook.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | Unique name of the webhook. |
url | str | - | Webhook integration URL. |
provider | WebhookProvider | - | App in which the webhook needs to be integrated. |
Usage
WEBHOOK_NAME = "YOUR_WEBHOOK_NAME"
webhook_list = fdl.Webhook.list()
webhook_instance = None
for webhook in webhook_list:
    if webhook.name == WEBHOOK_NAME:
        webhook_instance = webhook
        break
webhook_instance.name = 'NEW_WEBHOOK_NAME'
webhook_instance.update()
Returns
None
Raises
Error code | Issue |
---|---|
BadRequest | If field is not updatable. |
delete()
Delete a webhook.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
id_ | UUID | - | Unique UUID of the webhook. |
Usage
WEBHOOK_NAME = "YOUR_WEBHOOK_NAME"
webhook_list = fdl.Webhook.list()
webhook_id = None
for webhook in webhook_list:
    if webhook.name == WEBHOOK_NAME:
        webhook_id = webhook.id
        break
webhook = fdl.Webhook.get(id_=webhook_id)
webhook.delete()
Returns
None
Explainability
Explainability methods for models.
precompute_feature_importance()
Pre-compute feature importance for a model on a dataset. This is used in various places in the UI.
A single feature importance can be precomputed (computed and cached) for a model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
dataset_id | UUID | - | The unique identifier of the dataset. |
num_samples | Optional[int] | None | The number of samples used. |
num_iterations | Optional[int] | None | The maximum number of ablated model inferences per feature. |
num_refs | Optional[int] | None | The number of reference points used in the explanation. |
ci_level | Optional[float] | None | The confidence level (between 0 and 1). |
update | Optional[bool] | False | Flag to indicate whether the precomputed feature importance should be recomputed and updated. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
job = model.precompute_feature_importance(dataset_id=dataset.id, update=False)
Returns
Return Type | Description |
---|---|
Job | Async job details for the pre-compute job. |
get_precomputed_feature_importance()
Get pre-computed global feature importance for a model over a dataset or a slice.
Parameters
None
Usage
feature_importance = model.get_precomputed_feature_importance()
Returns
Return Type | Description |
---|---|
Tuple | A named tuple with the feature importance results. |
get_feature_importance()
Get global feature importance for a model over a dataset or a slice.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
data_source | Union[DatasetDataSource, SqlSliceQueryDataSource] | - | DataSource for the input dataset to compute feature importance on (DatasetDataSource or SqlSliceQueryDataSource). |
num_iterations | Optional[int] | None | The maximum number of ablated model inferences per feature. |
num_refs | Optional[int] | None | The number of reference points used in the explanation. |
ci_level | Optional[float] | None | The confidence level (between 0 and 1). |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
# Dataset data source
feature_importance = model.get_feature_importance(
data_source=fdl.DatasetDataSource(
env_type='PRE-PRODUCTION',
env_id=dataset.id,
),
)
# Slice Query data source
query = f'SELECT * FROM {dataset.id}.{model.id}'
feature_importance = model.get_feature_importance(
    data_source=fdl.SqlSliceQueryDataSource(
        query=query,
        num_samples=200,
    ),
)
Returns
Return Type | Description |
---|---|
Tuple | A named tuple with the feature importance results. |
Raises
Error code | Issue |
---|---|
BadRequest | If dataset id is not specified. |
precompute_feature_impact()
Pre-compute feature impact for a model on a dataset. This is used in various places in the UI.
A single feature impact can be precomputed (computed and cached) for a model.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
dataset_id | UUID | - | The unique identifier of the dataset. |
num_samples | Optional[int] | None | The number of samples used. |
num_iterations | Optional[int] | None | The maximum number of ablated model inferences per feature. |
num_refs | Optional[int] | None | The number of reference points used in the explanation. |
ci_level | Optional[float] | None | The confidence level (between 0 and 1). |
min_support | Optional[int] | 15 | Only used for NLP (TEXT inputs) models. Specify a minimum support (number of times a specific word was present in the sample data) to retrieve top words. Default to 15. |
update | Optional[bool] | False | Flag to indicate whether the precomputed feature impact should be recomputed and updated. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
job = model.precompute_feature_impact(dataset_id=dataset.id, update=False)
Returns
Return Type | Description |
---|---|
Job | Async job details for the pre-compute job. |
get_precomputed_feature_impact()
Get pre-computed global feature impact for a model over a dataset or a slice.
Parameters
None
Usage
feature_impact = model.get_precomputed_feature_impact()
Returns
Return Type | Description |
---|---|
Tuple | A named tuple with the feature impact results. |
get_feature_impact()
Get global feature impact for a model over a dataset or a slice.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
data_source | Union[DatasetDataSource, SqlSliceQueryDataSource] | - | DataSource for the input dataset to compute feature importance on (DatasetDataSource or SqlSliceQueryDataSource). |
num_iterations | Optional[int] | None | The maximum number of ablated model inferences per feature. |
num_refs | Optional[int] | None | The number of reference points used in the explanation. |
ci_level | Optional[float] | None | The confidence level (between 0 and 1). |
min_support | Optional[int] | 15 | Only used for NLP (TEXT inputs) models. Specify a minimum support (number of times a specific word was present in the sample data) to retrieve top words. Defaults to 15. |
output_columns | Optional[list[str]] | None | Only used for NLP (TEXT inputs) models. Output column names to compute feature impact on. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
# Dataset data source
feature_impact = model.get_feature_impact(
data_source=fdl.DatasetDataSource(
env_type='PRE-PRODUCTION',
env_id=dataset.id,
),
)
# Slice Query data source
query = f'SELECT * FROM {dataset.id}.{model.id}'
feature_impact = model.get_feature_impact(
data_source=fdl.SqlSliceQueryDataSource(
query=query,
num_samples=200),
)
Returns
Return Type | Description |
---|---|
Tuple | A named tuple with the feature impact results. |
Raises
Error code | Issue |
---|---|
BadRequest | If dataset id is not specified or query is not valid. |
precompute_predictions()
Pre-compute predictions for a model on a dataset.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
dataset_id | UUID | - | Unique identifier of the dataset used for prediction. |
chunk_size | Optional[int] | None | Chunk size for fetching predictions. |
update | Optional[bool] | False | Flag to indicate whether the pre-computed predictions should be re-computed and updated for this dataset. |
Usage
PROJECT_NAME = 'YOUR_PROJECT_NAME'
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
project = fdl.Project.from_name(name=PROJECT_NAME)
model = fdl.Model.from_name(name=MODEL_NAME, project_id=project.id)
dataset = fdl.Dataset.from_name(name=DATASET_NAME, model_id=model.id)
model.precompute_predictions(dataset_id=dataset.id, update=False)
Returns
Return Type | Description |
---|---|
Job | Async job details for the prediction job. |
explain()
Get explanation for a single observation.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
input_data_source | Union[RowDataSource, EventIdDataSource] | - | DataSource for the input data to compute explanation on (RowDataSource, EventIdDataSource). |
ref_data_source | Optional[Union[DatasetDataSource, SqlSliceQueryDataSource]] | None | DataSource for the reference data to compute explanation on (DatasetDataSource, SqlSliceQueryDataSource). Only used for non-text models and the following methods: 'SHAP', 'FIDDLER_SHAP', 'PERMUTE', 'MEAN_RESET'. |
method | Optional[Union[ExplainMethod, str]] | ExplainMethod.FIDDLER_SHAP | Explanation method name. Could be your custom explanation method or one of the following methods: 'SHAP', 'FIDDLER_SHAP', 'IG', 'PERMUTE', 'MEAN_RESET', 'ZERO_RESET'. |
num_permutations | Optional[int] | None | For Fiddler SHAP, this corresponds to the number of coalitions to sample to estimate the Shapley values of each single-reference game. For the permutation algorithms, this corresponds to the number of permutations from the dataset to use for the computation. |
ci_level | Optional[float] | None | The confidence level (between 0 and 1) to use for the confidence intervals in Fiddler SHAP. Not used for other methods. |
top_n_class | Optional[int] | None | For multiclass classification models only, specifies whether only the top n classes are computed or all classes (when the parameter is None). |
Usage
# RowDataSource with a DatasetDataSource reference
explain_result = model.explain(
input_data_source=fdl.RowDataSource(
row={
'CreditScore': 619,
'Geography': 'France',
'Gender': 'Female',
'Age': 42,
'Tenure': 2,
'Balance': 0.0,
'NumOfProducts': 1,
'HasCrCard': 'Yes',
'IsActiveMember': 'Yes',
'EstimatedSalary': 101348.88,
},
),
ref_data_source=fdl.DatasetDataSource(
env_type='PRODUCTION',
),
)
# EventIdDataSource
explain_result = model.explain(
input_data_source=fdl.EventIdDataSource(
event_id='5531bfd9-2ca2-4a7b-bb5a-136c8da09ca0',
env_type=fdl.EnvType.PRE_PRODUCTION
),
ref_data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=100)
)
Returns
Return Type | Description |
---|---|
Tuple | A named tuple with the explanation results. |
Raises
Error code | Issue |
---|---|
NotSupported | If specified source type is not supported. |
get_fairness()
Get fairness analysis on a dataset or a slice.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
data_source | Union[DatasetDataSource, SqlSliceQueryDataSource] | - | DataSource for the input dataset to compute fairness on (DatasetDataSource or SqlSliceQueryDataSource). |
protected_features | list[str] | - | List of protected attribute names to compute fairness analysis on. |
positive_outcome | Union[str, int, float, bool] | - | Name of the positive outcome to compute fairness analysis on. |
score_threshold | Optional[float] | 0.5 | Binary threshold value (between 0 and 1). Default to 0.5. |
Usage
# For DatasetDataSource
fairness = model.get_fairness(
data_source=fdl.DatasetDataSource(
env_type='PRODUCTION',
),
protected_features=['Gender'],
positive_outcome='Churned',
)
# For SqlSliceQueryDataSource
fairness = model.get_fairness(
data_source=fdl.SqlSliceQueryDataSource(
query=query,
num_samples=200
),
protected_features=['Gender'],
positive_outcome='Churned',
)
Returns
Return Type | Description |
---|---|
Tuple | A named tuple with the fairness results. |
Raises
Error code | Issue |
---|---|
NotSupported | If specified datasource is not supported. |
get_slice()
Fetch data with a slice query. A maximum of 1M rows can be fetched.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
query | str | - | An SQL query that begins with the keyword 'SELECT'. |
sample | Optional[bool] | False | Whether rows should be sampled from the database or not. |
max_rows | Optional[int] | None | Maximum number of rows to fetch. |
columns | Optional[list[str]] | None | Allows the caller to explicitly specify the list of columns to select, overriding the columns selected in the query. |
Usage
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
model.get_slice(
query=f"SELECT * FROM {DATASET_NAME}.{MODEL_NAME} WHERE geography='France' order by balance desc LIMIT 3",
columns=['Age'],
max_rows=2,
)
Returns
Return Type | Description |
---|---|
Dataframe | A pandas DataFrame containing the slice returned by the query. |
Raises
Error code | Issue |
---|---|
BadRequest | If given query is wrong. |
Info
Only read-only SQL operations are supported. Certain SQL operations like aggregations and joins might not result in a valid slice.
download_slice()
Download data with a slice query to a Parquet file. A maximum of 10M rows can be downloaded.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
output_dir | Union[Path, str] | - | Path to download the file. |
query | str | - | An SQL query that begins with the keyword 'SELECT'. |
sample | Optional[bool] | False | Whether rows should be sampled from the database or not. |
max_rows | Optional[int] | None | Maximum number of rows to fetch. |
columns | Optional[list[str]] | None | Allows the caller to explicitly specify the list of columns to select, overriding the columns selected in the query. |
Usage
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
model.download_slice(
output_dir=parquet_output,
query=f'SELECT * FROM {DATASET_NAME}.{MODEL_NAME} WHERE age>=20 order by balance desc',
columns=['Age'],
)
Returns
Parquet file with slice query contents downloaded to the Path mentioned in output_dir.
Raises
Error code | Issue |
---|---|
BadRequest | If given query is wrong. |
predict()
Run model on an input dataframe.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
df | pd.DataFrame | None | Feature dataframe. |
chunk_size | Optional[int] | None | Chunk size for fetching predictions. |
Usage
import pandas as pd

data = {
'row_id': 1109,
'fixed acidity': 10.8,
'volatile acidity': 0.47,
'citric acid': 0.43,
'residual sugar': 2.1,
'chlorides': 0.171,
'free sulfur dioxide': 27.0,
'total sulfur dioxide': 66.0,
'density': 0.9982,
'pH': 3.17,
'sulphates': 0.76,
'alcohol': 10.8,
}
df = pd.DataFrame(data, index=[0])
predictions = model.predict(df=df)
Returns
Return Type | Description |
---|---|
Dataframe | A pandas DataFrame of the predictions. |
get_mutual_info()
Get mutual information.
Mutual information measures the dependency between two random variables. It is a non-negative value; if the two variables are independent, it equals zero. Higher values mean higher dependency.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
query | str | - | Slice query to compute Mutual information on. |
column_name | str | - | Column name to compute mutual information with respect to all the variables in the dataset. |
num_samples | Optional[int] | None | Number of samples to select for computation. |
normalized | Optional[bool] | False | If set to True, it will compute Normalized Mutual Information. |
Usage
MODEL_NAME = 'YOUR_MODEL_NAME'
DATASET_NAME = 'YOUR_DATASET_NAME'
model.get_mutual_info(
query=f'select * from {DATASET_NAME}.{MODEL_NAME}',
column_name='Geography',
)
Returns
Return Type | Description |
---|---|
Dictionary | Contains mutual information with respect to the given column, keyed by column name. |
Raises
Error code | Issue |
---|---|
BadRequest | If given query is wrong. |
Constants
ModelInputType
Input data type used by the model.
Enum Value | Description |
---|---|
ModelInputType.TABULAR | For tabular models. |
ModelInputType.TEXT | For text models. |
ModelInputType.MIXED | For models which can be a mixture of text and tabular. |
ModelTask
The model’s algorithm type.
Enum Value | Description |
---|---|
ModelTask.REGRESSION | For regression models. |
ModelTask.BINARY_CLASSIFICATION | For binary classification models. |
ModelTask.MULTICLASS_CLASSIFICATION | For multiclass classification models. |
ModelTask.RANKING | For ranking models. |
ModelTask.LLM | For LLM models. |
ModelTask.NOT_SET | For other model tasks or no model task specified. |
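These enum members are referenced directly from the fdl namespace. A minimal sketch (variable names illustrative only):
# Illustrative only: selecting the input type and task for a tabular binary classifier.
input_type = fdl.ModelInputType.TABULAR
task = fdl.ModelTask.BINARY_CLASSIFICATION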
DataType
The available data types when defining a model Column.
Enum Value | Description |
---|---|
DataType.FLOAT | For floats. |
DataType.INTEGER | For integers. |
DataType.BOOLEAN | For booleans. |
DataType.STRING | For strings. |
DataType.CATEGORY | For categorical types. |
DataType.TIMESTAMP | For timestamps. |
DataType.VECTOR | For vector types. |
CustomFeatureType
This is an enumeration defining the types of custom features that can be created.
Enum | Value |
---|---|
CustomFeatureType.FROM_COLUMNS | Represents custom features derived directly from columns. |
CustomFeatureType.FROM_VECTOR | Represents custom features derived from a vector column. |
CustomFeatureType.FROM_TEXT_EMBEDDING | Represents custom features derived from text embeddings. |
CustomFeatureType.FROM_IMAGE_EMBEDDING | Represents custom features derived from image embeddings. |
CustomFeatureType.ENRICHMENT | Represents custom features derived from an enrichment. |
ArtifactType
Indicator of type of a model artifact.
Enum Value | Description |
---|---|
ArtifactType.SURROGATE | For surrogates. |
ArtifactType.PYTHON_PACKAGE | For python package. |
DeploymentType
Indicator of how the model was deployed.
Enum Value | Description |
---|---|
DeploymentType.BASE_CONTAINER | For base containers. |
DeploymentType.MANUAL | For manual deployment. |
EnvType
Environment type of a dataset.
Enum Value | Description |
---|---|
EnvType.PRODUCTION | For production events. |
EnvType.PRE_PRODUCTION | For pre-production events. |
BaselineType
Type of a baseline.
Enum Value | Description |
---|---|
BaselineType.STATIC | For static production baseline. |
BaselineType.ROLLING | For rolling production baseline. |
WindowBinSize
Window for rolling baselines.
Enum Value | Description |
---|---|
WindowBinSize.HOUR | For rolling window to be 1 hour. |
WindowBinSize.DAY | For rolling window to be 1 day. |
WindowBinSize.WEEK | For rolling window to be 1 week. |
WindowBinSize.MONTH | For rolling window to be 1 month. |
WebhookProvider
Specifies the integration provider or OTHER for generic callback response.
Enum Value | Description |
---|---|
WebhookProvider.SLACK | For slack. |
WebhookProvider.OTHER | For any other app. |
Severity
Severity level for alerts.
Enum Value | Description |
---|---|
Severity.DEFAULT | For alert rule when none of the thresholds have passed. |
Severity.WARNING | For alert rule when alert crossed the warning_threshold but not the critical_threshold. |
Severity.CRITICAL | For alert rule when alert crossed the critical_threshold. |
Schemas
Column
A model column representation.
Parameter | Type | Default | Description |
---|---|---|---|
name | str | None | Column name provided by the customer. |
data_type | DataType | None | Data type of the column. |
min | Union[int, float] | None | Min value of integer/float column. |
max | Union[int, float] | None | Max value of integer/float column. |
categories | list | None | List of unique values of a categorical column. |
bins | list[Union[int, float]] | None | Bins of integer/float column. |
replace_with_nulls | list | None | Replace the given list of values with NULL if found in the events data. |
n_dimensions | int | None | Number of dimensions of a vector column. |
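As a rough sketch of how these fields fit together (the column names, ranges, and categories below are hypothetical, and the keyword arguments assume the fields listed above are accepted by the constructor):
# Hypothetical column definitions; data types come from the DataType enum.
age_column = fdl.Column(
    name='Age',
    data_type=fdl.DataType.INTEGER,
    min=18,
    max=95,
)
geography_column = fdl.Column(
    name='Geography',
    data_type=fdl.DataType.CATEGORY,
    categories=['France', 'Germany', 'Spain'],
)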
fdl.Enrichment (beta)
Input Parameter | Type | Default | Description |
---|---|---|---|
name | str | - | The name of the custom feature to generate. |
enrichment | str | - | The enrichment operation to be applied. |
columns | List[str] | - | The column names on which the enrichment depends. |
config | Optional[Dict] | {} | (optional): Configuration specific to an enrichment operation which controls the behavior of the enrichment. |
fiddler_custom_features = [
fdl.Enrichment(
name='question_embedding',
enrichment='embedding',
columns=['question'],
),
fdl.TextEmbedding(
name='question_cf',
source_column='question',
column='question_embedding',
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
Note
Enrichments are disabled by default. To enable them, contact your administrator. Failing to do so will result in an error during the add_model call.
Embedding (beta)
Supported Models:
model_name | size | Type | pooling_method | Notes |
---|---|---|---|---|
BAAI/bge-small-en-v1.5 | small | Sentence Transformer | ||
sentence-transformers/all-MiniLM-L6-v2 | med | Sentence Transformer | ||
thenlper/gte-base | med | Sentence Transformer | (default) | |
gpt2 | med | Decoder NLP Transformer | last_token | |
distilgpt2 | small | Decoder NLP Transformer | last_token | |
EleutherAI/gpt-neo-125m | med | Decoder NLP Transformer | last_token | |
google/bert_uncased_L-4_H-256_A-4 | small | Encoder NLP Transformer | first_token | Smallest BERT |
bert-base-cased | med | Encoder NLP Transformer | first_token | |
distilroberta-base | med | Encoder NLP Transformer | first_token | |
xlm-roberta-large | large | Encoder NLP Transformer | first_token | Multilingual |
roberta-large | large | Encoder NLP Transformer | first_token | |
fiddler_custom_features = [
fdl.Enrichment(
name='Question Embedding', # name of the enrichment, will be the vector col
enrichment='embedding',
columns=['question'], # only one allowed per embedding enrichment, must be a text column in dataframe
        config={  # optional
            'model_name': ...,  # default: 'thenlper/gte-base'
            'pooling_method': ...,  # choose from '{first/last/mean}_token'. Only required if NOT using a sentence transformer
        }
),
fdl.TextEmbedding(
name='question_cf', # name of the text embedding custom feature
source_column='question', # source - raw text
column='Question Embedding', # the name of the vector - output of the embedding enrichment
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Question Embedding | vector | Embeddings corresponding to string column question. |
Note
In the context of Hugging Face models, particularly transformer-based models used for generating embeddings, the pooling_method determines how the model processes the output of its layers to produce a single vector representation for input sequences (like sentences or documents). This is crucial when using these models for tasks like sentence or document embedding, where you need a fixed-size vector representation regardless of the input length.
Centroid Distance (beta)
fiddler_custom_features = [
fdl.Enrichment(
name='question_embedding',
enrichment='embedding',
columns=['question'],
),
fdl.TextEmbedding(
name='question_cf',
source_column='question',
column='question_embedding',
),
fdl.Enrichment(
name='Centroid Distance',
enrichment='centroid_distance',
columns=['question_cf'],
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Centroid Distance (question_embedding) | float | Distance from the nearest K-Means centroid present in question_embedding. |
Note
Does not calculate membership for pre-production data, so you cannot calculate drift.
Personally Identifiable Information (beta)
List of PII entities
Entity Type | Description | Detection Method | Example |
---|---|---|---|
CREDIT_CARD | A credit card number is between 12 to 19 digits. https://en.wikipedia.org/wiki/Payment_card_number | Pattern match and checksum | 4111111111111111 378282246310005 (American Express) |
CRYPTO | A Crypto wallet number. Currently only Bitcoin address is supported | Pattern match, context and checksum | 1BoatSLRHtKNngkdXEeobR76b53LETtpyT |
DATE_TIME | Absolute or relative dates or periods or times smaller than a day. | Pattern match and context | 01/01/2024 |
EMAIL_ADDRESS | An email address identifies an email box to which email messages are delivered | Pattern match, context and RFC-822 validation | [email protected] |
IBAN_CODE | The International Bank Account Number (IBAN) is an internationally agreed system of identifying bank accounts across national borders to facilitate the communication and processing of cross border transactions with a reduced risk of transcription errors. | Pattern match, context and checksum | DE89 3704 0044 0532 0130 00 |
IP_ADDRESS | An Internet Protocol (IP) address (either IPv4 or IPv6). | Pattern match, context and checksum | 1.2.3.4 127.0.0.12/16 1234:BEEF:3333:4444:5555:6666:7777:8888 |
LOCATION | Name of politically or geographically defined location (cities, provinces, countries, international regions, bodies of water, mountains). | Custom logic and context | PALO ALTO Japan |
PERSON | A full person name, which can include first names, middle names or initials, and last names. | Custom logic and context | Joanna Doe |
PHONE_NUMBER | A telephone number | Custom logic, pattern match and context | 5556667890 |
URL | A URL (Uniform Resource Locator), unique identifier used to locate a resource on the Internet | Pattern match, context and top level url validation | www.fiddler.ai |
US SSN | A US Social Security Number (SSN) with 9 digits. | Pattern match and context | 1234-00-5678 |
US_DRIVER_LICENSE | A US driver license according to https://ntsi.com/drivers-license-format/ | Pattern match and context | |
US_ITIN | US Individual Taxpayer Identification Number (ITIN). Nine digits that start with a "9" and contain a "7" or "8" as the fourth digit. | Pattern match and context | 912-34-1234 |
US_PASSPORT | A US passport number begins with a letter, followed by eight numbers | Pattern match and context | L12345678 |
US_SSN | A US Social Security Number (SSN) with 9 digits. | Pattern match and context | 001-12-1234 |
fiddler_custom_features = [
fdl.Enrichment(
name='Rag PII',
enrichment='pii',
columns=['question'], # one or more columns
allow_list=['fiddler'], # Optional: list of strings that are white listed
score_threshold=0.85, # Optional: float value for minimum possible confidence
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of new columns:
Column | Type | Description |
---|---|---|
FDL Rag PII (question) | bool | Whether any PII was detected. |
FDL Rag PII (question) Matches | str | The matches in the raw text that were flagged as potential PII (e.g. 'Douglas MacArthur, Korean'). |
FDL Rag PII (question) Entities | str | The entities these matches were tagged as (e.g. 'PERSON'). |
Note
PII enrichment is integrated with Presidio
Evaluate (beta)
Here is a summary of the three evaluation metrics for natural language generation:
Metric | Description | Strengths | Limitations |
---|---|---|---|
bleu | Measures precision of word n-grams between generated and reference texts | Simple, fast, widely used | Ignores recall, meaning, and word order |
rouge | Measures recall of word n-grams and longest common sequences | Captures more information than BLEU | Still relies on word matching, not semantic similarity |
meteor | Incorporates recall, precision, and additional semantic matching based on stems and paraphrasing | More robust and flexible than BLEU and ROUGE | Requires linguistic resources and alignment algorithms |
fiddler_custom_features = [
fdl.Enrichment(
name='QA Evaluate',
enrichment='evaluate',
columns=['correct_answer', 'generated_answer'],
config={
'reference_col': 'correct_answer', # required
'prediction_col': 'generated_answer', # required
'metrics': ..., # optional, default - ['bleu', 'rouge' , 'meteor']
}
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
The above example generates 6 new columns:
Column | Type |
---|---|
FDL QA Evaluate (bleu) | float |
FDL QA Evaluate (rouge1) | float |
FDL QA Evaluate (rouge2) | float |
FDL QA Evaluate (rougel) | float |
FDL QA Evaluate (rougelsum) | float |
FDL QA Evaluate (meteor) | float |
Textstat (beta)
Supported Statistics
Statistic | Description | Usage |
---|---|---|
char_count | Total number of characters in text, including everything. | Assessing text length, useful for platforms with character limits. |
letter_count | Total number of letters only, excluding numbers, punctuation, spaces. | Gauging text complexity, used in readability formulas. |
miniword_count | Count of small words (usually 1-3 letters). | Specific readability analyses, especially for simplistic texts. |
words_per_sentence | Average number of words in each sentence. | Understanding sentence complexity and structure. |
polysyllabcount | Number of words with more than three syllables. | Analyzing text complexity, used in some readability scores. |
lexicon_count | Total number of words in the text. | General text analysis, assessing overall word count. |
syllable_count | Total number of syllables in the text. | Used in readability formulas, measures text complexity. |
sentence_count | Total number of sentences in the text. | Analyzing text structure, used in readability scores. |
flesch_reading_ease | Readability score indicating how easy a text is to read (higher scores = easier). | Assessing readability for a general audience. |
smog_index | Measures years of education needed to understand a text. | Evaluating text complexity, especially for higher education texts. |
flesch_kincaid_grade | Grade level associated with the complexity of the text. | Educational settings, determining appropriate grade level for texts. |
coleman_liau_index | Grade level needed to understand the text based on sentence length and letter count. | Assessing readability for educational purposes. |
automated_readability_index | Estimates the grade level needed to comprehend the text. | Evaluating text difficulty for educational materials. |
dale_chall_readability_score | Assesses text difficulty based on a list of familiar words for average American readers. | Determining text suitability for average readers. |
difficult_words | Number of words not on a list of commonly understood words. | Analyzing text difficulty, especially for non-native speakers. |
linsear_write_formula | Readability formula estimating grade level of text based on sentence length and easy word count. | Simplifying texts, especially for lower reading levels. |
gunning_fog | Estimates the years of formal education needed to understand the text. | Assessing text complexity, often for business or professional texts. |
long_word_count | Number of words longer than a certain length (often 6 or 7 letters). | Evaluating complexity and sophistication of language used. |
monosyllabcount | Count of words with only one syllable. | Readability assessments, particularly for simpler texts. |
fiddler_custom_features = [
fdl.Enrichment(
name='Text Statistics',
enrichment='textstat',
columns=['question'],
config={
'statistics' : [
'char_count',
'dale_chall_readability_score',
]
},
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
The above example leads to the creation of two additional columns:
Column | Type | Description |
---|---|---|
FDL Text Statistics (question) char_count | int | Character count of string in question column. |
FDL Text Statistics (question) dale_chall_readability_score | float | Readability score of string in question column. |
Sentiment (beta)
fiddler_custom_features = [
fdl.Enrichment(
name='Question Sentiment',
enrichment='sentiment',
columns=['question'],
),
]
model_spec = fdl.ModelSpec(
inputs=['question'],
custom_features=fiddler_custom_features,
)
The above example leads to the creation of two columns:
Column | Type | Description |
---|---|---|
FDL Question Sentiment (question) compound | float | Raw score of sentiment. |
FDL Question Sentiment (question) sentiment | string | One of positive, negative, or neutral. |
Profanity (beta)
fiddler_custom_features = [
fdl.Enrichment(
name='Profanity',
enrichment='profanity',
columns=['prompt', 'response'],
config={'output_column_name': 'contains_profanity'},
),
]
model_spec = fdl.ModelSpec(
inputs=['prompt', 'response'],
custom_features=fiddler_custom_features,
)
The above example leads to the creation of two columns:
Column | Type | Description |
---|---|---|
FDL Profanity (prompt) contains_profanity | bool | Indicates whether the value of the prompt column contains profanity. |
FDL Profanity (response) contains_profanity | bool | Indicates whether the value of the response column contains profanity. |
Answer Relevance (beta)
answer_relevance_config = {
    'prompt': 'prompt_col',      # NOTE: assumed config keys mapping the prompt column
    'response': 'response_col',  # and the response column; verify against your client version
}
fiddler_custom_features = [
    fdl.Enrichment(
        name = 'Answer Relevance',
        enrichment = 'answer_relevance',
        columns = ['prompt_col', 'response_col'],
        config = answer_relevance_config,
    ),
]
model_spec = fdl.ModelSpec(
inputs=['prompt_col', 'response_col'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Answer Relevance | bool | Binary metric, which is True if the response is relevant to the prompt. |
Faithfulness (beta)
faithfulness_config = {
'context' : ['doc_0', 'doc_1', 'doc_2'],
'response' : 'response_col',
}
fiddler_custom_features = [
fdl.Enrichment(
name = 'Faithfulness',
enrichment = 'faithfulness',
columns = ['doc_0', 'doc_1', 'doc_2', 'response_col'],
config = faithfulness_config,
),
]
model_spec = fdl.ModelSpec(
inputs=['doc_0', 'doc_1', 'doc_2', 'response_col'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Faithfulness | bool | Binary metric, which is True if the facts used in the response are correctly drawn from the context columns. |
Coherence (beta)
coherence_config = {
'response' : 'response_col',
}
fiddler_custom_features = [
fdl.Enrichment(
name = 'Coherence',
enrichment = 'coherence',
columns = ['response_col'],
config = coherence_config,
),
]
model_spec = fdl.ModelSpec(
inputs=['doc_0', 'doc_1', 'doc_2', 'response_col'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Coherence | bool | Binary metric, which is True if the response makes coherent arguments which flow well. |
Conciseness (beta)
conciseness_config = {
'response' : 'response',
}
fiddler_custom_features = [
fdl.Enrichment(
name = 'Conciseness',
enrichment = 'conciseness',
columns = ['response'],
config = conciseness_config,
),
]
model_spec = fdl.ModelSpec(
inputs=['prompt', 'doc_0', 'doc_1', 'doc_2', 'response'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Conciseness | bool | Binary metric, which is True if the response is concise and not overly verbose. |
Toxicity (beta)
Dataset | PR-AUC | Precision | Recall |
---|---|---|---|
Toxic-Chat | 0.4 | 0.64 | 0.24 |
Usage
The code snippet below shows how to enable toxicity scoring on the prompt and response columns for each event published to Fiddler.
fiddler_custom_features = [
fdl.Enrichment(
name='Toxicity',
enrichment='toxicity',
columns=['prompt', 'response'],
),
]
model_spec = fdl.ModelSpec(
inputs=['prompt', 'doc_0', 'doc_1', 'doc_2', 'response'],
custom_features=fiddler_custom_features,
)
The above example leads to the creation of two columns each for prompt and response, containing the prediction probability and the model decision. For example, for the prompt column the following two columns will be generated:
Column | Type | Description |
---|---|---|
FDL Toxicity (prompt) toxicity_prob | float | Model prediction probability between 0-1. |
FDL Toxicity (prompt) contains_toxicity | bool | Model prediction either 0 or 1. |
Regex Match (beta)
fiddler_custom_features = [
fdl.Enrichment(
name='Regex - only digits',
enrichment='regex_match',
columns=['prompt', 'response'],
config = {
'regex' : r'^\d+$',
}
),
]
model_spec = fdl.ModelSpec(
inputs=['prompt', 'doc_0', 'doc_1', 'doc_2', 'response'],
custom_features=fiddler_custom_features,
)
The above example will lead to the generation of a new column:
Column | Type | Description |
---|---|---|
FDL Regex - only digits | category | Match or No Match, depending on whether the regex specified in the config matches the string. |
Topic (beta)
fiddler_custom_features = [
fdl.Enrichment(
name='Topics',
enrichment='topic_model',
columns=['response'],
config={'topics':['politics', 'economy', 'astronomy']},
),
]
model_spec = fdl.ModelSpec(
inputs=['prompt', 'doc_0', 'doc_1', 'doc_2', 'response'],
custom_features=fiddler_custom_features,
)
The above example leads to the creation of two columns:
Column | Type | Description |
---|---|---|
FDL Topics (response) topic_model_scores | list[float] | List of probabilities, one for each topic specified in the Enrichment config, in the same order as the topics list. Each value indicates the probability that the given input belongs to the corresponding topic and lies between 0 and 1. The values do not sum to 1 because each topic is scored independently. |
FDL Topics (response) max_score_topic | string | Topic with the maximum score from the list of topic names specified in the Enrichment config. |
Banned Keyword Detector (beta)
fiddler_custom_features = [
fdl.Enrichment(
name='Banned KW',
enrichment='banned_keywords',
columns=['prompt', 'response'],
config={'output_column_name': 'contains_banned_kw', 'banned_keywords':['nike', 'adidas', 'puma'],},
),
]
model_spec = fdl.ModelSpec(
inputs=['prompt', 'doc_0', 'doc_1', 'doc_2', 'response'],
custom_features=fiddler_custom_features,
)
The above example leads to the creation of two columns:
Column | Type | Description |
---|---|---|
FDL Banned KW (prompt) contains_banned_kw | bool | Indicates whether the value of the prompt column contains one of the specified banned keywords. |
FDL Banned KW (response) contains_banned_kw | bool | Indicates whether the value of the response column contains one of the specified banned keywords. |
ModelTaskParams
Task parameters given to a particular model.
Parameter | Type | Default | Description |
---|---|---|---|
binary_classification_threshold | float | None | Threshold above which a prediction is assigned the positive class (binary classification only). |
target_class_order | list | None | Order of target classes. |
group_by | str | None | Query/session id column for ranking models. |
top_k | int | None | Top k results to consider when computing ranking metrics. |
class_weights | list[float] | None | Weight of each class. |
weighted_ref_histograms | bool | None | Whether baseline histograms must be weighted or not when calculating drift metrics. |
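A minimal sketch for a binary classification model; the threshold and class order below are illustrative values, not defaults:
# Illustrative task parameters; label values are hypothetical.
task_params = fdl.ModelTaskParams(
    binary_classification_threshold=0.5,          # probability cutoff for the positive class
    target_class_order=['Retained', 'Churned'],   # order of the label values (hypothetical)
)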
ModelSchema
Model schema defines the list of columns associated with a model version.
Parameter | Type | Default | Description |
---|---|---|---|
schema_version | int | 1 | Schema version. |
columns | list[Column] | None | List of columns. |
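A hedged sketch of a schema assembled from explicit Column objects (in practice the schema is usually inferred from sample data; the columns here are hypothetical):
# Illustrative only: building a schema by hand from Column objects.
schema = fdl.ModelSchema(
    columns=[
        fdl.Column(name='Age', data_type=fdl.DataType.INTEGER, min=18, max=95),
        fdl.Column(name='Geography', data_type=fdl.DataType.CATEGORY,
                   categories=['France', 'Germany', 'Spain']),
    ],
)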
ModelSpec
Model spec defines how model columns are used along with model task.
Parameter | Type | Default | Description |
---|---|---|---|
schema_version | int | 1 | Schema version. |
inputs | list[str] | None | Feature columns. |
outputs | list[str] | None | Prediction columns. |
targets | list[str] | None | Label columns. |
decisions | list[str] | None | Decisions columns. |
metadata | list[str] | None | Metadata columns |
custom_features | list[CustomFeature] | None | Custom feature definitions. |
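An illustrative spec for a tabular churn model; the column names are hypothetical and must match the columns of your data:
model_spec = fdl.ModelSpec(
    inputs=['Age', 'Geography', 'Balance'],   # feature columns
    outputs=['predicted_churn'],              # prediction columns
    targets=['Churned'],                      # label columns
    metadata=['CustomerId'],                  # metadata columns
)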
CustomFeature
The base class for derived features such as Multivariate, VectorFeature, etc.
Parameter | Type | Default | Description |
---|---|---|---|
name | str | None | The name of the custom feature. |
type | CustomFeatureType | None | The type of custom feature. Must be one of the CustomFeatureType enum values. |
n_clusters | Optional[int] | 5 | The number of clusters. |
centroids | Optional[List] | None | Centroids of the clusters in the embedded space. The number of centroids equals n_clusters. |
Multivariate
Represents custom features derived from multiple columns.
Parameter | Type | Default | Description |
---|---|---|---|
type | CustomFeatureType | CustomFeatureType.FROM_COLUMNS | Indicates this feature is derived from multiple columns. |
columns | List[str] | None | List of original columns from which this feature is derived. |
monitor_components | bool | False | Whether to monitor each column in columns as an individual feature. If set to True, components are monitored and drift will be available. |
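A hedged sketch of a multivariate custom feature built over two numeric input columns (names are hypothetical):
# Illustrative multivariate feature grouping two related columns.
balance_salary = fdl.Multivariate(
    name='balance_salary',
    columns=['Balance', 'EstimatedSalary'],
    monitor_components=True,  # also monitor each source column individually
)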
VectorFeature
Represents custom features derived from a single vector column.
Parameter | Type | Default | Description |
---|---|---|---|
type | CustomFeatureType | CustomFeatureType.FROM_VECTOR | Indicates this feature is derived from a single vector column. |
source_column | Optional[str] | None | Specifies the original column if this feature is derived from an embedding. |
column | str | None | The vector column name. |
TextEmbedding
Represents custom features derived from text embeddings.
Parameter | Type | Default | Description |
---|---|---|---|
type | CustomFeatureType | CustomFeatureType.FROM_TEXT_EMBEDDING | Indicates this feature is derived from a text embedding. |
n_tags | Optional[int] | 5 | How many tags (tokens) the text embedding uses in each cluster as the TF-IDF summarization in drift computation. |
ImageEmbedding
Represents custom features derived from image embeddings.
Parameter | Type | Default | Description |
---|---|---|---|
type | CustomFeatureType | CustomFeatureType.FROM_IMAGE_EMBEDDING | Indicates this feature is derived from an image embedding. |
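ImageEmbedding follows the same pattern as TextEmbedding shown earlier; the sketch below assumes a raw image URL column and a precomputed embedding vector column (both names hypothetical):
image_feature = fdl.ImageEmbedding(
    name='image_cf',
    source_column='image_url',    # raw image reference column (assumed)
    column='image_embedding',     # vector column holding the embedding (assumed)
)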
Enrichment
Represents custom features derived from enrichment.
Parameter | Type | Default | Description |
---|---|---|---|
type | CustomFeatureType | CustomFeatureType.ENRICHMENT | Indicates this feature is derived from enrichment. |
columns | List[str] | None | List of original columns from which this feature is derived. |
enrichment | str | None | A string identifier for the type of enrichment to be applied. |
config | Dict[str, Any] | None | A dictionary containing configuration options for the enrichment. |
XaiParams
Represents the explainability parameters.
Parameter | Type | Default | Description |
---|---|---|---|
custom_explain_methods | List[str] | None | User-defined explain_custom methods of the model object defined in package.py. |
default_explain_method | Optional[str] | None | Default explanation method. |
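A minimal sketch, assuming a custom explain method named 'custom_explain' is defined on the model object in package.py:
xai_params = fdl.XaiParams(
    custom_explain_methods=['custom_explain'],
    default_explain_method='custom_explain',
)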
DeploymentParams
Deployment parameters of a particular model.
Parameter | Type | Default | Description |
---|---|---|---|
artifact_type | str | ArtifactType.PYTHON_PACKAGE | Type of artifact upload. |
deployment_type | DeploymentType | None | Type of deployment. |
image_uri | Optional[str] | md-base/python/machine-learning:1.0.1 | Reference to the docker image to create a new runtime to serve the model. Check the available images on the Model Deployment page. |
replicas | Optional[str] | 1 | The number of replicas running the model. Minimum value: 1 Maximum value: 10 Default value: 1 |
cpu | Optional[str] | 100 | The amount of CPU (milli cpus) reserved per replica. Minimum value: 10 Maximum value: 4000 (4vCPUs) Default value: 100 |
memory | Optional[str] | 256 | The amount of memory (mebibytes) reserved per replica. Minimum value: 150 Maximum value: 16384 (16GiB) Default value: 256 |
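An illustrative configuration; the resource values are arbitrary choices within the limits listed above:
deployment_params = fdl.DeploymentParams(
    artifact_type=fdl.ArtifactType.PYTHON_PACKAGE,
    deployment_type=fdl.DeploymentType.BASE_CONTAINER,
    replicas=1,   # number of replicas serving the model
    cpu=300,      # milli CPUs per replica
    memory=512,   # mebibytes per replica
)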
RowDataSource
Explainability input source for row data.
Parameter | Type | Default | Description |
---|---|---|---|
row | Dict | None | Dictionary containing row details. |
EventIdDataSource
Explainability input source for event data.
Parameter | Type | Default | Description |
---|---|---|---|
event_id | str | None | Unique ID for event. |
env_id | Optional[Union[str, UUID]] | None | Unique ID for environment. |
env_type | EnvType | None | Environment type. |
DatasetDataSource
Reference data source for explainability.
Parameter | Type | Default | Description |
---|---|---|---|
env_type | EnvType | None | Environment type. |
num_samples | Optional[int] | None | Number of samples to select for computation. |
env_id | Optional[Union[str, UUID]] | None | Unique ID for environment. |
SqlSliceQueryDataSource
Sql data source for explainability.
Parameter | Type | Default | Description |
---|---|---|---|
query | str | None | Query for slice. |
num_samples | Optional[int] | None | Number of samples to select for computation. |
Helper functions
set_logging
Set app logger at given log level.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
level | int | logging.INFO | Logging level from the Python logging module. |
Usage
import logging

set_logging(logging.INFO)
Returns
None
group_by
Group the events by a column. Use this method to form the grouped data for ranking models.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
df | pd.DataFrame | - | The dataframe containing the data to be grouped. |
group_by_col | str | - | The column to group the data by. |
output_path | Optional[Union[Path, str]] | - | Optional argument; the path to write the grouped data to. If not specified, data won't be written anywhere. |
Usage
COLUMN_NAME = 'col_2'
grouped_df = group_by(df=df, group_by_col=COLUMN_NAME)
Returns
Return Type | Description |
---|---|
pd.DataFrame | DataFrame in grouped format. |
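For context, a self-contained sketch with a tiny hypothetical ranking dataframe (column names are illustrative):
import pandas as pd

# Hypothetical ranking events: two queries with their candidate documents.
df = pd.DataFrame({
    'query_id': ['q1', 'q1', 'q2'],
    'doc_relevance': [1, 0, 1],
    'doc_score': [0.9, 0.4, 0.7],
})
grouped_df = group_by(df=df, group_by_col='query_id')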