Baseline datasets are used for making comparisons with production data.
A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.
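As a rough illustration (plain Python, not Fiddler API code), a baseline can be drawn as a fixed-size random sample of training rows; `sample_baseline` and the column names here are hypothetical:

```python
import random

def sample_baseline(training_rows, n, seed=42):
    """Draw a fixed-size random sample of training rows to serve as a baseline.

    `training_rows` is any sequence of event dicts; `n` is the baseline size.
    A fixed seed keeps the sample reproducible across runs.
    """
    rng = random.Random(seed)
    return rng.sample(training_rows, min(n, len(training_rows)))

# Hypothetical training rows with the same columns the model sees in production.
training = [{'A': i, 'B': i * 2} for i in range(1000)]
baseline = sample_baseline(training, 100)
```

Sampling from the training set (rather than hand-picking rows) keeps the baseline's feature distributions representative of what the model expects.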
A Model is a representation of your machine learning model which can be used for monitoring, explainability, and more. You do not need to upload your model artifact in order to onboard your model, but doing so will significantly improve the quality of explanations generated by Fiddler.
Model
The Model object contains the following parameters.
constructor()
Initialize a new model instance.
Parameters
from_data()
Build a model instance from a given dataframe or file (CSV/Parquet).
from_data will not create a model entry on the Fiddler Platform. Instead, this method only returns a model instance which can be edited; call .create() to onboard the model to the Fiddler Platform.
spec is optional for the from_data method. However, a spec with at least inputs is required for model onboarding.
Make sure spec is passed to the from_data method if the model requires custom features. This method generates the centroids which are needed for custom feature drift computation.
If version is not explicitly passed, the Fiddler Platform will treat it as the v1 version of the model.
list()
Since a Model contains a lot of information, list operations do not return all the fields of a model.
Instead, this method returns ModelCompact objects on which .fetch() can be called to get the complete Model instance.
For most use cases, ModelCompact objects are sufficient.
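The compact-versus-full pattern described above can be sketched in plain Python; `FullModel`, `CompactModel`, and the in-memory store below are hypothetical stand-ins, not Fiddler classes:

```python
from dataclasses import dataclass, field

@dataclass
class FullModel:
    """Stand-in for the complete Model record, including large fields."""
    id: str
    name: str
    schema: dict = field(default_factory=dict)  # large payload, loaded lazily

@dataclass
class CompactModel:
    """Stand-in for ModelCompact: cheap to list, resolvable on demand."""
    id: str
    name: str
    _store: dict = field(default_factory=dict, repr=False)

    def fetch(self) -> FullModel:
        # One extra lookup per model, but listing stays lightweight.
        return self._store[self.id]

# Simulated backend storage: id -> full record.
store = {'m1': FullModel('m1', 'churn', {'inputs': ['A', 'B']})}
compact_list = [CompactModel(m.id, m.name, store) for m in store.values()]
full = compact_list[0].fetch()
```

The design keeps list responses small; only call fetch() on the entries you actually need to inspect in full.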
update()
Update an existing model. Only the following fields are allowed to be updated; the backend will ignore any other field updated on the instance.
Parameters
The version parameter is available from fiddler-client==3.1 onwards.
duplicate()
Duplicate the model instance with the given version name.
This call will not save the model on the Fiddler Platform. After making changes to the model instance, call .create() to add the model version to the Fiddler Platform.
This method is available from fiddler-client==3.1 onwards.
delete()
Model deletion is an async process, hence a job object is returned by the delete() call.
Call job.wait() to wait for the job to complete. If you plan to create a model with the same name, wait for the deletion job to complete first; otherwise the backend will not allow a new model with the same name.
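The wait-for-completion behaviour can be sketched as a simple polling loop; this `Job` class and its status values are illustrative stand-ins, not the actual fiddler-client implementation:

```python
import time

class Job:
    """Hypothetical stand-in for the job object returned by delete():
    wait() polls the backend until it reports a terminal status."""

    def __init__(self, poll_fn, interval=0.01):
        self._poll = poll_fn        # callable returning the current job status
        self._interval = interval   # seconds between polls

    def wait(self, timeout=5.0):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            status = self._poll()
            if status in ('SUCCESS', 'FAILURE'):
                return status
            time.sleep(self._interval)
        raise TimeoutError('job did not finish in time')

# Simulated backend: reports RUNNING twice, then SUCCESS.
statuses = iter(['RUNNING', 'RUNNING', 'SUCCESS'])
job = Job(lambda: next(statuses))
```

Blocking on wait() before re-creating a model with the same name avoids the name-collision rejection described above.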
# Before publishing, make sure you set up the necessary fields of the model (if any).
# If you set the fields to a non-empty value, we expect them to be passed in the source.
model.event_ts_col = 'timestamp'
model.event_id_col = 'event_id'
model.update()
Publishing a list of events is only supported for production data, not for pre-production data.
Events are published as a stream. This mode is recommended if you have a high volume of continuous, real-time event traffic, as it allows for more efficient processing on our backend.
It returns a list containing the event_id of each published event.
# Publish list of dictionary objects
events = [
    {'A': 56, 'B': 68, 'C': 67, 'D': 27, 'event_id': 'A1', 'timestamp': '2024-05-01 00:00:00'},
    {'A': 43, 'B': 59, 'C': 64, 'D': 18, 'event_id': 'A2', 'timestamp': '2024-05-01 00:00:00'},
    ...
]
event_ids = model.publish(
    source=events,
    environment=fdl.EnvType.PRODUCTION,
)
Notes
In this example, where model.event_id_col='event_id', we expect 'event_id' as a required key of each dictionary. If you instead keep model.event_id_col=None, our backend will generate unique event ids and return them to you. The same applies to model.event_ts_col: we assign the current time as the event timestamp when it is None.
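That fallback behaviour can be sketched in plain Python; `resolve_event_fields` is a hypothetical helper written to mirror the note above, not part of the client:

```python
import uuid
from datetime import datetime, timezone

def resolve_event_fields(event, event_id_col=None, event_ts_col=None):
    """Sketch of the fallback described above: when no id/timestamp column is
    configured on the model, generate a unique id and stamp the current time."""
    event_id = event[event_id_col] if event_id_col else str(uuid.uuid4())
    event_ts = (event[event_ts_col] if event_ts_col
                else datetime.now(timezone.utc).isoformat())
    return event_id, event_ts

# Columns configured: the row must carry them, and they are used as-is.
eid, ts = resolve_event_fields(
    {'A': 56, 'event_id': 'A1', 'timestamp': '2024-05-01 00:00:00'},
    event_id_col='event_id',
    event_ts_col='timestamp',
)

# Columns not configured: a fresh uuid and the current time are assigned.
gen_id, gen_ts = resolve_event_fields({'A': 56})
```

If you plan to update events later, configure event_id_col and supply your own ids, since updates are keyed on the event id.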
Publish production events from file
Batch publishing is faster if you want to publish a large set of historical data.
If you need to update the target or metadata columns for a previously published production event, set update=True. For more details, please refer to Updating Events. Note that only production events can be updated.
Update production events from list
events_update = [
    {
        'A': [0],  # suppose 'A' is the target
        'B': [0],  # suppose 'B' is the metadata
        'event_id': ['A1'],  # required model.event_id_col
    },
    {
        'A': [1],  # suppose 'A' is the target
        'B': [1],  # suppose 'B' is the metadata
        'event_id': ['A2'],  # required model.event_id_col
    },
]
event_ids = model.publish(
    source=events_update,
    environment=fdl.EnvType.PRODUCTION,
    update=True,
)
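Before publishing updates, it can help to validate rows client-side; `validate_update_rows` and `updatable_cols` below are hypothetical, since the platform itself determines which columns (target and metadata) are updatable:

```python
def validate_update_rows(rows, event_id_col, updatable_cols):
    """Client-side sketch: each update row must carry the id column and may only
    touch updatable (target/metadata) columns. Names here are illustrative."""
    for row in rows:
        if event_id_col not in row:
            raise ValueError(f'missing required key {event_id_col!r}')
        extra = set(row) - set(updatable_cols) - {event_id_col}
        if extra:
            raise ValueError(f'columns not updatable: {sorted(extra)}')
    return True

# Mirrors the example above: 'A' is the target, 'B' is the metadata column.
ok = validate_update_rows(
    [{'A': [0], 'B': [0], 'event_id': ['A1']}],
    event_id_col='event_id',
    updatable_cols={'A', 'B'},
)
```

Failing fast on a malformed row locally is cheaper than having the backend reject part of a large update batch.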
Update production events from dataframe
df_update = pd.DataFrame(
{
'A': [0, 1], # suppose 'A' is the target
        'B': [0, 1], # suppose 'B' is the metadata
'event_id': ['A1', 'A2'], # required model.event_id_col
}
)
event_ids = model.publish(
source=df_update,
environment=fdl.EnvType.PRODUCTION,
update=True,
)
Returns
In case of streaming publish
In case of batch publish
Model Compact
The ModelCompact object contains the following parameters.
fetch()
Fetch the model instance from Fiddler Platform.
Parameters
None
Returns
Raises
Model deployment
Get model deployment object of a particular model.
Model deployment:
The model deployment object contains the following parameters.
Update model deployment
Update an existing model deployment.
Parameters
Usage
# Update CPU allocation and activate the model pod
model_id = 'a920ddb6-edb7-473b-a5f7-035f91e1d53a'
model = fdl.Model.get(model_id)
model_deployment = model.deployment
model_deployment.cpu = 300
model_deployment.active = True
model_deployment.update()
Returns
None
Raises
Organizations
The organization in which all of your projects and models are present.
Organization:
The Organization object contains the following parameters.
Projects
Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).
A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).