Publishes a batch of events to Fiddler asynchronously.
Input Parameter | Type | Default | Description |
---|---|---|---|
project_id | str | None | The unique identifier for the project. |
model_id | str | None | The unique identifier for the model. |
batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are CSV (.csv), Parquet (.pq), and pickled DataFrame (.pkl). |
id_field | Optional[str] | None | The field containing event IDs for events in the batch. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API. |
update_event | Optional[bool] | None | If True, will only modify an existing event, referenced by id_field. If an ID is provided for which there is no event, no change will take place. |
timestamp_field | Optional[str] | None | The field containing timestamps for events in the batch. The format of these timestamps is given by timestamp_format. If no timestamp is provided for a given row, the current time will be used. |
timestamp_format | Optional[fdl.FiddlerTimestamp] | fdl.FiddlerTimestamp.INFER | The format of the timestamps in timestamp_field. Can be one of fdl.FiddlerTimestamp.INFER, fdl.FiddlerTimestamp.EPOCH_MILLISECONDS, fdl.FiddlerTimestamp.EPOCH_SECONDS, or fdl.FiddlerTimestamp.ISO_8601. |
data_source | Optional[fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
casting_type | Optional[bool] | False | If True, will try to cast the data in the events to match the data types defined in the model's ModelInfo object. |
credentials | Optional[dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
group_by | Optional[str] | None | The field used to group events together when computing performance metrics (for ranking models only). |
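When supplying your own timestamps, the column named by timestamp_field must match the chosen timestamp_format. A minimal sketch of preparing an ISO 8601 timestamp column with pandas, assuming an illustrative column name inference_date and illustrative dates:

```python
import pandas as pd

# Hypothetical events frame; 'inference_date' is an illustrative column name.
df_events = pd.DataFrame({
    'inference_date': pd.to_datetime(['2023-01-15 10:30:00',
                                      '2023-01-15 11:00:00']),
})

# Render timestamps as ISO 8601 strings, to be paired with
# timestamp_format=fdl.FiddlerTimestamp.ISO_8601 at publish time.
df_events['inference_date'] = df_events['inference_date'].dt.strftime('%Y-%m-%dT%H:%M:%S')
```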
```python
import pandas as pd

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

df_events = pd.read_csv('events.csv')

# client is an authenticated fdl.FiddlerApi instance
client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source=df_events,
    timestamp_field='inference_date')
```
Return Type | Description |
---|---|
dict | A dictionary object which reports the result of the batch publication. |
```python
{'status': 202,
 'job_uuid': '4ae7bd3a-2b3f-4444-b288-d51e07b6736d',
 'files': ['ssoqj_tmpzmczjuob.csv'],
 'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}
```
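Because publication is asynchronous, the call returns before ingestion completes. A minimal sketch of inspecting the returned dictionary, using the sample payload above (the helper name is hypothetical, not part of the Fiddler client):

```python
def check_publish_response(response: dict) -> str:
    """Return the job UUID if the batch was accepted, else raise."""
    # A 202 status indicates the batch was received and queued for ingestion.
    if response.get('status') == 202:
        return response['job_uuid']
    raise RuntimeError(f'Batch publication failed: {response}')

sample = {'status': 202,
          'job_uuid': '4ae7bd3a-2b3f-4444-b288-d51e07b6736d',
          'files': ['ssoqj_tmpzmczjuob.csv'],
          'message': 'Successfully received the event data.'}

job_uuid = check_publish_response(sample)
```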