Publishes a batch of events to Fiddler asynchronously using a schema for locating fields within complex data structures.
Input Parameter | Type | Default | Description |
---|---|---|---|
batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types: CSV (.csv). |
publish_schema | dict | None | A dictionary used for locating fields within complex or nested data structures. |
data_source | Optional[fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of: fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, fdl.BatchPublishType.AWS_S3. |
credentials | Optional[dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
group_by | Optional[str] | None | The field used to group events together when computing performance metrics (for ranking models only). |
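Because the credentials dictionary is keyed by plain strings, a misspelled key is easy to miss. The following is a minimal sketch of a pre-flight check against the key names listed in the table above. The helper `check_credential_keys` and the `EXPECTED_KEYS` mapping are hypothetical and not part of the Fiddler client. Whether every listed key is required in all cases is an assumption, so the helper only reports differences and does not raise.

```python
# Expected keys as listed in the credentials row of the parameter table.
EXPECTED_KEYS = {
    'aws': {'aws_access_key_id', 'aws_secret_access_key', 'aws_session_token'},
    'gcp': {'gcs_access_key_id', 'gcs_secret_access_key', 'gcs_session_token'},
}


def check_credential_keys(credentials, provider):
    """Return (missing, unexpected) key lists for a credentials dict.

    Hypothetical helper for illustration; not part of the Fiddler client.
    """
    expected = EXPECTED_KEYS[provider]
    provided = set(credentials)
    return sorted(expected - provided), sorted(provided - expected)


# Placeholder values only. The second key contains a deliberate typo.
aws_credentials = {
    'aws_access_key_id': '<access-key-id>',
    'aws_secret_acess_key': '<secret-access-key>',
    'aws_session_token': '<session-token>',
}
missing, unexpected = check_credential_keys(aws_credentials, 'aws')
print(missing)     # ['aws_secret_access_key']
print(unexpected)  # ['aws_secret_acess_key']
```

Running a check like this before publishing surfaces key-name mistakes locally, without waiting for the asynchronous job to fail.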
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
path_to_batch = 'events_batch.avro'
schema = {
    # Values that are the same for every event in the batch
    '__static': {
        '__project': PROJECT_ID,
        '__model': MODEL_ID
    },
    # Column name -> path to the field within each event
    '__dynamic': {
        'feature_1': 'features/feature_1',
        'feature_2': 'features/feature_2',
        'feature_3': 'features/feature_3',
        'output_column': 'outputs/output_column',
        'target_column': 'targets/target_column'
    }
}
# Reserved key names for publish schemas
ORG = '__org'
MODEL = '__model'
PROJECT = '__project'
TIMESTAMP = '__timestamp'
DEFAULT_TIMESTAMP = '__default_timestamp'
TIMESTAMP_FORMAT = '__timestamp_format'
EVENT_ID = '__event_id'
IS_UPDATE_EVENT = '__is_update_event'
STATUS = '__status'
LATENCY = '__latency'
ITERATOR_KEY = '__iterator_key'
# `client` is assumed to be an already-initialized Fiddler API client
client.publish_events_batch_schema(
    batch_source=path_to_batch,
    publish_schema=schema
)
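To make the role of the '__dynamic' paths more concrete, here is a rough sketch of how a path such as 'features/feature_1' could pick a value out of a nested event. It assumes that '/' separates nesting levels, which is inferred from the example schema. The helpers `resolve_path` and `flatten_event` are hypothetical and are not how Fiddler is known to implement this.

```python
def resolve_path(record, path, sep='/'):
    """Walk a nested dict one key at a time, following a '/'-separated path."""
    current = record
    for key in path.split(sep):
        current = current[key]  # raises KeyError if the path does not exist
    return current


def flatten_event(record, dynamic_schema):
    """Build a flat {column: value} dict from a nested event."""
    return {
        column: resolve_path(record, path)
        for column, path in dynamic_schema.items()
    }


# Illustrative nested event; the values are made up for this sketch.
nested_event = {
    'features': {'feature_1': 0.5, 'feature_2': 3, 'feature_3': 'a'},
    'outputs': {'output_column': 0.9},
    'targets': {'target_column': 1},
}
dynamic_schema = {
    'feature_1': 'features/feature_1',
    'feature_2': 'features/feature_2',
    'feature_3': 'features/feature_3',
    'output_column': 'outputs/output_column',
    'target_column': 'targets/target_column',
}

flat_event = flatten_event(nested_event, dynamic_schema)
print(flat_event)
```

Each key of the dynamic schema becomes a column name, and each value says where to find that column's data inside the nested event.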
Return Type | Description |
---|---|
dict | A dictionary object which reports the result of the batch publication. |
{'status': 202,
'job_uuid': '5ae7bd3a-2b3f-4444-b288-d51e098a01d',
'files': ['rroqj_tmpzmczjttb.csv'],
'message': 'Successfully received the event data. Please allow time for the event ingestion to complete in the Fiddler platform.'}
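Since publication is asynchronous, the returned dictionary confirms receipt and not completed ingestion. The sketch below shows one way to check the response and keep the job identifier for later reference. It assumes that a 'status' of 202 indicates the batch was accepted, as in the sample output above. The helper `extract_job_uuid` is hypothetical and not part of the Fiddler client.

```python
def extract_job_uuid(response):
    """Return the job UUID if the batch was accepted, else raise an error.

    Assumes 'status' == 202 means the batch was received, based on the
    sample response shown above.
    """
    if response.get('status') != 202:
        raise RuntimeError(
            f"Batch was not accepted: {response.get('message', 'no message')}"
        )
    return response['job_uuid']


# Sample response copied from the output shown above.
response = {
    'status': 202,
    'job_uuid': '5ae7bd3a-2b3f-4444-b288-d51e098a01d',
    'files': ['rroqj_tmpzmczjttb.csv'],
    'message': 'Successfully received the event data. Please allow time '
               'for the event ingestion to complete in the Fiddler platform.',
}

job_uuid = extract_job_uuid(response)
print(job_uuid)  # 5ae7bd3a-2b3f-4444-b288-d51e098a01d
```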