Publishes a batch of events to Fiddler asynchronously.

Input Parameter

Type

Default

Description

project_id

str

None

The unique identifier for the project.

model_id

str

None

A unique identifier for the model.

batch_source

Union[pd.DataFrame, str]

None

Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are

  • CSV (.csv)
  • Parquet (.pq)
  • Pickled DataFrame (.pkl)
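As a rough illustration of the file-type rule above, the following sketch checks a path's extension before it is passed as batch_source. The helper name `is_supported_batch_file` is hypothetical and not part of the Fiddler client; the extension set simply mirrors the list above, and the case-insensitive comparison is an assumption.

```python
from pathlib import Path

# Extensions listed in the batch_source description above.
SUPPORTED_EXTENSIONS = {".csv", ".pq", ".pkl"}


def is_supported_batch_file(path: str) -> bool:
    """Return True if the path ends in one of the listed extensions.

    Hypothetical helper for illustration; not part of the Fiddler client.
    The comparison ignores case, which is an assumption on our part.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this can catch an unsupported file early, before the batch is submitted.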

id_field

Optional[str]

None

The field containing event IDs for events in the batch. If not specified, Fiddler will generate its own IDs, which can be retrieved using the get_slice API.

update_event

Optional[bool]

None

If True, will only modify an existing event, referenced by id_field. If an ID is provided for which there is no event, no change will take place.

timestamp_field

Optional[str]

None

The field containing timestamps for events in the batch. The format of these timestamps is given by timestamp_format. If no timestamp is provided for a given row, the current time will be used.

timestamp_format

Optional[fdl.FiddlerTimestamp]

fdl.FiddlerTimestamp.INFER

The format of the timestamps in timestamp_field. Can be one of

  • fdl.FiddlerTimestamp.INFER
  • fdl.FiddlerTimestamp.EPOCH_MILLISECONDS
  • fdl.FiddlerTimestamp.EPOCH_SECONDS
  • fdl.FiddlerTimestamp.ISO_8601
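To make the three explicit formats concrete, here is a small sketch that expresses one instant as epoch seconds, epoch milliseconds, and an ISO 8601 string using only the Python standard library. The sample instant and variable names are illustrative; which ISO 8601 variants Fiddler accepts is not stated here, so treat the string form as one plausible example.

```python
from datetime import datetime, timezone

# One instant, expressed in each of the three explicit formats.
moment = datetime(2023, 1, 15, 12, 30, 0, tzinfo=timezone.utc)

# Seconds since the Unix epoch (EPOCH_SECONDS style).
epoch_seconds = int(moment.timestamp())

# Milliseconds since the Unix epoch (EPOCH_MILLISECONDS style).
epoch_milliseconds = epoch_seconds * 1000

# ISO 8601 string with an explicit UTC offset (ISO_8601 style).
iso_8601 = moment.isoformat()
```

Keeping a single format per column makes it easier to set timestamp_format explicitly instead of relying on inference.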

data_source

Optional[fdl.BatchPublishType]

None

The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of

  • fdl.BatchPublishType.DATAFRAME
  • fdl.BatchPublishType.LOCAL_DISK
  • fdl.BatchPublishType.AWS_S3

casting_type

Optional[bool]

False

If True, will try to cast the event data to match the data types defined in the model's ModelInfo object.

credentials

Optional[dict]

None

A dictionary containing authorization information for AWS or GCP.

For AWS, the expected keys are

  • 'aws_access_key_id'
  • 'aws_secret_access_key'
  • 'aws_session_token'

For GCP, the expected keys are

  • 'gcs_access_key_id'
  • 'gcs_secret_access_key'
  • 'gcs_session_token'
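One way to assemble the AWS form of this dictionary without hard-coding secrets is to read the values from environment variables. The sketch below is a hypothetical helper, not part of the Fiddler client. It assumes the variables use the upper-case names that the AWS CLI reads (AWS_ACCESS_KEY_ID and so on), and it assumes all three keys are required, which this page does not confirm.

```python
import os
from typing import Mapping, Optional

# Keys listed above for AWS credentials.
AWS_CREDENTIAL_KEYS = (
    "aws_access_key_id",
    "aws_secret_access_key",
    "aws_session_token",
)


def aws_credentials_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the credentials dictionary from environment variables.

    Hypothetical helper for illustration. Assumes each key is stored
    under its upper-case name, and treats all three keys as required.
    """
    env = os.environ if env is None else env
    missing = [key.upper() for key in AWS_CREDENTIAL_KEYS if key.upper() not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[key.upper()] for key in AWS_CREDENTIAL_KEYS}
```

The resulting dictionary has exactly the AWS keys listed above and could be passed as the credentials argument.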

group_by

Optional[str]

None

The field used to group events together when computing performance metrics (for ranking models only).

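import pandas as pd

# `client` is assumed to be an already-initialized Fiddler API client.
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

# Load the batch of events; 'inference_date' must be a column in this file.
df_events = pd.read_csv('events.csv')

client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source=df_events,
    timestamp_field='inference_date',
)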