Publishes a batch of events to Fiddler asynchronously using a schema for locating fields within complex data structures.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| batch_source | Union[pd.DataFrame, str] | None | Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types: CSV (.csv). |
| publish_schema | dict | None | A dictionary used for locating fields within complex or nested data structures. |
| data_source | Optional[fdl.BatchPublishType] | None | The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of fdl.BatchPublishType.DATAFRAME, fdl.BatchPublishType.LOCAL_DISK, or fdl.BatchPublishType.AWS_S3. |
| credentials | Optional[dict] | None | A dictionary containing authorization information for AWS or GCP. For AWS, the expected keys are 'aws_access_key_id', 'aws_secret_access_key', and 'aws_session_token'. For GCP, the expected keys are 'gcs_access_key_id', 'gcs_secret_access_key', and 'gcs_session_token'. |
| group_by | Optional[str] | None | The field used to group events together when computing performance metrics (for ranking models only). |
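The credentials parameter expects a dictionary with specific keys. As a minimal sketch, the AWS variant could be assembled from a mapping such as the process environment. The helper name `build_aws_credentials` and the upper-cased variable names are assumptions made for illustration; they are not part of the Fiddler client.

```python
import os

# Keys listed for AWS in the credentials parameter description.
AWS_CREDENTIAL_KEYS = (
    'aws_access_key_id',
    'aws_secret_access_key',
    'aws_session_token',
)


def build_aws_credentials(env=os.environ):
    """Hypothetical helper: collect the AWS credential keys from a mapping.

    Assumes each value is stored under the upper-cased key name
    (for example AWS_ACCESS_KEY_ID). Raises KeyError if one is missing.
    """
    return {key: env[key.upper()] for key in AWS_CREDENTIAL_KEYS}


# Placeholder values in an explicit mapping, used instead of the real environment.
example_env = {
    'AWS_ACCESS_KEY_ID': 'example-access-key',
    'AWS_SECRET_ACCESS_KEY': 'example-secret',
    'AWS_SESSION_TOKEN': 'example-token',
}
credentials = build_aws_credentials(example_env)

# The dictionary would then be passed together with the data source, e.g.:
# client.publish_events_batch_schema(
#     batch_source=path_to_batch,
#     publish_schema=schema,
#     data_source=fdl.BatchPublishType.AWS_S3,
#     credentials=credentials,
# )
```

Keeping the secrets out of the source file and failing early on a missing key are the only design choices here; the publish call itself is left commented because it needs a configured client.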

PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'

path_to_batch = 'events_batch.avro'

schema = {
    '__static': {
        '__project': PROJECT_ID,
        '__model': MODEL_ID
    },
    '__dynamic': {
        'feature_1': 'features/feature_1',
        'feature_2': 'features/feature_2',
        'feature_3': 'features/feature_3',
        'output_column': 'outputs/output_column',
        'target_column': 'targets/target_column'
    }
}

# Reserved double-underscore key names
ORG = '__org'
MODEL = '__model'
PROJECT = '__project'
TIMESTAMP = '__timestamp'
DEFAULT_TIMESTAMP = '__default_timestamp'
TIMESTAMP_FORMAT = '__timestamp_format'
EVENT_ID = '__event_id'
IS_UPDATE_EVENT = '__is_update_event'
STATUS = '__status'
LATENCY = '__latency'
ITERATOR_KEY = '__iterator_key'

# `client` is assumed to be an already-initialized Fiddler API client
client.publish_events_batch_schema(
    batch_source=path_to_batch,
    publish_schema=schema
)
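
To make the role of the '__dynamic' entries concrete, the sketch below shows one way such paths could be resolved against a nested event. It assumes that each path is a '/'-separated sequence of dictionary keys; this is an illustration only, not Fiddler's implementation, and `resolve_path`, `flatten_event`, and the sample event values are hypothetical.

```python
def resolve_path(record, path, sep='/'):
    """Walk a nested dict along a separator-delimited path and return the value."""
    value = record
    for part in path.split(sep):
        value = value[part]  # raises KeyError if the path does not exist
    return value


def flatten_event(record, dynamic_schema):
    """Build a flat event by resolving every field name to its nested location."""
    return {name: resolve_path(record, path) for name, path in dynamic_schema.items()}


# A made-up nested event whose layout matches the paths in the example schema.
nested_event = {
    'features': {'feature_1': 0.5, 'feature_2': 'red', 'feature_3': 7},
    'outputs': {'output_column': 0.91},
    'targets': {'target_column': 1},
}

# Same field-to-path mapping as the '__dynamic' section of the example schema.
dynamic_schema = {
    'feature_1': 'features/feature_1',
    'feature_2': 'features/feature_2',
    'feature_3': 'features/feature_3',
    'output_column': 'outputs/output_column',
    'target_column': 'targets/target_column',
}

flat_event = flatten_event(nested_event, dynamic_schema)
print(flat_event)
```

Under this reading, the keys of '__dynamic' are the model's column names and the values say where each one lives in the nested record, while '__static' supplies values that are the same for every event in the batch.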