Kafka Integration

The Fiddler Kafka connector is an optional Fiddler service that consumes a Kafka topic containing production events for a model and publishes those events to Fiddler.

Kafka Integration Prerequisites

This guide assumes that you have a Fiddler account, have created a project, and have onboarded a model. You will need your Fiddler url_id, project_id, and model_id to configure the Kafka connector.

Installation

For Fiddler on-premise installations, the Kafka connector runs on Kubernetes within your own environment. It is packaged as a Helm chart for quick installation:

helm repo add fiddler https://helm.fiddler.ai/stable/

helm repo update

kubectl -n kafka create secret generic fiddler-credentials --from-literal=auth=<API-KEY>

helm install fiddler-kafka fiddler/fiddler-kafka \
    --devel \
    --namespace kafka \
    --set fiddler.url=https://<FIDDLER-URL> \
    --set fiddler.org=<ORG> \
    --set fiddler.project_id=<PROJECT-ID> \
    --set fiddler.model_id=<MODEL-ID> \
    --set fiddler.ts_field=timestamp \
    --set fiddler.ts_format=INFER \
    --set kafka.host=kafka \
    --set kafka.port=9092 \
    --set kafka.topic=<KAFKA-TOPIC> \
    --set kafka.security_protocol=SSL \
    --set kafka.ssl_cafile=cafile \
    --set kafka.ssl_certfile=certfile \
    --set kafka.ssl_keyfile=keyfile \
    --set-string kafka.ssl_check_hostname=False

This creates a deployment that reads event data from the Kafka topic and publishes it to the configured Fiddler model. The deployment can be scaled as needed. Note, however, that if the Kafka topic has only a single partition, adding replicas will not increase throughput.
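The scaling note above can be illustrated with a minimal sketch. It assumes that each connector replica runs one Kafka consumer and that all replicas share a single consumer group, which this page does not state. Under standard Kafka consumer-group semantics, a partition is assigned to at most one consumer in a group, so replicas beyond the partition count sit idle. The function name `active_consumers` is hypothetical and used only for illustration.

```python
def active_consumers(replicas: int, partitions: int) -> int:
    """Number of replicas that can actually receive messages.

    Assumption: one consumer per replica, all in the same consumer group.
    Kafka assigns each partition to at most one consumer in a group, so
    the number of busy consumers is capped by the partition count.
    """
    if replicas < 0 or partitions < 1:
        raise ValueError("replicas must be >= 0 and partitions must be >= 1")
    return min(replicas, partitions)


# Compare a few replica counts against a topic with 3 partitions.
for replicas in (1, 3, 6):
    busy = active_consumers(replicas, partitions=3)
    print(f"replicas={replicas} busy={busy} idle={replicas - busy}")
```

In practice this means the number of partitions on the topic is a reasonable upper bound when choosing a replica count for the deployment.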

Limitations

  1. The connector assumes that there is a single dedicated topic containing production events for a given model. Multiple deployments can be created, one for each model, and scaled independently.

  2. The connector assumes that events are published as JSON-serialized dictionaries of key-value pairs. Support for other formats can be added on request. For example, a Kafka message should look like the following:

{
    "feature_1": 20.7,
    "feature_2": 45000,
    "feature_3": true,
    "output_column": 0.79,
    "target_column": 1,
    "ts": 1637344470000
}
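As a rough sketch of the producer side, the following Python snippet builds a message in the format described above using only the standard library. The helper name `serialize_event` is hypothetical, and the "ts" value is assumed to be epoch milliseconds, as the sample value suggests. The timestamp key presumably needs to match the timestamp field configured for the connector (fiddler.ts_field in the Helm command). Sending the bytes to Kafka is left to whichever client library you use.

```python
import json
import time
from typing import Any, Dict, Optional


def serialize_event(event: Dict[str, Any], ts_ms: Optional[int] = None) -> bytes:
    """Return the event as UTF-8 encoded JSON with a "ts" key added.

    Assumption: "ts" holds epoch milliseconds. If ts_ms is not given,
    the current time is used.
    """
    payload = dict(event)  # copy so the caller's dictionary is not modified
    payload["ts"] = int(time.time() * 1000) if ts_ms is None else ts_ms
    return json.dumps(payload).encode("utf-8")


# Example event using the column names from the sample message.
event = {
    "feature_1": 20.7,
    "feature_2": 45000,
    "feature_3": True,
    "output_column": 0.79,
    "target_column": 1,
}

message = serialize_event(event, ts_ms=1637344470000)
print(message.decode("utf-8"))
```

The resulting bytes can be passed as the message value to a Kafka producer. Using `json.dumps` avoids hand-written JSON mistakes such as trailing commas or typographic quotes, both of which would make the message unparseable.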
