Snowflake

In this article, we will be looking at loading data from Snowflake tables and using the data for the following tasks:

  1. Onboarding a model to Fiddler

  2. Uploading baseline data to Fiddler

  3. Publishing production data to Fiddler

Import data from Snowflake

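To import data from Snowflake into a Jupyter notebook, we will use the Snowflake Connector for Python, which can be installed with the following command in your Python environment. The pandas extra is included because the fetch_pandas_all() call used later in this article depends on it.

pip install "snowflake-connector-python[pandas]"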

The following information is required in order to establish a connection to Snowflake:

  • Snowflake Warehouse

  • Snowflake Role

  • Snowflake Account

  • Snowflake User

  • Snowflake Password

These values can be obtained from your Snowflake account under the 'Admin' option in the menu, or by running the following queries:

  • Warehouse - select CURRENT_WAREHOUSE()

  • Role - select CURRENT_ROLE()

  • Account - select CURRENT_ACCOUNT()

'User' and 'Password' are the same credentials you use to log in to your Snowflake account.

Once you have this information, you can set up a Snowflake connector using the following code:

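# import the connector module from the snowflake-connector-python package
from snowflake import connector

# establish Snowflake connection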
connection = connector.connect(
  user=snowflake_username,
  password=snowflake_password,
  account=snowflake_account,
  role=snowflake_role,
  warehouse=snowflake_warehouse
)
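The connection values do not have to be hard-coded in the notebook. The following is a minimal sketch, assuming the five values are stored in environment variables; the variable names (SNOWFLAKE_USER and so on) and the helper load_snowflake_params are hypothetical and not part of the Snowflake connector.

```python
import os

# Hypothetical environment variable names; rename them to match your setup.
_ENV_VARS = {
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PASSWORD",
    "account": "SNOWFLAKE_ACCOUNT",
    "role": "SNOWFLAKE_ROLE",
    "warehouse": "SNOWFLAKE_WAREHOUSE",
}


def load_snowflake_params(env=None):
    """Collect the five connection values from environment variables."""
    env = os.environ if env is None else env
    # Report every missing or empty variable at once.
    missing = [name for name in _ENV_VARS.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    # Keys match the keyword arguments passed to connector.connect above.
    return {key: env[name] for key, name in _ENV_VARS.items()}
```

With this helper in place, the connection could be opened as connector.connect(**load_snowflake_params()), which keeps the password out of the notebook itself.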

You can then write a custom SQL query and import the data to a pandas dataframe.

# sample SQL query
sql_query = 'select * from FIDDLER.FIDDLER_SCHEMA.CHURN_BASELINE LIMIT 100'

# create cursor object
cursor = connection.cursor()

# execute SQL query inside Snowflake
cursor.execute(sql_query)

# fetch the full result set as a pandas dataframe
baseline_df = cursor.fetch_pandas_all()
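If you run several queries (for example, one for baseline data and one for production data), the cursor steps above can be wrapped in a small helper. This is only a sketch: fetch_dataframe is a hypothetical name, and it assumes a connection object like the one created earlier. It closes the cursor even when the query fails.

```python
def fetch_dataframe(connection, sql_query):
    """Run a query and return the result as a pandas dataframe.

    Assumes `connection` is an open Snowflake connection.
    """
    cursor = connection.cursor()
    try:
        # execute SQL query inside Snowflake
        cursor.execute(sql_query)
        # fetch the full result set as a pandas dataframe
        return cursor.fetch_pandas_all()
    finally:
        # release the cursor whether or not the query succeeded
        cursor.close()
```

For instance, baseline_df = fetch_dataframe(connection, sql_query) would produce the same dataframe as the step-by-step code above.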

Publish Production Events

Now that we have data imported from Snowflake to a dataframe, we can refer to the following pages to:

  1. Onboard a model, using the baseline dataset as the sample for model schema inference.

  2. Upload a baseline dataset, which is optional but recommended for monitoring comparisons.

  3. Publish production events for continuous monitoring.
