In this article, we will look at loading data from Snowflake tables and using that data for the following tasks:
- Uploading baseline data to Fiddler
- Onboarding a model to Fiddler and creating a surrogate
- Publishing production data to Fiddler
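To import data from Snowflake into a Jupyter notebook, we will use the Snowflake Connector for Python. It can be installed in your Python environment with the following command. The pandas extra is included because the connector needs it to return query results as a pandas dataframe later in this article.
pip install "snowflake-connector-python[pandas]"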
Once the library is installed, you need the following to establish a connection to Snowflake:
- Snowflake Warehouse
- Snowflake Role
- Snowflake Account
- Snowflake User
- Snowflake Password
These can be found in your Snowflake account under the 'Admin' option in the menu, or by running the following queries:
- Warehouse - select CURRENT_WAREHOUSE()
- Role - select CURRENT_ROLE()
- Account - select CURRENT_ACCOUNT()
'User' and 'Password' are the same credentials you use to log in to your Snowflake account.
Once you have this information, you can set up a Snowflake connection using the following code:
# import the Snowflake connector
from snowflake import connector

# establish Snowflake connection
connection = connector.connect(
    user=snowflake_username,
    password=snowflake_password,
    account=snowflake_account,
    role=snowflake_role,
    warehouse=snowflake_warehouse,
)
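To keep the password out of the notebook itself, the five connection values can be read from environment variables. The following is a minimal sketch under stated assumptions: the helper name `snowflake_params_from_env` and the `SNOWFLAKE_*` variable names are hypothetical conventions chosen for illustration and are not part of the Snowflake connector.

```python
import os

# Keyword arguments passed to connector.connect(...) in the snippet above.
REQUIRED_KEYS = ("user", "password", "account", "role", "warehouse")


def snowflake_params_from_env(env=None, prefix="SNOWFLAKE_"):
    """Collect connection parameters from environment variables.

    The variable names (SNOWFLAKE_USER, SNOWFLAKE_PASSWORD, ...) are a
    hypothetical convention, not something the connector defines.
    """
    env = os.environ if env is None else env
    params, missing = {}, []
    for key in REQUIRED_KEYS:
        name = prefix + key.upper()
        value = env.get(name)
        if value:
            params[key] = value
        else:
            missing.append(name)
    if missing:
        # Fail early with the full list of variables that still need to be set.
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return params


# Usage (requires the variables to be set and the connector installed):
# from snowflake import connector
# connection = connector.connect(**snowflake_params_from_env())
```

Raising an error that lists every missing variable at once makes a misconfigured environment easier to diagnose than a failed login.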
You can then write a custom SQL query and import the result into a pandas dataframe.
# sample SQL query
sql_query = 'select * from FIDDLER.FIDDLER_SCHEMA.CHURN_BASELINE LIMIT 100'

# create cursor object
cursor = connection.cursor()

# execute SQL query inside Snowflake
cursor.execute(sql_query)

# load the query result into a pandas dataframe
baseline_df = cursor.fetch_pandas_all()
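If you run several queries, it can help to wrap the cursor steps above in a small function that also closes the cursor when it is done. The sketch below assumes a connection created as shown earlier; `fetch_dataframe` is a hypothetical helper name, and only the `cursor()`, `execute()`, `fetch_pandas_all()`, and `close()` calls come from the connector.

```python
from contextlib import closing


def fetch_dataframe(connection, sql_query):
    """Run a query and return its result via the cursor's fetch_pandas_all().

    The cursor is closed afterwards, even if the query raises an error.
    """
    with closing(connection.cursor()) as cursor:
        cursor.execute(sql_query)
        return cursor.fetch_pandas_all()


# Usage with the sample query from this article:
# baseline_df = fetch_dataframe(
#     connection,
#     'select * from FIDDLER.FIDDLER_SCHEMA.CHURN_BASELINE LIMIT 100',
# )
```

The same helper can be reused for the production data query, so the baseline and production dataframes are loaded in the same way.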
To publish production events from Snowflake, load the data into a pandas dataframe in the same way and publish it to Fiddler using the client.publish_events_batch API.
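The publishing step could look roughly like the sketch below. It is an illustration under assumptions, not the definitive call: `publish_production_events` is a hypothetical wrapper, `client` is assumed to be an already initialised Fiddler client, and the keyword names (`project_id`, `model_id`, `batch_source`, `timestamp_field`, `id_field`) reflect our understanding of the 1.x Python client and should be checked against the version you have installed.

```python
def publish_production_events(client, events_df, project_id, model_id,
                              timestamp_field=None, id_field=None):
    """Publish a dataframe of production events through the Fiddler client.

    Keyword names are assumptions based on the 1.x client; verify them
    against the documentation for your installed client version.
    """
    kwargs = {
        "project_id": project_id,
        "model_id": model_id,
        "batch_source": events_df,  # dataframe loaded from Snowflake
    }
    # Optional fields are only passed when the caller supplies them.
    if timestamp_field is not None:
        kwargs["timestamp_field"] = timestamp_field
    if id_field is not None:
        kwargs["id_field"] = id_field
    return client.publish_events_batch(**kwargs)


# Usage (project, model, and column names here are placeholders):
# publish_production_events(client, production_df,
#                           project_id="my_project", model_id="my_model",
#                           timestamp_field="timestamp")
```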
Now that we have imported data from Snowflake into a Jupyter notebook, we can refer to the following notebooks to