Customer Churn Prediction

Churn prediction is a common machine learning use case. Churn refers to customers leaving the company. A business needs to understand why and when customers are likely to churn, and a robust, accurate churn prediction model helps it act before customers leave. Machine learning models have proved effective at detecting churn. However, if left unmonitored, the performance of a churn model can degrade over time, and the business may lose customers it could have retained.

The Fiddler AI Observability platform provides a variety of tools that can be used to monitor, explain, analyze, and improve the performance of your machine learning-based churn model.

In this article, we go over a churn example and show how to mitigate performance degradation in a churn model.

Refer to the Colab notebook to learn how to:

  1. Onboard a model on the Fiddler platform
  2. Publish events on the Fiddler platform
  3. Use the Fiddler API to run explanations

Example - Model Performance Degradation due to Data Integrity Issues

Step 1 - Setting up baseline and publishing production events

Please refer to our Getting Started guide for a step-by-step walkthrough of how to upload baseline and production data to the Fiddler platform.

Step 2 - Monitor Drift

When we check the monitoring dashboard, we notice a drop in the predicted churn value and a rise in the predicted churn drift value. Our next step is to check if this has resulted in a drop in performance.

[Figure: Monitor Drift]

Step 3 - Monitor Performance Metrics

We use precision, recall, and F1-score as performance metrics for this example. We choose these metrics because they are suited to classification problems and help us quantify false positives and false negatives. We notice that although precision has remained constant, recall and F1-score have dropped. This means the number of false negatives has increased: some customers who do churn are being predicted not to churn.
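To make the relationship between these metrics concrete, here is a minimal sketch in plain Python. The labels are made up for illustration and are not taken from the example dataset. It shows how extra false negatives lower recall and F1-score while precision stays the same.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 means churn."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical baseline window: 4 churners, all caught, plus 1 false alarm.
baseline_true = [1, 1, 1, 1, 0, 0, 0, 0]
baseline_pred = [1, 1, 1, 1, 1, 0, 0, 0]

# Hypothetical production window: 8 churners, half of them missed.
production_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
production_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0]

baseline_metrics = precision_recall_f1(baseline_true, baseline_pred)
production_metrics = precision_recall_f1(production_true, production_pred)
print("baseline   precision=%.2f recall=%.2f f1=%.2f" % baseline_metrics)
print("production precision=%.2f recall=%.2f f1=%.2f" % production_metrics)
```

In this toy data, precision is the same in both windows because the ratio of true positives to predicted positives does not change, while the missed churners pull recall and F1-score down.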

There could be a number of reasons for the drop in performance. Some of them are:

  1. Cases of extreme events (Outliers)
  2. Data distribution changes
  3. Model/Concept drift
  4. Pipeline health issues

While pipeline health issues could be due to a failing component in the data pipeline, the first three could be due to changes in the data. To check this, we can go to the Data Integrity tab and first verify whether the incoming data is consistent with the baseline data.

[Figure: Monitor Performance Metrics]

Step 4 - Data Integrity

Our next step would be to check if this could be due to any data integrity issues. On navigating to the Data Integrity tab under the Monitor tab, we see that there has been a range violation. On selecting the bins which have the range violations, we notice it is due to the field numofproducts.

It is advised to check all the fields which cause data integrity violations. Since we see a range violation, we can check how much the data has drifted.

[Figure: Data Integrity]
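As a rough illustration of what a range violation check involves, the sketch below flags production values that fall outside the minimum and maximum seen in the baseline. This definition is an assumption made for illustration, since the exact rule the platform applies is not described here, and the sample values are hypothetical.

```python
def range_violation_pct(baseline_values, production_values):
    """Percentage of production values outside the baseline [min, max] range."""
    lo, hi = min(baseline_values), max(baseline_values)
    violations = [v for v in production_values if v < lo or v > hi]
    return 100.0 * len(violations) / len(production_values)


# Hypothetical numofproducts values.
baseline_numofproducts = [1, 2, 1, 3, 2, 4, 1, 2]
production_numofproducts = [1, 2, 6, 7, 2, 5, 1, 3, 2, 8]

violation_pct = range_violation_pct(baseline_numofproducts, production_numofproducts)
print("range violations: %.1f%% of production events" % violation_pct)
```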

Step 5 - Check the impact of drift on the ‘numofproducts’ feature

Our next step would be to go back to the Data Drift tab to measure the amount of drift in the field numofproducts. The drift is calculated using Jensen-Shannon divergence, which measures how much the distributions of the two data sets differ.
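For readers who want to see the calculation, the following is a minimal sketch of Jensen-Shannon divergence between two discrete distributions in plain Python. It uses base-2 logarithms, so the result lies between 0 (identical) and 1 (no overlap). The binning, the log base, and the sample values are assumptions for illustration and may differ from what the platform uses internally.

```python
import math
from collections import Counter


def to_distribution(values, bins):
    """Relative frequency of each bin value in `values`."""
    counts = Counter(values)
    return [counts.get(b, 0) / len(values) for b in bins]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence to the mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical numofproducts values for the two data sets.
baseline = [1, 2, 1, 3, 2, 4, 1, 2]
production = [1, 2, 6, 7, 2, 5, 1, 3, 2, 8]

bins = sorted(set(baseline) | set(production))
drift = js_divergence(to_distribution(baseline, bins),
                      to_distribution(production, bins))
print("JS divergence: %.3f" % drift)
```

Because the mixture m is nonzero wherever either input is nonzero, the divergence stays finite even when production contains values that never appear in the baseline.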

We can select the bin where we see an increase in average value as well as drift. We see a significant increase in the numofproducts average value and drift. We can also see there is a difference in the distribution of the baseline and production data which leads to a drift.

The next step is to find out whether the change in distribution affected only a subsection of the data or was caused by other factors such as time (seasonality), a fault in data reporting (for example, sensor data), or a change in the unit in which the metric is reported.
Seasonality can be observed by plotting the data over time (provided we have enough data), a fault in data reporting would show up as missing values, and a change in unit would change the values across all subsections of the data.

In order to investigate if the change was only for a subsection of data, we will go to the Analyze tab. We can do this by clicking Export bin and feature to Analyze.

[Figure: Impact of Drift]

Step 6 - Root Cause Analysis in the ‘Analyze’ tab

In the Analyze tab, an SQL query is auto-generated based on our selection in the Monitor tab. We can also write custom SQL queries to investigate the data.

We check the distribution of the field numofproducts for our selection. We can do this by selecting Chart Type - Feature Distribution on the right-hand side of the tab.

[Figure: Root Cause Analysis - 1]

We further check the performance of the model for our selection by selecting the Chart Type - Slice Evaluation.

[Figure: Root Cause Analysis - 2]

To check whether the range violation occurred only for a subsection of the data, we can plot numofproducts against a categorical variable. In our case, we can check the distribution of numofproducts against age and geography. To do this, we can query the data and select Chart Type - Feature Correlation to create a feature correlation plot for two features.

On plotting the feature correlation plot of gender vs numofproducts, we observe the distribution to be similar.

[Figure: Root Cause Analysis - 3]

[Figure: Root Cause Analysis - 4]

For the sake of this example, let’s say that the state of Hawaii (which is a value in the geography field in the data) announced that it has eased restrictions on the number of loans. Since loans are one of the products, our hypothesis is that numofproducts would be higher for that state. To test this, we check the feature correlation between geography and numofproducts.

[Figure: Root Cause Analysis - 5]

We do see higher values for the state of Hawaii as compared to other states. We can further check distribution for the field numofproducts just for the state of Hawaii.

[Figure: Root Cause Analysis - 6]
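The same comparison can be expressed as a custom SQL query. The sketch below runs one against an in-memory SQLite table using Python's built-in sqlite3 module. The table name production_events, the states other than Hawaii, and all values are hypothetical, and the SQL dialect accepted by the Analyze tab may differ from SQLite's.

```python
import sqlite3

# In-memory table standing in for the production data (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production_events (geography TEXT, numofproducts INTEGER)"
)
conn.executemany(
    "INSERT INTO production_events VALUES (?, ?)",
    [
        ("Hawaii", 5), ("Hawaii", 6), ("Hawaii", 7),
        ("California", 1), ("California", 2), ("California", 3),
        ("Texas", 2), ("Texas", 2),
    ],
)

# Average numofproducts per geography, highest first.
avg_by_geography = conn.execute(
    """
    SELECT geography, AVG(numofproducts) AS avg_products
    FROM production_events
    GROUP BY geography
    ORDER BY avg_products DESC, geography ASC
    """
).fetchall()
conn.close()

for geography, avg_products in avg_by_geography:
    print("%-12s %.2f" % (geography, avg_products))
```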

On checking performance for the subset of Hawaii, we see a huge performance drop.

[Figure: Root Cause Analysis - 7]

In contrast, we see good performance for the subset of data that excludes Hawaii.

[Figure: Root Cause Analysis - 8]

[Figure: Root Cause Analysis - 9]
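Evaluating a slice amounts to computing the same metric separately on the rows inside and outside the slice. Here is a minimal sketch in plain Python using recall. The record layout (the churned and predicted fields) and all of the rows are hypothetical and only illustrate the comparison.

```python
def recall(rows):
    """Share of actual churners that the model predicted as churn."""
    tp = sum(1 for r in rows if r["churned"] == 1 and r["predicted"] == 1)
    fn = sum(1 for r in rows if r["churned"] == 1 and r["predicted"] == 0)
    return tp / (tp + fn) if tp + fn else 0.0


def recall_by_slice(rows, field, value):
    """Return (recall inside the slice, recall on the remaining rows)."""
    inside = [r for r in rows if r[field] == value]
    outside = [r for r in rows if r[field] != value]
    return recall(inside), recall(outside)


def event(geography, churned, predicted):
    return {"geography": geography, "churned": churned, "predicted": predicted}


# Hypothetical events: most Hawaii churners are missed by the model.
events = [
    event("Hawaii", 1, 0), event("Hawaii", 1, 0), event("Hawaii", 1, 0),
    event("Hawaii", 1, 1), event("Hawaii", 0, 0),
    event("Texas", 1, 1), event("Texas", 1, 1), event("Texas", 1, 1),
    event("Texas", 1, 0), event("Texas", 0, 0),
]

hawaii_recall, other_recall = recall_by_slice(events, "geography", "Hawaii")
print("recall (Hawaii)=%.2f  recall (other)=%.2f" % (hawaii_recall, other_recall))
```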

Step 7 - Measuring the impact of the ‘numofproducts’ feature

To measure the impact of the numofproducts feature, we can navigate back to the Monitor tab. We can see that the prediction drift impact is highest for numofproducts due to its high drift value, which means it contributes the most to the prediction drift.

[Figure: Feature Impact - 1]

We can further measure the attribution of the numofproducts feature for a single data point. We can select a data point that was incorrectly predicted not to churn (a false negative). We can check point explanations either from the Analyze tab, by running a query, or from the Explain tab. Below, we check the point explanation for a data point from the Analyze tab by clicking the bulb symbol in the query results.

[Figure: Feature Impact - 2]

We see that the numofproducts feature contributes significantly to the data point being predicted not to churn.

[Figure: Feature Impact - 3]

We have seen that the performance of the churn model drops due to a range violation in one of the features. We can improve performance by retraining the model with new data, but before that we should take mitigation actions that help us detect model performance degradation early and inform our retraining frequency.

Step 8 - Mitigation Actions

[Figure: Add to dashboard]

  1. Add to dashboard
    We can add the generated chart to the dashboard by clicking Pin this chart on the right-hand side of the Analyze tab. This helps us monitor important aspects of the model.

  2. Add alerts
    We can set up alerts so that we are notified the next time performance degrades. In this example, performance degraded due to a range data integrity violation. To mitigate this, we can set up an alert that notifies us when the percentage of range violations exceeds a certain threshold (10% would be a reasonable value in our case). We can also set up alerts on other signals, such as prediction drift. See the Fiddler documentation to learn how to set up alerts on the Fiddler platform.
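The logic behind such a threshold alert can be sketched in a few lines of plain Python. This is not the Fiddler alerts API. The function name, the per-bin percentages, and the bin labels are hypothetical, and only the 10% threshold comes from the example above.

```python
def bins_to_alert(violation_pct_by_bin, threshold_pct=10.0):
    """Return the time bins whose range violation percentage exceeds the threshold."""
    return [
        time_bin
        for time_bin, pct in violation_pct_by_bin.items()
        if pct > threshold_pct
    ]


# Hypothetical percentage of range violations per daily bin.
violation_pct_by_bin = {
    "day 1": 0.0,
    "day 2": 2.5,
    "day 3": 18.0,
    "day 4": 40.0,
}

alerts = bins_to_alert(violation_pct_by_bin)
print("bins that would trigger an alert:", alerts)
```

A lower threshold catches degradation earlier at the cost of more notifications, so the value is worth revisiting as more production data accumulates.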