Model performance evaluation is one of the key tasks in the data science process. It indicates how successful the trained model is at scoring a dataset.
Once your trained model is loaded into Fiddler, you should be able to click on Evaluation to see its performance.
To measure model performance for Regression tasks, we use the following metrics:
- Coefficient of determination (R²)
  - Measures how well the actual outcomes are replicated by the model.
  - R² = Variance explained by the model / Total variance
- Mean Absolute Error (MAE)
  - Measures the average magnitude of the errors in a set of predictions, without considering their direction.
  - MAE = Sum over all observations of |predicted value - actual value| / number of observations
- Root Mean Square Error (RMSE)
  - Measures the typical deviation between the predicted and actual values. Because errors are squared before averaging, RMSE penalizes large errors more heavily than MAE.
  - RMSE = sqrt(Sum over all observations of (predicted value - actual value)² / number of observations)
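As a sketch, the three regression metrics above can be computed directly from their definitions with NumPy. The `actual` and `predicted` arrays are made-up values for illustration only:

```python
import numpy as np

# Hypothetical actual and predicted values, for illustration only
actual = np.array([3.0, 5.0, 7.5, 10.0])
predicted = np.array([2.5, 5.5, 7.0, 9.5])

# MAE: average absolute error, ignoring direction
mae = np.mean(np.abs(predicted - actual))

# RMSE: square root of the mean squared error
rmse = np.sqrt(np.mean((predicted - actual) ** 2))

# R²: 1 - (residual sum of squares / total variance of the actuals)
ss_res = np.sum((actual - predicted) ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
```

In practice these are usually computed with a library such as scikit-learn (`r2_score`, `mean_absolute_error`, `mean_squared_error`), but the explicit formulas above match the definitions given here.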
To measure model performance for Classification tasks, we use the following tools:
- Log Loss: Measures the performance of a classification model where the prediction input is a probability value between 0 and 1. The goal of the ML model is to minimize this value.
- Confusion Matrix: A table showing the counts of predicted class labels against actual class labels. Also referred to as an Error Matrix.
- Receiver Operating Characteristic (ROC) Curve: A graph showing the performance of a classification model at different classification thresholds. Plots the true positive rate (TPR), also known as recall, against the false positive rate (FPR).
- Calibration Plot: A graph that tells us how well the model is calibrated. The plot is obtained by dividing the predictions into 10 quantile buckets (0-10th percentile, 10-20th percentile, etc.). The average predicted probability in each bucket is plotted against the true observed frequency for that set of points.
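To make the classification metrics above concrete, here is a minimal sketch that computes log loss, the entries of a binary confusion matrix, and one (TPR, FPR) point on the ROC curve. The labels, probabilities, and the 0.5 threshold are hypothetical values for illustration:

```python
import numpy as np

# Hypothetical binary labels and predicted probabilities, for illustration only
y_true = np.array([1, 0, 1, 1, 0])
p = np.array([0.9, 0.2, 0.7, 0.6, 0.4])

# Log loss: heavily penalizes confident but wrong probability predictions.
# Probabilities are clipped away from 0 and 1 to avoid log(0).
eps = 1e-15
p_clipped = np.clip(p, eps, 1 - eps)
log_loss = -np.mean(
    y_true * np.log(p_clipped) + (1 - y_true) * np.log(1 - p_clipped)
)

# Confusion matrix entries at an (assumed) 0.5 classification threshold
y_pred = (p >= 0.5).astype(int)
tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # true negatives

# TPR (recall) and FPR at this threshold: one point on the ROC curve.
# Sweeping the threshold from 0 to 1 traces out the full curve.
tpr = tp / (tp + fn)
fpr = fp / (fp + tn)
```

Each choice of threshold yields one confusion matrix and one (FPR, TPR) point; the ROC curve shown in Fiddler is the result of evaluating these quantities across all thresholds.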