4.4 Changing probability cutoff
Note that caret uses a probability cutoff of 0.5 to decide whether a person will buy insurance or not. We can lower that cutoff to 0.3 to see whether we get better results.
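# Predict class probabilities on the test set, then label "Yes" when P(Yes) >= 0.3
predict_custom <- predict(modelRF2,
                          select(dt4_test, -CarInsurance, -starts_with("Call")),
                          type = "prob") %>%
  mutate(new_class = factor(ifelse(Yes >= 0.3, "Yes", "No"))) %>%
  select(new_class)

# Compare the re-thresholded predictions with the true labels
confusionMatrix(predict_custom$new_class,
                reference = dt4_test$CarInsurance,
                positive = "Yes")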
## Confusion Matrix and Statistics
##
##           Reference
## Prediction  No Yes
##        No  433   9
##        Yes  46 311
##
## Accuracy : 0.9312
## 95% CI : (0.9113, 0.9477)
## No Information Rate : 0.5995
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.8594
##
## Mcnemar's Test P-Value : 1.208e-06
##
## Sensitivity : 0.9719
## Specificity : 0.9040
## Pos Pred Value : 0.8711
## Neg Pred Value : 0.9796
## Prevalence : 0.4005
## Detection Rate : 0.3892
## Detection Prevalence : 0.4468
## Balanced Accuracy : 0.9379
##
## 'Positive' Class : Yes
##
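The statistics in the output above follow directly from the four cell counts of the confusion matrix. As a quick check, here is a minimal base-R sketch that recomputes them by hand. The variable names (`tp`, `fn`, `fp`, `tn`, `metrics`) are illustrative and not part of the original analysis.

```r
# Cell counts copied from the confusion matrix above (positive class = "Yes").
tp <- 311  # predicted Yes, actually Yes
fn <- 9    # predicted No,  actually Yes
fp <- 46   # predicted Yes, actually No
tn <- 433  # predicted No,  actually No

n <- tp + fn + fp + tn

metrics <- c(
  accuracy    = (tp + tn) / n,
  sensitivity = tp / (tp + fn),  # share of real buyers we catch
  specificity = tn / (tn + fp),  # share of non-buyers we correctly reject
  ppv         = tp / (tp + fp),  # Pos Pred Value
  npv         = tn / (tn + fn),  # Neg Pred Value
  prevalence  = (tp + fn) / n
)

# Balanced accuracy is the mean of sensitivity and specificity.
metrics["balanced_accuracy"] <-
  mean(metrics[c("sensitivity", "specificity")])

print(round(metrics, 4))
```

The rounded values should match the corresponding lines of the `confusionMatrix()` output.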
With the cutoff lowered to 0.3, the model flags more people as buyers than actually buy (46 false positives, specificity 0.904), but it misses very few real buyers (9 false negatives, sensitivity 0.972). This is also a good lesson for us: moving the cutoff away from the default of 0.5 shifts the balance between the two kinds of error, but it does not by itself improve the overall accuracy of the model.
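To see this trade-off across a range of cutoffs rather than at a single value, one option is to sweep the threshold and tabulate accuracy, sensitivity and specificity at each step. The sketch below is only an illustration in base R: it uses simulated labels and probabilities, not `modelRF2` or `dt4_test`, and the helper `metrics_at()` and the other names are hypothetical.

```r
set.seed(42)

# Simulated stand-in for a test set (an assumption, not the insurance data):
# true labels plus a predicted probability of "Yes" for each row.
n      <- 1000
actual <- factor(ifelse(runif(n) < 0.4, "Yes", "No"), levels = c("No", "Yes"))
# Buyers tend to receive higher scores than non-buyers, with some overlap.
prob_yes <- ifelse(actual == "Yes", rbeta(n, 4, 2), rbeta(n, 2, 4))

# Hypothetical helper: accuracy, sensitivity and specificity at one cutoff.
metrics_at <- function(cutoff, prob, actual) {
  pred_yes <- prob >= cutoff
  is_yes   <- actual == "Yes"
  c(cutoff      = cutoff,
    accuracy    = mean(pred_yes == is_yes),
    sensitivity = sum(pred_yes & is_yes) / sum(is_yes),
    specificity = sum(!pred_yes & !is_yes) / sum(!is_yes))
}

# Evaluate a grid of cutoffs and collect one row per cutoff.
cutoffs      <- seq(0.1, 0.9, by = 0.1)
cutoff_table <- as.data.frame(
  t(sapply(cutoffs, metrics_at, prob = prob_yes, actual = actual))
)

print(round(cutoff_table, 3))
```

As the cutoff rises, sensitivity can only stay flat or fall, and specificity can only stay flat or rise. With a fitted caret model, `prob_yes` would instead be the `Yes` column returned by `predict(..., type = "prob")` and `actual` the true test labels.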