MSN-05 // DEPLOYED

AUTO INSURANCE CHURN

Churn modeling on 92,849 policyholders — SMOTE + Gradient Boosting at ROC-AUC 0.70.

92,849 Records · ROC-AUC 0.70 · SMOTE Pipeline · Leakage Catch

BRIEFING

Five classifiers were compared on 92,849 policyholder records to flag retention risk early. The hard part wasn't the model; it was the severe class imbalance (only 11.5% churn). A SMOTE resampling pipeline rebalanced the training data so precision and recall held up instead of collapsing onto the majority class. A data-leakage catch, dropping a column only populated after a customer churns, kept the score from being inflated by information unavailable at prediction time. Gradient Boosting came out on top at ROC-AUC 0.70, with the decision threshold tuned to favor recall so more at-risk policyholders get flagged.
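For readers unfamiliar with SMOTE, here is a minimal sketch of the core idea: each synthetic minority row is an interpolation between a real minority row and one of its nearest minority neighbours. This is a simplified, standard-library illustration, not the project's actual pipeline (which would more likely rely on a library such as imbalanced-learn). The function name `smote_oversample`, the toy rows, and the choice of `k` are all assumptions made for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly
    chosen minority row and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of the base row among the other minority rows
        others = [row for j, row in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy TRAINING split: two numeric features, label 1 = churn.
# Resampling is applied to training rows only, never to validation or test.
train_X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.85, 0.95],
           [0.15, 0.25], [0.05, 0.1], [0.2, 0.3], [0.1, 0.05], [0.25, 0.2]]
train_y = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]

minority = [x for x, y in zip(train_X, train_y) if y == 1]
n_majority = train_y.count(0)
new_rows = smote_oversample(minority, n_new=n_majority - len(minority), k=2)

balanced_X = train_X + new_rows
balanced_y = train_y + [1] * len(new_rows)
```

Because every synthetic row lies on a segment between two real churners, the resampled set stays inside the region the minority class already occupies. Keeping the evaluation data untouched is what lets a reported ROC-AUC reflect the real 11.5% churn rate.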

ROLE / METHOD / OUTCOME

Role
Solo build.
Method
Five-model comparison on a SMOTE pipeline; champion chosen by ROC-AUC, tuned for recall.
Outcome
Gradient Boosting at ROC-AUC 0.70, with a leakage fix an earlier run had missed.
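As an illustration of what "tuned for recall" can mean in practice, the sketch below picks the highest decision threshold that still reaches a target recall on a validation set. The labels, scores, the 0.75 target, and the helper names `precision_recall` and `pick_threshold` are hypothetical; the source does not state which threshold or target the project used.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when rows scoring >= threshold are flagged."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(y_true, scores, min_recall):
    """Return the highest threshold whose recall meets min_recall.

    Recall can only rise as the threshold drops, so scanning from the
    top stops at the strictest cut-off that still meets the target."""
    for threshold in sorted(set(scores), reverse=True):
        _, recall = precision_recall(y_true, scores, threshold)
        if recall >= min_recall:
            return threshold
    return min(scores)


# Hypothetical validation labels (1 = churn) and model churn scores.
y_val = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.81, 0.62, 0.55, 0.48, 0.40, 0.35, 0.30, 0.22, 0.18, 0.10]

default_precision, default_recall = precision_recall(y_val, scores, 0.5)
tuned_threshold = pick_threshold(y_val, scores, min_recall=0.75)
tuned_precision, tuned_recall = precision_recall(y_val, scores, tuned_threshold)
```

On this toy data, lowering the cut-off from 0.5 to 0.35 raises recall from 0.50 to 0.75 while precision falls from about 0.67 to 0.50. That is the trade being made: more at-risk policyholders are flagged, at the cost of more false alarms.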

STACK: Python · scikit-learn · Gradient Boosting · SMOTE
