Understanding the Performance of Machine Learning Models to Predict Credit Default: A Novel Approach for Supervisory Evaluation

Series: Research Features.

Authors: Andrés Alonso and José Manuel Carbó

Abstract

We study the economic impact on financial institutions of using machine learning (ML) models for credit default prediction, drawing on a unique, anonymized database from a major Spanish bank. We first measure statistical performance in terms of predictive power, both in classification and calibration, comparing models such as Logit and Lasso with more advanced ones such as Trees (CART), Random Forest, XGBoost and Deep Learning. We find that ML models outperform traditional ones, although more complex ML algorithms do not necessarily predict better. We then translate this into economic impact by estimating the savings in regulatory capital that an institution could achieve by using an ML model instead of a simpler one to compute risk-weighted assets under the Internal Ratings Based (IRB) approach. Our benchmark results show that implementing XGBoost instead of Lasso could yield savings of 12.4% to 17% in capital requirements, depending on the type of underlying assets.
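
To make the link between estimated default probabilities and regulatory capital concrete, the following is a minimal sketch of the Basel IRB risk-weight function in its retail form (no maturity adjustment). It is not the authors' computation, which the abstract does not detail. The function names, the fixed asset correlation of 0.15 (the Basel value for residential mortgages) and the LGD and PD figures in the usage lines are illustrative assumptions, not data from the study.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def irb_capital_requirement(pd: float, lgd: float, correlation: float = 0.15) -> float:
    """Capital requirement K per unit of exposure, Basel IRB retail form.

    pd          -- one-year probability of default, strictly between 0 and 1
    lgd         -- loss given default, as a fraction of exposure
    correlation -- asset correlation R (0.15 is an assumption: the Basel
                   value for residential mortgage exposures)
    """
    if not 0.0 < pd < 1.0:
        raise ValueError("pd must lie strictly between 0 and 1")
    g_pd = _STD_NORMAL.inv_cdf(pd)        # inverse normal CDF of the PD
    g_conf = _STD_NORMAL.inv_cdf(0.999)   # 99.9% confidence level
    # PD conditional on a 99.9% adverse draw of the systematic factor.
    stressed_pd = _STD_NORMAL.cdf(
        (g_pd + correlation ** 0.5 * g_conf) / (1.0 - correlation) ** 0.5
    )
    # Unexpected loss: stressed loss minus expected loss.
    return lgd * (stressed_pd - pd)


def risk_weighted_assets(k: float, ead: float) -> float:
    """RWA = 12.5 * K * EAD, with EAD the exposure at default."""
    return 12.5 * k * ead


def relative_capital_saving(k_benchmark: float, k_alternative: float) -> float:
    """Fractional reduction in capital when moving from benchmark to alternative."""
    return 1.0 - k_alternative / k_benchmark


# Hypothetical usage: all inputs below are made up for illustration.
k_example = irb_capital_requirement(pd=0.02, lgd=0.45)
rwa_example = risk_weighted_assets(k_example, ead=100_000.0)
```

Because K depends only on the PD, LGD and correlation inputs, any difference in capital between two models in such a calculation comes from the PD estimates each model assigns to the same exposures.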