
Series: Working Papers. 2222.
Author: Andrés Alonso and José Manuel Carbó.
Published in: Financial Innovation, Volume 8, Issue 70, July 2022, and Computational Economics, online (January 2025)
Abstract
One of the biggest challenges for the application of machine learning (ML) models in finance is how to explain their results. In recent years, different interpretability techniques have appeared to assist in this task, although their usefulness is still a matter of debate. In this article we contribute to the debate by creating a framework to assess the accuracy of these interpretability techniques. We start by generating synthetic datasets, following an approach that allows us to control the importance of each explanatory variable (feature) for our target variable. Because we define the importance of the features ourselves, we can then measure the extent to which the explanations given by the interpretability techniques match this underlying truth. If in our synthetic dataset we define a feature as relevant to the target variable, the interpretability technique should also identify it as relevant. We run an empirical example in which we generate synthetic datasets intended to resemble underwriting and credit rating datasets, where the target variable is a binary indicator of applicant default. We then use non-interpretable ML models, such as deep learning, to predict default, and explain their results using two popular interpretability techniques, SHAP and permutation feature importance (FI). Our results under the proposed framework suggest that SHAP is better at identifying relevant features as such, although the results may vary significantly depending on the characteristics of the dataset and the ML model used. We conclude that generating synthetic datasets is a potentially useful approach for supervisors and practitioners seeking to assess the interpretability tools available for ML models in the financial sector.
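To illustrate the idea of the framework, the following is a minimal sketch, not the authors' actual implementation: it assumes a simple linear data-generating process with hand-chosen feature weights as the "ground truth", a small scikit-learn neural network as the non-interpretable model, and the shap and scikit-learn packages for SHAP values and permutation feature importance. The number of features, the weights and the comparison via Spearman rank correlation are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
import shap

rng = np.random.default_rng(0)

# Synthetic dataset with a known ground truth: only the first three of
# ten features drive the binary default indicator (illustrative weights).
n, p = 5000, 10
X = rng.normal(size=(n, p))
true_importance = np.array([3.0, 2.0, 1.0] + [0.0] * (p - 3))
logits = X @ true_importance
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A "non-interpretable" model: a small feed-forward neural network.
model = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
model.fit(X_tr, y_tr)

# Permutation feature importance (FI) on the test set.
fi = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
fi_scores = fi.importances_mean

# SHAP values via the model-agnostic KernelExplainer on a small sample,
# explaining the predicted probability of default.
background = X_tr[:100]
explainer = shap.KernelExplainer(lambda x: model.predict_proba(x)[:, 1], background)
shap_values = explainer.shap_values(X_te[:200])
shap_scores = np.abs(shap_values).mean(axis=0)  # mean |SHAP| per feature

# Compare each technique's importance ranking with the ranking implied
# by the ground truth we chose when generating the data.
print("FI   vs truth (Spearman):", spearmanr(fi_scores, true_importance).correlation)
print("SHAP vs truth (Spearman):", spearmanr(shap_scores, true_importance).correlation)
```

In this sketch, the closer a technique's ranking of features is to the ranking implied by the weights used to generate the data, the more "accurate" its explanations are judged to be, which is the logic the abstract describes for comparing SHAP and permutation FI.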