Modelling Interaction Effects by Using Extended WOE Variables with Applications to Credit Scoring

The term credit scoring refers to the application of formal statistical tools to support or automate loan-issuing decision-making processes. One of the most widespread methodologies for credit scoring consists of fitting logistic regression models on WOE explanatory variables, whi...

Bibliographic Details
Main Authors: Carlos Giner-Baixauli, Juan Tinguaro Rodríguez, Alejandro Álvaro-Meca, Daniel Vélez
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Mathematics
Subjects: regression; discretization; explainability; scorecards
Online Access: https://www.mdpi.com/2227-7390/9/16/1903
id doaj-85b921c7397f4936a482974ab0da202f
record_format Article
spelling doaj-85b921c7397f4936a482974ab0da202f; 2021-08-26T14:02:09Z; eng; MDPI AG; Mathematics; 2227-7390; 2021-08-01; 9; 1903; 1903; 10.3390/math9161903
Modelling Interaction Effects by Using Extended WOE Variables with Applications to Credit Scoring
Carlos Giner-Baixauli (0): Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain
Juan Tinguaro Rodríguez (1): Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain
Alejandro Álvaro-Meca (2): Department of Preventive Medicine and Public Health, Universidad Rey Juan Carlos, 28922 Madrid, Spain
Daniel Vélez (3): Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain
https://www.mdpi.com/2227-7390/9/16/1903
regression; discretization; explainability; scorecards
collection DOAJ
language English
format Article
sources DOAJ
author Carlos Giner-Baixauli
Juan Tinguaro Rodríguez
Alejandro Álvaro-Meca
Daniel Vélez
title Modelling Interaction Effects by Using Extended WOE Variables with Applications to Credit Scoring
publisher MDPI AG
series Mathematics
issn 2227-7390
publishDate 2021-08-01
description The term credit scoring refers to the application of formal statistical tools to support or automate loan-issuing decision-making processes. One of the most widespread methodologies for credit scoring consists of fitting logistic regression models on Weight of Evidence (WOE) explanatory variables, which are obtained by discretizing the original inputs by means of classification trees. However, this WOE-based methodology has difficulty modelling interactions between explanatory variables. In this paper, an extension of the WOE-based methodology for credit scoring is proposed that constructs a new kind of WOE variable devised to capture interaction effects. In particular, these new WOE variables are obtained through the simultaneous discretization of pairs of explanatory variables in a single classification tree. Moreover, the proposed extension can be complemented, as usual, by balance scorecards, which make it possible to explain, from the fitted logistic models, why individual loans are granted or not. Such explainability of loan decisions is essential for credit scoring, all the more so given recent legal developments such as the European Union’s GDPR. An extensive computational study shows the feasibility of the proposed approach, which also improves the predictive capability of the standard WOE-based methodology.
topic regression
discretization
explainability
scorecards
url https://www.mdpi.com/2227-7390/9/16/1903
work_keys_str_mv AT carlosginerbaixauli modellinginteractioneffectsbyusingextendedwoevariableswithapplicationstocreditscoring
AT juantinguarorodriguez modellinginteractioneffectsbyusingextendedwoevariableswithapplicationstocreditscoring
AT alejandroalvaromeca modellinginteractioneffectsbyusingextendedwoevariableswithapplicationstocreditscoring
AT danielvelez modellinginteractioneffectsbyusingextendedwoevariableswithapplicationstocreditscoring
_version_ 1721191836655026176
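
The idea summarized in the abstract, a single WOE variable built from the joint discretization of a pair of inputs, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy loans, the variable names, the hand-written `joint_leaf` rule that stands in for the leaves of a fitted classification tree, and the sign convention WOE = ln(share of non-defaults in the bin / share of defaults in the bin), which is one common choice.

```python
import math
from collections import Counter

# Toy loans: (age, income in thousands, default flag). Invented for illustration.
LOANS = [
    (22, 15, 1), (25, 18, 1), (27, 12, 1), (24, 19, 0),
    (23, 40, 0), (28, 35, 0), (26, 50, 1), (29, 45, 0),
    (45, 15, 0), (50, 60, 0), (38, 22, 1), (60, 30, 0),
]


def joint_leaf(age, income):
    """Hypothetical stand-in for the leaves of one tree grown on (age, income)."""
    if age >= 30:
        return "age>=30"
    return "age<30,income<20" if income < 20 else "age<30,income>=20"


def woe_by_bin(bins, defaults):
    """WOE per bin: ln(share of non-defaults in bin / share of defaults in bin).

    Assumes every bin contains at least one default and one non-default;
    otherwise the ratio is undefined and this sketch raises an error.
    """
    goods = Counter(b for b, d in zip(bins, defaults) if d == 0)
    bads = Counter(b for b, d in zip(bins, defaults) if d == 1)
    n_good, n_bad = sum(goods.values()), sum(bads.values())
    return {b: math.log((goods[b] / n_good) / (bads[b] / n_bad)) for b in goods}


bins = [joint_leaf(age, income) for age, income, _ in LOANS]
defaults = [d for _, _, d in LOANS]
woe = woe_by_bin(bins, defaults)

# The extended WOE variable: one numeric value per loan, driven by both inputs.
woe_age_income = [woe[b] for b in bins]

for label in sorted(woe):
    print(f"{label:20s} {woe[label]:+.3f}")
```

In this toy data, low income goes with a higher default share only among applicants under 30, so the jointly discretized variable takes a negative value for that group and the same positive value for the other two bins. A column such as `woe_age_income` would then enter the logistic regression like any other WOE variable; the fitting step is omitted here.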