Regularization of Autoencoders for Bank Client Profiling Based on Financial Transactions

Predicting whether a client should be given a loan (credit scoring) is one of the most essential and popular problems in banking. Predictive models for this goal are built on the assumption that there is a dependency between the client’s profile before the loan approval and their future behavior. However,...

Bibliographic Details
Main Authors: Andrey Filchenkov, Natalia Khanzhina, Arina Tsai, Ivan Smetannikov
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Risks
Subjects: clustering, autoencoder, regularization, neural networks, machine learning, credit scoring
Online Access: https://www.mdpi.com/2227-9091/9/3/54
id doaj-1a575d90d6064326a0151a393d471bd9
record_format Article
spelling doaj-1a575d90d6064326a0151a393d471bd9
2021-03-18T00:06:52Z
eng
MDPI AG
Risks
2227-9091
2021-03-01
9
54
54
10.3390/risks9030054
Regularization of Autoencoders for Bank Client Profiling Based on Financial Transactions
Andrey Filchenkov (Machine Learning Lab, ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russia)
Natalia Khanzhina (Machine Learning Lab, ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russia)
Arina Tsai (Computer Technologies Department, Formerly ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russia)
Ivan Smetannikov (Machine Learning Lab, ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russia)
collection DOAJ
language English
format Article
sources DOAJ
author Andrey Filchenkov
Natalia Khanzhina
Arina Tsai
Ivan Smetannikov
title Regularization of Autoencoders for Bank Client Profiling Based on Financial Transactions
publisher MDPI AG
series Risks
issn 2227-9091
publishDate 2021-03-01
description Predicting whether a client should be given a loan (credit scoring) is one of the most essential and popular problems in banking. Predictive models for this goal are built on the assumption that there is a dependency between the client’s profile before the loan approval and their future behavior. However, circumstances that cause changes in the client’s behavior may not depend on their will and cannot be predicted from their profile. Such clients may be considered “noisy”, as their eventual membership in the defaulter class results from random factors rather than from predictable rules. Excluding such clients from the dataset may help in building more accurate predictive models. In this paper, we report preliminary results of testing the hypothesis that a client can become a defaulter in two scenarios: intentionally and unintentionally. We verify our hypothesis by applying data-driven regularized classification, using an autoencoder, to client profiles. To model intention as a hidden variable, we propose a specially designed regularizer for the autoencoder. The regularizer aims to obtain a representation of defaulters that includes a cluster of intentional defaulters, with unintentional defaulters as outliers. The outliers were detected by our model and excluded from the dataset. This improved the credit scoring model and confirmed our hypothesis.
topic clustering
autoencoder
regularization
neural networks
machine learning
credit scoring
url https://www.mdpi.com/2227-9091/9/3/54