Constructing a Credit Risk Assessment Model using Synthetic Minority Over-sampling Technique

Bibliographic Details
Main Authors: Yi-Hsien Lin, 林宜憲
Other Authors: 張永佳
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/11786273799598686385
Description
Summary: Master's thesis === National Chiao Tung University === Department of Industrial Engineering and Management === 100 === The main source of revenue for financial institutions is the interest they charge their customers. Because not every customer repays their debt, financial institutions need risk assessment models to measure this credit risk. Class imbalance is common in financial risk data: one class (the majority class) significantly outnumbers the others (the minority class). A model trained on imbalanced data may classify majority-class instances accurately yet identify minority-class instances poorly. Most risk assessment models use sampling to deal with class imbalance. However, sampling can compromise data integrity, and the resulting model is sensitive to the particular sample drawn, which can produce inaccurate results. This study constructs a risk model using the Synthetic Minority Over-sampling Technique (SMOTE) to tackle the class imbalance problem. The proposed model not only addresses the loss of data integrity but also improves the poor predictive ability for the minority class, thereby improving overall model accuracy. Finally, the study compares its classification results with those of several sampling methods and a previous Granular Computing model. Comparing accuracy, AUC, and G-means, we conclude that risk models built with SMOTE perform as well as or better than the sampling methods and the Granular Computing model.
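
The record does not give the thesis's implementation, so the following is only a minimal sketch of the general SMOTE idea and of the G-mean metric named in the summary. It assumes numeric features, Euclidean distance, and binary labels with 1 as the minority class; the function names `smote` and `g_mean`, the parameters `k` and `seed`, and the toy data are all hypothetical and chosen for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    pos = sum(t == positive for t, _ in pairs)
    neg = len(pairs) - pos
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return math.sqrt((tp / pos) * (tn / neg))


# Hypothetical toy data: four minority-class points on the unit square.
X_min = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
X_syn = smote(X_min, n_new=6, k=2)
score = g_mean([1, 1, 0, 0], [1, 0, 0, 0])
```

Because the synthetic points are interpolated rather than duplicated or discarded, no original record is dropped, which is the data-integrity concern the summary raises about plain sampling. G-mean is low whenever either class is poorly recognized, so it is less flattering than plain accuracy on imbalanced data.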