A Study of Applying a Modified Logistic Regression on Credit Scoring

Bibliographic Details
Main Authors: Yu-Fang Wu, 吳毓芳
Other Authors: Sung-Shun Weng
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/jx2rpa
Description
Summary: Master's thesis === National Taipei University of Technology (國立臺北科技大學) === Master's Program, Department of Information and Finance Management === 104 === For financial institutions, credit scoring has become an important means by which banks assess whether customers are likely to become delinquent on their payments. With the liberalization of financial markets and the rapid growth of Internet services, banks now operate in a financial big data environment, and how to use scientific methods to handle and analyze large amounts of data has become a new challenge for them. This study explores how a modified logistic regression can be applied to credit scoring problems. The method combines logistic regression with the stochastic gradient descent algorithm to optimize the objective function. The combined method can help banks minimize customer credit risk across very large data sets and construct an objective credit scoring model. In addition, the study compares the proposed method with standard logistic regression analysis to determine which classification method produces the better credit scoring model. In a Hadoop cloud computing environment, the results show that the modified logistic regression algorithm effectively improves classification accuracy. With both the original attributes and the filtered attributes, the proposed algorithm outperforms logistic regression, and in both cases the credit scoring prediction models reach an accuracy rate of 86%. The modified logistic regression models are also effective in reducing Type I and Type II errors, and they require less modeling time.
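Illustrative note: the combination of logistic regression with stochastic gradient descent described in the summary can be sketched roughly as follows. This is a minimal single-machine sketch in plain Python, not the thesis's modified algorithm or its Hadoop implementation. The function names (`train_sgd`, `predict`, `error_rates`), the learning rate, the epoch count, and the toy data are all hypothetical. The sketch also assumes that "delinquent" is the positive class, that a Type I error is a false positive, and that a Type II error is a false negative; the thesis may use the opposite convention.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_sgd(X, y, lr=0.5, epochs=500, seed=0):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            err = sigmoid(score) - y[i]  # log-loss gradient w.r.t. the score
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def predict(w, b, x):
    # Classify as delinquent (1) when the estimated probability is >= 0.5.
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


def error_rates(y_true, y_pred):
    """Return (accuracy, type_i_rate, type_ii_rate).

    Assumed convention: 1 = delinquent is the positive class,
    Type I = false positive rate, Type II = false negative rate.
    """
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    type_i = fp / negatives if negatives else 0.0
    type_ii = fn / positives if positives else 0.0
    return accuracy, type_i, type_ii


# Hypothetical toy data: two scaled features per customer, label 1 = delinquent.
X_toy = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y_toy = [0, 0, 0, 1, 1, 1]

w_fit, b_fit = train_sgd(X_toy, y_toy)
y_hat = [predict(w_fit, b_fit, x) for x in X_toy]
toy_accuracy, toy_type_i, toy_type_ii = error_rates(y_toy, y_hat)
print(toy_accuracy, toy_type_i, toy_type_ii)
```

Because each update touches only one sample, this style of training can stream through large data sets without holding them in memory, which is one plausible reason for pairing it with a distributed environment such as the one the summary mentions.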