Summary: | Master's === National Chengchi University === Department of Money and Banking === 106 === This study builds default prediction models for loans on a P2P lending platform using both traditional and machine learning methods, and compares their performance. The data come from the database published by Lending Club, the largest P2P lending platform in the United States. We first review recent research on P2P loan default factors and inspect the correlations between candidate factors to select the independent variables for logistic regression. We then establish four logistic regression models based on the input characteristics. For the machine learning approach, the neural network has four control variables: the batch size, the number of training iterations, the number of hidden layers, and the number of neurons per hidden layer. We search for the best hyper-parameter combination by varying one or two of these variables at a time. The optimal combination uses tanh as the activation function, a batch size of 70, one hidden layer with 8 neurons, and at least 200 training iterations. Finally, we compare the logistic regression, neural network, and support vector machine models using statistical tests and find that the prediction accuracy of the neural network model is significantly higher than that of the other two.
|