Automated construction of generalized additive neural networks for predictive data mining / Jan Valentine du Toit

In this thesis Generalized Additive Neural Networks (GANNs) are studied in the context of predictive Data Mining. A GANN is a novel neural network implementation of a Generalized Additive Model. Originally GANNs were constructed interactively by considering partial residual plots. This methodology i...

Full description

Bibliographic Details
Main Author: Du Toit, Jan Valentine
Published: North-West University 2008
Subjects:
AIC
GAM
SBC
Online Access:http://hdl.handle.net/10394/128
Description
Summary:In this thesis Generalized Additive Neural Networks (GANNs) are studied in the context of predictive Data Mining. A GANN is a novel neural network implementation of a Generalized Additive Model. Originally GANNs were constructed interactively by considering partial residual plots. This methodology involves subjective human judgment, is time consuming, and can result in suboptimal results. The newly developed automated construction algorithm solves these difficulties by performing model selection based on an objective model selection criterion. Partial residual plots are only utilized after the best model is found to gain insight into the relationships between inputs and the target. Models are organized in a search tree with a greedy search procedure that identifies good models in a relatively short time. The automated construction algorithm, implemented in the powerful SAS® language, is nontrivial, effective, and comparable to other model selection methodologies found in the literature. This implementation, which is called AutoGANN, has a simple, intuitive, and user-friendly interface. The AutoGANN system is further extended with an approximation to Bayesian Model Averaging. This technique accounts for uncertainty about the variables that must be included in the model and uncertainty about the model structure. Model averaging utilizes in-sample model selection criteria and creates a combined model with better predictive ability than using any single model. In the field of Credit Scoring, the standard theory of scorecard building is not tampered with, but a pre-processing step is introduced to arrive at a more accurate scorecard that discriminates better between good and bad applicants. The pre-processing step exploits GANN models to achieve significant reductions in marginal and cumulative bad rates. The time it takes to develop a scorecard may be reduced by utilizing the automated construction algorithm. === Thesis (Ph.D. (Computer Science))--North-West University, Potchefstroom Campus, 2006.