In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates


Bibliographic Details
Main Author: Braun, Alina
Format: Others
Language: en
Published: 2021
Online Access: https://tuprints.ulb.tu-darmstadt.de/19052/1/Dissertation_BraunAlina_genehmigt.pdf
Braun, Alina <http://tuprints.ulb.tu-darmstadt.de/view/person/Braun=3AAlina=3A=3A.html> (2021): In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates. (Publisher's Version) Darmstadt, Technische Universität, DOI: 10.26083/tuprints-00019052 <https://doi.org/10.26083/tuprints-00019052>, [Ph.D. Thesis]
id ndltd-tu-darmstadt.de-oai-tuprints.ulb.tu-darmstadt.de-19052
record_format oai_dc
spelling ndltd-tu-darmstadt.de-oai-tuprints.ulb.tu-darmstadt.de-190522021-08-12T05:13:40Z http://tuprints.ulb.tu-darmstadt.de/19052/ In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates Braun, Alina In theory, recent results in nonparametric regression show that neural network estimates are able to achieve good rates of convergence provided suitable assumptions on the structure of the regression function are imposed. However, these theoretical analyses cannot explain the practical success of neural networks, since the theoretically studied estimates are defined by minimizing the empirical L_2 risk over a class of neural networks, and in practice solving this kind of minimization problem is not feasible. Consequently, the neural networks examined in theory cannot be implemented as they are defined. This means that the neural networks used in applications differ from the ones that are analyzed theoretically. In this thesis we narrow the gap between theory and practice. We deal with neural network regression estimates for (p,C)-smooth regression functions m that satisfy a projection pursuit model. We construct three implementable neural network estimates and show that each of them achieves, up to a logarithmic factor, the optimal univariate rate of convergence. Firstly, for univariate regression functions with p contained in [1/2,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The starting weights are chosen randomly from an interval, independently of the data; the interval is large enough to guarantee that the estimate is close to a piecewise constant approximation. Secondly, for multivariate regression functions with p contained in (0,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The initial weights are chosen from specific intervals depending on the data and on the projection directions; this choice guarantees that the estimate is close to a piecewise constant approximation. The projection directions are repeatedly chosen at random. Lastly, for multivariate regression functions with p>0 we construct a multilayer neural network estimate. The values of the inner weights are prescribed, depending on the projection directions, by a new result on the approximation of a projection pursuit model by piecewise polynomials. The outer weights are chosen by solving a linear equation system. The projection directions are again repeatedly chosen at random. Since the rates of convergence we show are independent of the dimension of the data, our second and third estimates circumvent the curse of dimensionality. 2021 Ph.D. Thesis NonPeerReviewed text CC-BY-NC-ND 4.0 International - Creative Commons, Attribution Non-commercial, No-derivatives https://tuprints.ulb.tu-darmstadt.de/19052/1/Dissertation_BraunAlina_genehmigt.pdf Braun, Alina <http://tuprints.ulb.tu-darmstadt.de/view/person/Braun=3AAlina=3A=3A.html> (2021): In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates. (Publisher's Version) Darmstadt, Technische Universität, DOI: 10.26083/tuprints-00019052 <https://doi.org/10.26083/tuprints-00019052>, [Ph.D. Thesis] https://doi.org/10.26083/tuprints-00019052 en info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/openAccess
collection NDLTD
language en
format Others
sources NDLTD
description In theory, recent results in nonparametric regression show that neural network estimates are able to achieve good rates of convergence provided suitable assumptions on the structure of the regression function are imposed. However, these theoretical analyses cannot explain the practical success of neural networks, since the theoretically studied estimates are defined by minimizing the empirical L_2 risk over a class of neural networks, and in practice solving this kind of minimization problem is not feasible. Consequently, the neural networks examined in theory cannot be implemented as they are defined. This means that the neural networks used in applications differ from the ones that are analyzed theoretically. In this thesis we narrow the gap between theory and practice. We deal with neural network regression estimates for (p,C)-smooth regression functions m that satisfy a projection pursuit model. We construct three implementable neural network estimates and show that each of them achieves, up to a logarithmic factor, the optimal univariate rate of convergence. Firstly, for univariate regression functions with p contained in [1/2,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The starting weights are chosen randomly from an interval, independently of the data; the interval is large enough to guarantee that the estimate is close to a piecewise constant approximation. Secondly, for multivariate regression functions with p contained in (0,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The initial weights are chosen from specific intervals depending on the data and on the projection directions; this choice guarantees that the estimate is close to a piecewise constant approximation. The projection directions are repeatedly chosen at random. Lastly, for multivariate regression functions with p>0 we construct a multilayer neural network estimate. The values of the inner weights are prescribed, depending on the projection directions, by a new result on the approximation of a projection pursuit model by piecewise polynomials. The outer weights are chosen by solving a linear equation system. The projection directions are again repeatedly chosen at random. Since the rates of convergence we show are independent of the dimension of the data, our second and third estimates circumvent the curse of dimensionality.
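
As an informal illustration of two computational ingredients described in the abstract, the following Python sketch fits a one-hidden-layer network f(x) = sum_k a_k * sigma(b_k * x + c_k) by gradient descent on the empirical L_2 risk, starting from random weights drawn from a fixed, data-independent interval (in the spirit of the first estimate), and also computes outer weights by solving a linear least squares system for prescribed inner weights (in the spirit of the third estimate). The network width, step size, iteration count and initialization range are illustrative assumptions, not values from the thesis, and the sketch omits the projection directions and the piecewise polynomial construction.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def shallow_net(x, a, b, c):
    # one hidden layer: f(x) = sum_k a_k * sigmoid(b_k * x + c_k), univariate x
    return sigmoid(np.outer(x, b) + c) @ a

def fit_by_gradient_descent(x, y, K=20, steps=5000, lr=0.01, init_range=10.0, seed=0):
    # Sketch of the first estimate: weights start at random values drawn from a
    # fixed interval chosen independently of the data, then all weights are
    # updated by gradient descent on the empirical L_2 risk
    # (1/n) * sum_i (f(x_i) - y_i)^2.
    rng = np.random.default_rng(seed)
    a = rng.uniform(-init_range, init_range, K)
    b = rng.uniform(-init_range, init_range, K)
    c = rng.uniform(-init_range, init_range, K)
    n = len(x)
    for _ in range(steps):
        h = sigmoid(np.outer(x, b) + c)      # n x K hidden activations
        r = h @ a - y                        # residuals f(x_i) - y_i
        dh = h * (1.0 - h)                   # derivative of the sigmoid
        grad_a = 2.0 / n * h.T @ r
        grad_b = 2.0 / n * (dh * np.outer(r, a)).T @ x
        grad_c = 2.0 / n * (dh * np.outer(r, a)).sum(axis=0)
        a -= lr * grad_a
        b -= lr * grad_b
        c -= lr * grad_c
    return a, b, c

def fit_outer_weights(x, y, b, c):
    # Sketch of the third estimate's last step: for prescribed inner weights
    # (b, c), the outer weights minimize the empirical L_2 risk, i.e. they
    # solve a linear least squares system in the hidden-layer activations.
    h = sigmoid(np.outer(x, b) + c)
    a, *_ = np.linalg.lstsq(h, y, rcond=None)
    return a

# toy usage on simulated univariate data
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(200)
a, b, c = fit_by_gradient_descent(x, y)
print("empirical L_2 risk after gradient descent:",
      np.mean((shallow_net(x, a, b, c) - y) ** 2))
a_ls = fit_outer_weights(x, y, b, c)
print("empirical L_2 risk with least squares outer weights:",
      np.mean((shallow_net(x, a_ls, b, c) - y) ** 2))

Solving for the outer weights by least squares is a standard way to fit the output layer once the inner weights are fixed; in the thesis the inner weights of the multilayer estimate are instead prescribed via the piecewise polynomial approximation result, and the projection directions are drawn repeatedly at random.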
author Braun, Alina
spellingShingle Braun, Alina
In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
author_facet Braun, Alina
author_sort Braun, Alina
title In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_short In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_full In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_fullStr In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_full_unstemmed In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_sort in theory and practice - on the rate of convergence of implementable neural network regression estimates
publishDate 2021
url https://tuprints.ulb.tu-darmstadt.de/19052/1/Dissertation_BraunAlina_genehmigt.pdf
Braun, Alina <http://tuprints.ulb.tu-darmstadt.de/view/person/Braun=3AAlina=3A=3A.html> (2021): In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates. (Publisher's Version) Darmstadt, Technische Universität, DOI: 10.26083/tuprints-00019052 <https://doi.org/10.26083/tuprints-00019052>, [Ph.D. Thesis]
work_keys_str_mv AT braunalina intheoryandpracticeontherateofconvergenceofimplementableneuralnetworkregressionestimates
_version_ 1719459833258704896