In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates


Bibliographic Details
Main Author: Braun, Alina
Format: Others
Language: en
Published: 2021
Online Access: https://tuprints.ulb.tu-darmstadt.de/19052/1/Dissertation_BraunAlina_genehmigt.pdf
Braun, Alina <http://tuprints.ulb.tu-darmstadt.de/view/person/Braun=3AAlina=3A=3A.html> (2021): In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates. (Publisher's Version) Darmstadt, Technische Universität, DOI: 10.26083/tuprints-00019052 <https://doi.org/10.26083/tuprints-00019052>, [Ph.D. Thesis]
id ndltd-tu-darmstadt.de-oai-tuprints.ulb.tu-darmstadt.de-19052
record_format oai_dc
spelling ndltd-tu-darmstadt.de-oai-tuprints.ulb.tu-darmstadt.de-190522021-08-12T05:13:40Z http://tuprints.ulb.tu-darmstadt.de/19052/ In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates Braun, Alina In theory, recent results in nonparametric regression show that neural network estimates are able to achieve good rates of convergence provided suitable assumptions on the structure of the regression function are imposed. However, these theoretical analyses cannot explain the practical success of neural networks, since the theoretically studied estimates are defined by minimizing the empirical L_2 risk over a class of neural networks, and in practice solving this kind of minimization problem is not feasible. Consequently, the neural networks examined in theory cannot be implemented as they are defined. This means that the neural networks used in applications differ from the ones that are analyzed theoretically. In this thesis we narrow the gap between theory and practice. We deal with neural network regression estimates for (p,C)-smooth regression functions m that satisfy a projection pursuit model. We construct three implementable neural network estimates and show that each of them achieves, up to a logarithmic factor, the optimal univariate rate of convergence. Firstly, for univariate regression functions with p contained in [1/2,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The starting weights are chosen randomly from an interval, independently of the data; the interval is large enough to guarantee that the estimate is close to a piecewise constant approximation. Secondly, for multivariate regression functions with p contained in (0,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The initial weights are chosen from specific intervals depending on the data and on the projection directions; this choice guarantees that the estimate is close to a piecewise constant approximation. The projection directions are repeatedly chosen at random. Lastly, for multivariate regression functions with p>0 we construct a multilayer neural network estimate. The values of the inner weights are prescribed, depending on the projection directions, by a new result on the approximation of a projection pursuit model by piecewise polynomials. The outer weights are chosen by solving a linear equation system. The projection directions are again repeatedly chosen at random. Since the rates of convergence we show are independent of the dimension of the data, our second and third estimates circumvent the curse of dimensionality. 2021 Ph.D. Thesis NonPeerReviewed text CC-BY-NC-ND 4.0 International - Creative Commons, Attribution Non-commercial, No-derivatives https://tuprints.ulb.tu-darmstadt.de/19052/1/Dissertation_BraunAlina_genehmigt.pdf Braun, Alina <http://tuprints.ulb.tu-darmstadt.de/view/person/Braun=3AAlina=3A=3A.html> (2021): In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates. (Publisher's Version) Darmstadt, Technische Universität, DOI: 10.26083/tuprints-00019052 <https://doi.org/10.26083/tuprints-00019052>, [Ph.D. Thesis] https://doi.org/10.26083/tuprints-00019052 en info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/openAccess
collection NDLTD
language en
format Others
sources NDLTD
description In theory, recent results in nonparametric regression show that neural network estimates are able to achieve good rates of convergence provided suitable assumptions on the structure of the regression function are imposed. However, these theoretical analyses cannot explain the practical success of neural networks, since the theoretically studied estimates are defined by minimizing the empirical L_2 risk over a class of neural networks, and in practice solving this kind of minimization problem is not feasible. Consequently, the neural networks examined in theory cannot be implemented as they are defined. This means that the neural networks used in applications differ from the ones that are analyzed theoretically. In this thesis we narrow the gap between theory and practice. We deal with neural network regression estimates for (p,C)-smooth regression functions m that satisfy a projection pursuit model. We construct three implementable neural network estimates and show that each of them achieves, up to a logarithmic factor, the optimal univariate rate of convergence. Firstly, for univariate regression functions with p contained in [1/2,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The starting weights are chosen randomly from an interval, independently of the data; the interval is large enough to guarantee that the estimate is close to a piecewise constant approximation. Secondly, for multivariate regression functions with p contained in (0,1] we construct a neural network estimate with one hidden layer whose weights are learned via gradient descent. The initial weights are chosen from specific intervals depending on the data and on the projection directions; this choice guarantees that the estimate is close to a piecewise constant approximation. The projection directions are repeatedly chosen at random. Lastly, for multivariate regression functions with p>0 we construct a multilayer neural network estimate. The values of the inner weights are prescribed, depending on the projection directions, by a new result on the approximation of a projection pursuit model by piecewise polynomials. The outer weights are chosen by solving a linear equation system. The projection directions are again repeatedly chosen at random. Since the rates of convergence we show are independent of the dimension of the data, our second and third estimates circumvent the curse of dimensionality.
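
As an informal illustration of two computational ingredients described in the abstract, the following Python sketch fits a one-hidden-layer network f(x) = sum_k a_k * sigma(b_k * x + c_k) by gradient descent on the empirical L_2 risk, starting from random weights drawn from a fixed, data-independent interval (in the spirit of the first estimate), and also computes outer weights by solving a linear least squares system for prescribed inner weights (in the spirit of the third estimate). The network width, step size, iteration count and initialization range are illustrative assumptions, not values from the thesis, and the sketch omits the projection directions and the piecewise polynomial construction.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def shallow_net(x, a, b, c):
    # one hidden layer: f(x) = sum_k a_k * sigmoid(b_k * x + c_k), univariate x
    return sigmoid(np.outer(x, b) + c) @ a

def fit_by_gradient_descent(x, y, K=20, steps=5000, lr=0.01, init_range=10.0, seed=0):
    # Sketch of the first estimate: weights start at random values drawn from a
    # fixed interval chosen independently of the data, then all weights are
    # updated by gradient descent on the empirical L_2 risk
    # (1/n) * sum_i (f(x_i) - y_i)^2.
    rng = np.random.default_rng(seed)
    a = rng.uniform(-init_range, init_range, K)
    b = rng.uniform(-init_range, init_range, K)
    c = rng.uniform(-init_range, init_range, K)
    n = len(x)
    for _ in range(steps):
        h = sigmoid(np.outer(x, b) + c)      # n x K hidden activations
        r = h @ a - y                        # residuals f(x_i) - y_i
        dh = h * (1.0 - h)                   # derivative of the sigmoid
        grad_a = 2.0 / n * h.T @ r
        grad_b = 2.0 / n * (dh * np.outer(r, a)).T @ x
        grad_c = 2.0 / n * (dh * np.outer(r, a)).sum(axis=0)
        a -= lr * grad_a
        b -= lr * grad_b
        c -= lr * grad_c
    return a, b, c

def fit_outer_weights(x, y, b, c):
    # Sketch of the third estimate's last step: for prescribed inner weights
    # (b, c), the outer weights minimize the empirical L_2 risk, i.e. they
    # solve a linear least squares system in the hidden-layer activations.
    h = sigmoid(np.outer(x, b) + c)
    a, *_ = np.linalg.lstsq(h, y, rcond=None)
    return a

# toy usage on simulated univariate data
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(200)
a, b, c = fit_by_gradient_descent(x, y)
print("empirical L_2 risk after gradient descent:",
      np.mean((shallow_net(x, a, b, c) - y) ** 2))
a_ls = fit_outer_weights(x, y, b, c)
print("empirical L_2 risk with least squares outer weights:",
      np.mean((shallow_net(x, a_ls, b, c) - y) ** 2))

Solving for the outer weights by least squares is a standard way to fit the output layer once the inner weights are fixed; in the thesis the inner weights of the multilayer estimate are instead prescribed via the piecewise polynomial approximation result, and the projection directions are drawn repeatedly at random.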
author Braun, Alina
spellingShingle Braun, Alina
In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
author_facet Braun, Alina
author_sort Braun, Alina
title In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_short In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_full In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_fullStr In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_full_unstemmed In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates
title_sort in theory and practice - on the rate of convergence of implementable neural network regression estimates
publishDate 2021
url https://tuprints.ulb.tu-darmstadt.de/19052/1/Dissertation_BraunAlina_genehmigt.pdf
Braun, Alina <http://tuprints.ulb.tu-darmstadt.de/view/person/Braun=3AAlina=3A=3A.html> (2021): In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates. (Publisher's Version) Darmstadt, Technische Universität, DOI: 10.26083/tuprints-00019052 <https://doi.org/10.26083/tuprints-00019052>, [Ph.D. Thesis]
work_keys_str_mv AT braunalina intheoryandpracticeontherateofconvergenceofimplementableneuralnetworkregressionestimates
_version_ 1719459833258704896