A Mathematical Framework on Machine Learning: Theory and Application

This dissertation addresses the machine learning research topics outlined below. We develop theory for traditional first-order algorithms from convex optimization and provide new insights into the nonconvex objective functions that arise in machine learning. Based on this theoretical analysis, we design and develop new algorithms that overcome the difficulties posed by nonconvex objectives and accelerate convergence to the desired result. The thesis answers two questions: (1) how should the step size be designed for gradient descent with random initialization, and (2) can current convex optimization algorithms be accelerated and extended to nonconvex objectives? As an application, we apply these optimization algorithms to sparse subspace clustering; a new algorithm, CoCoSSC, is proposed that improves the existing sample complexity guarantees in the presence of noise and missing entries.
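To make question (1) concrete, below is a minimal Python sketch of gradient descent with random initialization on a toy nonconvex objective. The objective f(x, y) = x^4/4 - x^2/2 + y^2/2, the Gaussian initialization, and the fixed step size 0.1 are illustrative choices made here; they are not the step-size rule or the test problems studied in the dissertation.

    import numpy as np

    def gradient_descent(grad, x0, step_size, n_iters=1000):
        # Plain gradient descent: x_{k+1} = x_k - eta * grad(x_k).
        x = np.array(x0, dtype=float)
        for _ in range(n_iters):
            x = x - step_size * grad(x)
        return x

    # Toy nonconvex objective f(x, y) = x**4/4 - x**2/2 + y**2/2.
    # The origin is a strict saddle point; the minimizers are (+1, 0) and (-1, 0).
    def grad_f(v):
        x, y = v
        return np.array([x**3 - x, y])

    rng = np.random.default_rng(0)
    x0 = rng.standard_normal(2)   # random initialization
    eta = 0.1                     # fixed step size, chosen here only for illustration
    x_final = gradient_descent(grad_f, x0, eta)
    print(x_final)                # typically lands near (+1, 0) or (-1, 0), not the saddle

Even on this toy problem, the step size governs both stability and how quickly the iterates move away from the saddle at the origin, which is the tension behind question (1).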

Gradient-based optimization methods have increasingly been modeled and interpreted through ordinary differential equations (ODEs). Existing ODEs in the literature are, however, inadequate to distinguish between two fundamentally different methods: Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method. We derive high-resolution ODEs that serve as more accurate surrogates for these two methods, as well as for Nesterov's accelerated gradient method for general convex functions (NAG-C). These ODEs fit into a general framework that allows a fine-grained analysis of the discrete optimization algorithms by translating properties of the more amenable ODEs into properties of their discrete counterparts. As a first application of this framework, we identify a term, referred to as the gradient correction, that is present in NAG-SC but absent from the heavy-ball method, shedding light on why the former achieves acceleration while the latter does not. Moreover, within this high-resolution ODE framework, NAG-C is shown to drive the squared gradient norm to zero at an inverse cubic rate, the sharpest rate known for NAG-C. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that maintain the same accelerated convergence rates as NAG-C for minimizing convex functions.
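For orientation, the high-resolution ODEs contrasted above take roughly the following form. This is a sketch following the standard presentation of this line of work; s denotes the step size and μ the strong-convexity parameter (notation introduced here for the sketch), and the exact constants should be checked against the dissertation itself.

    \[
    \ddot{X} + 2\sqrt{\mu}\,\dot{X} + \sqrt{s}\,\nabla^2 f(X)\,\dot{X} + \bigl(1 + \sqrt{\mu s}\bigr)\nabla f(X) = 0 \quad \text{(NAG-SC)}
    \]
    \[
    \ddot{X} + 2\sqrt{\mu}\,\dot{X} + \bigl(1 + \sqrt{\mu s}\bigr)\nabla f(X) = 0 \quad \text{(heavy-ball)}
    \]

The Hessian-driven damping term \(\sqrt{s}\,\nabla^2 f(X)\,\dot{X}\) is the gradient correction: it is the only difference between the two equations, and it vanishes in the low-resolution limit s → 0, which is why earlier ODE models could not separate the two methods. The inverse cubic rate for NAG-C asserts, up to problem-dependent constants, that \(\min_{0 \le i \le k} \|\nabla f(x_i)\|^2 = O(1/k^3)\).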

Bibliographic Details
Main Author: Shi, Bin
Format: Others
Published: FIU Digital Commons 2018
Subjects: artificial intelligence; robotics; numerical analysis; computation; operational research; ordinary differential equations; theory; algorithms; Artificial Intelligence and Robotics; Numerical Analysis and Computation; Operational Research; Ordinary Differential Equations and Applied Dynamics; Theory and Algorithms
Online Access: https://digitalcommons.fiu.edu/etd/3876
https://digitalcommons.fiu.edu/cgi/viewcontent.cgi?article=5199&context=etd