Gradient optimization algorithms with fast convergence

The most common class of methods for solving linear systems is the class of gradient algorithms, the most famous of which being the steepest descent (SD) algorithm. Linear systems equivalent to the quadratic optimization problems where the coefficient matrix of the linear system is the Hessian of th...

Full description

Bibliographic Details
Main Author: Fathi, Bayda
Published: Cardiff University 2013
Subjects:
518
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.577926
Description
Summary:The most common class of methods for solving linear systems is the class of gradient algorithms, the most famous of which being the steepest descent (SD) algorithm. Linear systems equivalent to the quadratic optimization problems where the coefficient matrix of the linear system is the Hessian of the corresponding quadratic function. In recent years, the development of a particular gradient algorithm, namely the Barzilai-Borwein (BB) algorithm, has instigated a lot of research in the area of optimization and many algorithms now exist which have faster rates of convergence than those possessed by the steepest descent algorithm. In this thesis, some well known gradient algorithms (which are originally developed for solving linear symmetric problems) are applied to non-symmetric systems of equations. Their efficiency is demonstrated through an implementation on a constructed a class of random slightly non-symmetric matrices E (with positive and real eigenvalues) as well as on the symmetric part of E, ES = 1 2 (E +ET ), the performance of the algorithm is then investigated. Also several gradient algorithms are created that have a better performance than the Cauchy- Barzilai-Borwein (CBB) method for a non-symmetric problem when applied to another constructed class of random slightly non-symmetric matrices A (having eigenvalues with a positive real part). Numerically, it is established that the asymptotic rate of convergence of these algorithms is faster when the roots of their polynomials have a Beta(2,2) distribution rather than a uniform distribution. In fact, there is a strong dependence of the asymptotic rate of convergence on the distribution of the spectrum, this has been proven from numerical results. Most of the created algorithms achieve faster convergence than other methods, especially with the non-clustered distribution of the spectrum. The modified Cauchy- Barzilai-Borwein (MCBB) algorithm is the first created algorithm which splits the two equal polynomial roots of the CBB algorithm into two different roots at each iteration. This means, moving the roots of the polynomial of CBB method from the real-axis to the complex plane where the eigenvalues of the objective matrix would lie. To attain further improvement on the convergence rates, another gradient algorithm is proposed by using theMCBB method or CBB method at each iteration depending on two parameters g and x . Furthermore, some different methods are then created by utilizing different step-sizes of gradient algorithms which already exist for finding the solution of symmetric problems. The effectiveness of the constructed algorithms, which depend on several parameters, have been verified by various numerical experiments and comparisons with other approaches. The motivation for choosing different gradient methods stems from the fact that simple examples can be constructed in which each of these gradient methods will show superiority over the other. Most of these algorithms have faster rates of convergence than CBB for non-symmetric matrices but slower for symmetric matrices. A new efficient gradient-based method to solve non-symmetric linear systems of equations is presented. The simplicity of the new method, guaranteed convergent property, along with the miniii imum memory requirement where it has almost the same number of gradient evaluations as the CBB method per iteration. This improved method is very attractive compared to alternative methods for solving slightly non-symmetric linear problems. Efficiency and feasibility of the presented method are supported by numerical experiments.