Frameworks for High Dimensional Convex Optimization


Bibliographic Details
Main Author: London, Palma Alise den Nijs
Format: Others
Language: en
Published: 2021
Online Access: https://thesis.library.caltech.edu/13856/1/london_palma_2020.pdf
London, Palma Alise den Nijs (2021) Frameworks for High Dimensional Convex Optimization. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/db29-am33. https://resolver.caltech.edu/CaltechTHESIS:08162020-233437139
Description
Summary:

We present novel, efficient algorithms for solving extremely large optimization problems. A significant bottleneck today is that, as datasets grow, researchers across disciplines want to solve optimization problems that are prohibitively large. In this thesis, we present methods to compress optimization problems. The general goal is to represent a huge problem as a smaller problem or a set of smaller problems, while still retaining enough information to ensure provable guarantees on solution quality and run time. We apply this approach to the following three settings.

First, we propose a framework for accelerating both linear program solvers and convex solvers for problems with linear constraints. Our focus is on a class of problems for which data are either very costly or hard to obtain. In these situations, the number of available data points, m, is much smaller than the number of variables, n. In a machine learning setting, this regime is increasingly prevalent, since it is often advantageous to consider larger and larger feature spaces without necessarily obtaining proportionally more data. Analytically, we provide worst-case guarantees on both the runtime and the quality of the solution produced. Empirically, we show that our framework speeds up state-of-the-art commercial solvers by two orders of magnitude while maintaining a near-optimal solution.

Second, we propose a novel approach for distributed optimization that uses far fewer messages than existing methods. We consider a setting in which the problem data are distributed over the nodes. We provide worst-case guarantees on both the amount of communication the algorithm requires and the quality of its solution. The algorithm uses O(log(n+m)) messages with high probability. We note that this is an exponential reduction compared to the O(n) communication required during each round of traditional consensus-based approaches. In terms of solution quality, our algorithm produces a feasible, near-optimal solution. Numerical results demonstrate that the approximation error matches that of ADMM in many cases, while using orders of magnitude less communication.

Lastly, we propose and analyze a provably accurate long-step infeasible interior point method (IPM) for linear programming. The core computational bottleneck in IPMs is the need to solve a linear system of equations at each iteration. We employ sketching techniques to reduce the cost of solving this linear system, addressing the well-known ill-conditioning that arises when iterative solvers are used in IPMs for LPs. In particular, we propose a preconditioned Conjugate Gradient iterative solver for the linear system. Our sketching strategy makes the condition number of the preconditioned system provably small. In practice, we demonstrate that our approach significantly reduces the condition number of the linear system, and thus allows for more efficient solving on a range of benchmark datasets.
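The sketching idea in the last paragraph can be illustrated with a small NumPy example. This is only a sketch under stated assumptions, not the algorithm from the thesis: it assumes the linear system has the normal-equations form (A D^2 A^T) x = b with a diagonal scaling D, and it swaps in a plain Gaussian sketch with a QR-based preconditioner. The function names (`sketched_factor`, `pcg`, `demo`), the problem sizes, and the synthetic scaling vector `d` are all hypothetical choices made for illustration.

```python
import numpy as np


def sketched_factor(AD, sketch_size, rng):
    """Return an upper-triangular R with R^T R = (AD W)(AD W)^T.

    W is a Gaussian sketching matrix (an assumption made for this example),
    so R^T R approximates the normal matrix AD @ AD.T.
    """
    m, n = AD.shape
    W = rng.standard_normal((n, sketch_size)) / np.sqrt(sketch_size)
    # QR of the (sketch_size x m) matrix (AD W)^T; only the R factor is needed.
    return np.linalg.qr((AD @ W).T, mode="r")


def pcg(matvec, b, apply_Minv, tol=1e-8, max_iter=500):
    """Standard preconditioned Conjugate Gradient for an SPD system."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


def demo(seed=0, m=20, n=2000, sketch_size=200):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))  # m constraints, n variables, m << n
    d = np.ones(n)
    d[: m // 2] = 1e3  # a few huge scalings, a synthetic stand-in for IPM scaling
    AD = A * d  # scales column j of A by d[j], i.e. A @ diag(d)
    N = AD @ AD.T  # normal matrix A D^2 A^T

    R = sketched_factor(AD, sketch_size, rng)
    R_inv = np.linalg.inv(R)
    cond_raw = np.linalg.cond(N)
    cond_pre = np.linalg.cond(R_inv.T @ N @ R_inv)

    b = N @ rng.standard_normal(m)

    def apply_Minv(r):
        # Applies (R^T R)^{-1} to r via two solves with the m x m factor.
        return np.linalg.solve(R, np.linalg.solve(R.T, r))

    x, iters = pcg(lambda v: AD @ (AD.T @ v), b, apply_Minv)
    rel_residual = np.linalg.norm(N @ x - b) / np.linalg.norm(b)
    return {
        "cond_raw": float(cond_raw),
        "cond_pre": float(cond_pre),
        "iters": int(iters),
        "rel_residual": float(rel_residual),
    }


if __name__ == "__main__":
    print(demo())
```

Calling `demo()` returns the condition number of the normal matrix before and after preconditioning, the number of Conjugate Gradient iterations, and the relative residual. In this toy setup the sketch size is ten times m, which is a convenient choice for the example rather than a recommendation from the thesis.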