Dual Free Adaptive Minibatch SDCA for Empirical Risk Minimization
In this paper we develop an adaptive dual free Stochastic Dual Coordinate Ascent (adfSDCA) algorithm for regularized empirical risk minimization problems. This is motivated by the recent work on dual free SDCA of Shalev-Shwartz [1]. The novelty of our approach is that the coordinates to update at ea...
Main Authors: | Xi He, Rachael Tappenden, Martin Takáč |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2018-07-01 |
Series: | Frontiers in Applied Mathematics and Statistics |
Subjects: | SDCA; importance sampling; non-uniform sampling; mini-batch; adaptive |
Online Access: | https://www.frontiersin.org/article/10.3389/fams.2018.00033/full |
id |
doaj-77814a7c16e249a789dc03c3d28af204 |
---|---|
record_format |
Article |
spelling |
doaj-77814a7c16e249a789dc03c3d28af204; 2020-11-25T02:08:31Z; eng; Frontiers Media S.A.; Frontiers in Applied Mathematics and Statistics; 2297-4687; 2018-07-01; 4; 10.3389/fams.2018.00033; 348053; Dual Free Adaptive Minibatch SDCA for Empirical Risk Minimization; Xi He (Industrial and Systems Engineering, Lehigh University, Bethlehem, PA, United States); Rachael Tappenden (School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand); Martin Takáč (Industrial and Systems Engineering, Lehigh University, Bethlehem, PA, United States) |
collection |
DOAJ |
sources |
DOAJ |
author |
Xi He Rachael Tappenden Martin Takáč |
title |
Dual Free Adaptive Minibatch SDCA for Empirical Risk Minimization |
issn |
2297-4687 |
description |
In this paper we develop an adaptive dual free Stochastic Dual Coordinate Ascent (adfSDCA) algorithm for regularized empirical risk minimization problems. This is motivated by the recent work on dual free SDCA of Shalev-Shwartz [1]. The novelty of our approach is that the coordinates to update at each iteration are selected non-uniformly from an adaptive probability distribution, and this extends the previously mentioned work which only allowed for a uniform selection of “dual” coordinates from a fixed probability distribution. We describe an efficient iterative procedure for generating the non-uniform samples, where the scheme selects the coordinate with the greatest potential to decrease the sub-optimality of the current iterate. We also propose a heuristic variant of adfSDCA that is more aggressive than the standard approach. Furthermore, in order to utilize multi-core machines we consider a mini-batch adfSDCA algorithm and develop complexity results that guarantee the algorithm's convergence. The work is concluded with several numerical experiments to demonstrate the practical benefits of the proposed approach. |
topic |
SDCA importance sampling non-uniform sampling mini-batch adaptive |
url |
https://www.frontiersin.org/article/10.3389/fams.2018.00033/full |
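The sampling idea in the description (draw the "dual" coordinate from an adaptive, non-uniform distribution that favours coordinates with the largest potential to reduce sub-optimality) can be illustrated with a small sketch. This is not the authors' adfSDCA implementation. It assumes ridge regression (squared loss), sampling weights proportional to |kappa_i| * sqrt(||x_i||^2 + n*lam), and a per-coordinate step that is exact for squared loss; the function name and all parameters are hypothetical, and the step rule is a simplification rather than the paper's step-size rule.

```python
import numpy as np


def adaptive_dual_free_sdca(X, y, lam, n_iters=2000, seed=0):
    """Illustrative sketch only: ridge regression with residual-based sampling.

    Minimises (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2 + (lam/2) * ||w||^2 while
    keeping pseudo-dual variables alpha with w = X^T alpha / (lam * n).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    v = np.einsum("ij,ij->i", X, X)  # squared row norms ||x_i||^2

    for _ in range(n_iters):
        # "Dual residue": alpha_i plus the loss derivative at x_i^T w.
        # It is zero for every i exactly at the optimum.
        kappa = alpha + (X @ w - y)

        # Assumed adaptive distribution: larger residue => sampled more often.
        weights = np.abs(kappa) * np.sqrt(v + n * lam)
        total = weights.sum()
        if total <= 1e-12:
            break  # all residues are numerically zero
        i = rng.choice(n, p=weights / total)

        # Per-coordinate step that zeroes kappa_i for squared loss
        # (a simplification chosen for this sketch).
        step = n * lam / (v[i] + n * lam)
        alpha[i] -= step * kappa[i]
        w -= step * kappa[i] * X[i] / (n * lam)  # keeps w = X^T alpha / (lam n)

    return w, alpha
```

Recomputing every residue at each iteration costs O(nd) and is done here only for clarity; the description mentions an efficient iterative procedure for generating the samples, which this sketch does not attempt to reproduce.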