Final exam - Accelerating Convex Optimization in Machine Learning by Leveraging Functional Growth Conditions

May 7, 2019 - 2:30pm
2390 UCC

PhD Candidate: Yi Xu


In recent years, unprecedented growths in scale and dimensionality of data raise big computational challenges for traditional optimization algorithms; thus it becomes very important to develop efficient and effective optimization algorithms for solving numerous machine learning problems. Many traditional algorithms (e.g., gradient descent method) are black-box algorithms, which are simple to implement but ignore the underlying geometrical property of the objective function. Recent trend in accelerating these traditional black-box algorithms is to leverage geometrical properties of the objective function such as strong convexity. However, most existing methods rely too much on the knowledge of strong convexity, which makes them not applicable to problems without strong convexity or without knowledge of strong convexity. To bridge the gap between traditional black-box algorithms without knowing the problem’s geometrical property and accelerated algorithms under strong convexity, how can we develop adaptive algorithms that can be adaptive to the objective function’s underlying geometrical property? To answer this question, in this dissertation we focus on convex optimization problems and propose to explore an error bound condition that characterizes the functional growth condition of the objective function around a global minimum. 

We first considered stochastic optimization problems with general stochastic loss and proposed two accelerated stochastic subgradient (ASSG) methods with improved theoretical guarantees. Second, we developed a new theory of alternating direction method of multipliers (ADMM) with a new adaptive scheme of the penalty parameter for both deterministic and stochastic optimization problems with structured regularizers. With LEB condition, the proposed deterministic and stochastic ADMM enjoy improved iteration complexities of O (1/ε^{1−θ}) and O (1/ε^{2(1−θ)}) respectively. Then, we considered a family of optimization problems with an explicit max-structural loss. We developed a novel homotopy smoothing (HOPS) algorithm that employs Nesterov’s smoothing technique and accelerated gradient method and runs in stages. Under a generic condition so-called local error bound (LEB) condition, it can improve the iteration complexity of O(1/ε) to O (1/ε^{1−θ})  omitting a logarithmic factor with θ ∈ (0,1]. Next, we proposed new restarted stochastic primal-dual (RSPD) algorithms for solving the problem with stochastic explicit max-structural loss. We successfully got a better iteration complexity than O(1/ε^2) without bilinear structure assumption, which is a big challenge of obtaining faster convergence for the considered problem. Finally, we consider finite-sum optimization problems with smooth loss and simple regularizer. We proposed novel techniques to automatically search for the unknown parameter on the fly of optimization while maintaining almost the same convergence rate as an oracle setting assuming the involved parameter is given. 

Advisor: Tianbao Yang