September 13, 2018
Abstract: At one point or another, most of us have had (or will have) an encounter with a continuous (read: non-combinatorial) optimization problem, and we'd like to have an algorithm to solve it. For the fortunate ones, this problem is convex (or close enough); for the less fortunate ones, it isn't. This talk is intended to serve as a cheat sheet for the lucky ones.
The source of good fortune in convex optimization is the popular adage that "gradient descent always converges". We will start by dispelling this myth and showing that the accompanying fine print actually matters. We will discuss the convergence rate (read: runtime) of gradient descent algorithms on various classes of problems, their bottlenecks, when they break down, and what to replace them with when they do. If there is time, we will also briefly talk about how to handle the memory issues that arise in high-dimensional problems, and what the associated trade-offs are.
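To make the "fine print" concrete, here is a minimal Python sketch (an illustrative assumption, not material from the talk itself): on the 1-smooth quadratic f(x) = x^2/2, fixed-step gradient descent converges only when the step size is below 2/L, and diverges otherwise.

```python
# Minimal illustration of the step-size "fine print" for gradient descent.
# On f(x) = x^2 / 2 (gradient f'(x) = x, smoothness constant L = 1), a fixed
# step size below 2/L contracts the iterates; above 2/L they blow up.

def gradient_descent(grad, x0, step_size, num_iters=50):
    """Run fixed-step gradient descent and return the final iterate."""
    x = x0
    for _ in range(num_iters):
        x = x - step_size * grad(x)
    return x

grad = lambda x: x  # gradient of f(x) = x^2 / 2, so L = 1

print(gradient_descent(grad, x0=1.0, step_size=0.5))  # converges: x shrinks by 0.5 each step
print(gradient_descent(grad, x0=1.0, step_size=2.5))  # diverges: |x| grows by 1.5 each step
```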
This talk is not intended to be exhaustive, comprehensive (or even serious), and will probably contain several anecdotes of personal frustration. You have been warned.