Suppose, given a convex function $f: \bR^d \to \bR$, we would like to find the minimum of $f$ by iterating \begin{align*} \theta_t = \theta_{t-1} - \gamma \nabla f(\theta_{t-1}). \end{align*} How fast do we converge to the minima of $f$?
on probability and optimization
Suppose, given a convex function $f: \bR^d \to \bR$, we would like to find the minimum of $f$ by iterating \begin{align*} \theta_t = \theta_{t-1} - \gamma \nabla f(\theta_{t-1}). \end{align*} How fast do we converge to the minima of $f$?