Convergence rate of gradient descent for convex functions

Suppose, given a convex function $f: \bR^d \to \bR$, we would like to find the minimum of $f$ by iterating \begin{align*} \theta_t = \theta_{t-1} - \gamma \nabla f(\theta_{t-1}). \end{align*} How fast do we converge to the minima of $f$?


The Poisson estimator

Let's say you want to estimate a quantity $\mu$, but you have only access to unbiased estimates of its logarithm, i.e., $\log\mu$. Can you obtain an unbiased estimate of $\mu$?


A primer on filtering

Say that you have a dynamical process of interest $X_1,\ldots,X_n$ and you can only observe the process with some noise, i.e., you get an observation sequence $Y_1,\ldots,Y_n$. What is the optimal way to estimate $X_n$ conditioned on the whole sequence of observations $Y_{1:n}$?


A simple bound for optimisation using a grid

If I give you a function on $[0,1]$ and a computer and want you to find the minimum, what would you do? Since you have the computer, you can be lazy: Just compute a grid on $[0,1]$, evaluate the grid points and take the minimum. This will give you something close to the true minimum. But how much?


An $L_2$ bound for Perfect Monte Carlo

Suppose that you sample from a probability measure $\pi$ to estimate the expectation $\pi(f) := \int f(x) \pi(\mbox{d}x)$ and formed an estimate $\pi^N(f)$. How close are you to the true expectation $\pi(f)$?


Tinkering around logistic map

I was tinkering around logistic map $x_{n+1} = a x_n (1 - x_n)$ today and I wondered what happens if I plot the histogram of the generated sequence $(x_n)_{n\geq 0}$. Can it possess some statistical properties?


Monte Carlo as Intuition

Suppose we have a continuous random variable $X \sim p(x)$ and we would like to estimate its tail probability, i.e. the probability of the event $\{X \geq t\}$ for some $t \in \mathbb{R}$. What is the most intuitive way to do this?


Fisher's Identity

Fisher's identity is useful to use in maximum-likelihood parameter estimation problems. In this post, I give its proof. The main reference is Douc, Moulines, Stoffer; Nonlinear time series theory, methods and applications.


Batch MLE for the GARCH(1,1) model


In this post, we derive the batch MLE procedure for the GARCH model in a more principled way than the last GARCH post. The derivation presented here is simple and concise.


Fatou's lemma and monotone convergence theorem

In this post, we deduce Fatou's lemma and monotone convergence theorem (MCT) from each other.