## 2019/06/14

### The Poisson estimator

Let's say you want to estimate a quantity $\mu$, but you have only access to unbiased estimates of its logarithm, i.e., $\log\mu$. Can you obtain an unbiased estimate of $\mu$?

There used to be a quite nice blog post online for the estimator solving this problem, but it is now in a weird shape, not compiling the Latex code. Recently I was discussing this estimator with friends and realized that, at the moment, there is no quick explanation of it on the web. So I decided to write one.

Let's say, as said in the blurb, you want to estimate a quantity but only have access to unbiased estimates of its $\log$. A typical situation is that you have a quantity of the form: \begin{align*} \mu(x) = \exp{\left(-\frac{1}{n} \sum_{i=1}^n g_i(x)\right)}. \end{align*} If $n$ is large, computing this might be prohibitive, but you can get cheap unbiased estimates of $\log \mu(x)$ by subsampling the functions and computing the average on a mini-batch. However, an unbiased estimate of $\log\mu$ is not enough to obtain an unbiased estimate of $\mu$. Let's denote this unbiased estimate $\widehat{\log\mu}$ and we naturally have $\bE[\widehat{\log\mu}] = \log \mu$. Next, note that \begin{align*} \bE\left[ e^{\widehat{\log\mu}} \right] \geq e^{\bE\left[ \widehat{\log\mu} \right]} = \mu, \end{align*} by Jensen's inequality. Therefore, naively computing the unbiased estimate of $\widehat{\log\mu}$ and exponentiating it will not produce the right result. Is there a way around?

It turns out that there is -- provided that you can obtain many i.i.d unbiased estimates. So returning to the original example, let's say you can obtain i.i.d random variables $\lambda_j$ where $\bE[\lambda_j] = \log \mu$. Then, incredibly, the following estimator is an unbiased estimator of $\mu$: \begin{align*} \widehat{\mu} = e^{\delta} \prod^J_{j=1} \frac{\lambda_j}{\delta}, \quad \text{where} \quad J \sim \textsf{Poisson}(J;\delta) \end{align*} where $\delta > 0$ is a parameter of the estimator. This estimator is called as the Poisson estimator.

Why though? Let's prove a quick and sketchy result. First note $J \sim \textsf{Poisson}(J; \delta)$ has the density: \begin{align*} P(J = k;\delta) = \frac{\delta^k e^{-\delta}}{k!}. \end{align*} Now if we take the expectation of $\widehat{\mu}$: \begin{align*} \widehat{\bE[ \widehat{\mu}]} &= \sum_{k = 0}^\infty P(J = k;\delta) e^{\delta} \prod^k_{j=1} \frac{\lambda_j}{\delta}, \\ &= \sum_{k = 0}^\infty \frac{\delta^k e^{-\delta}}{k!} e^{\delta} \prod^k_{j=1} \frac{\lambda_j}{\delta}, \\ &= 1 + \sum_{k = 1}^\infty \frac{1}{k!} \prod^k_{j=1} \lambda_j. \end{align*} Note that $\widehat{\bE[ \widehat{\mu}]}$ is still a random quantity (hence the hat), since $\lambda_j$ are unbiased and stochastic estimates of $\log\mu$. Therefore, taking another expectation, we now obtain: \begin{align*} \bE[ \widehat{\mu}] = 1 + \sum_{k = 1}^\infty \frac{\lambda^k}{k!}, \end{align*} where $\lambda = \bE[\lambda_j] = \log\mu$, since $\lambda_j$ are i.i.d. Now it suffices to observe that this is indeed the series representation of the exponential function, therefore: \begin{align*} \bE[ \widehat{\mu}] = 1 + \sum_{k = 1}^\infty \frac{\lambda^k}{k!} = \exp(\lambda) = \mu. \end{align*} Although this result is nice and produced some practical applications, this estimator exhibits a huge variance. A partial way around this is to introduce another variable $c \in \bR$ and define the estimator as: \begin{align} \widehat{\mu} = e^{\delta + c} \prod^J_{j=1} \frac{\lambda_j – c}{\delta}, \end{align} where $J \sim \textsf{Poisson}(J;\delta)$. You can check that this estimator is still unbiased. However, as mentioned in this blogpost, this estimator still exhibits a big variance, hence is not solving a lot of the problems we encounter in practice.