Introduction
Importance sampling
Suppose we have a random variable $X$ with density $\pi(x)$, and we wish to estimate expectations of the form,
\begin{align}
\bE_\pi(f(X)) = \int f(x) \pi(x) dx
\end{align}
Suppose also that drawing samples from this density is impossible. In importance sampling, our aim is to use another, tractable density, say $q(x)$, to estimate expectations under the target density $\pi(x)$.
We assume that we can draw samples from $q(x)$ easily. Then, to estimate expectations under $\pi$, we use the following trick,
\begin{align*}
\bE_\pi(f(X)) &= \int f(x) \pi(x) dx \\
&=\int f(x) \frac{\pi(x)}{q(x)} q(x) dx
\end{align*}
For this manipulation to be valid, we need an absolute continuity condition: $q(x) > 0$ wherever $\pi(x) > 0$, i.e., the support of $q(x)$ must cover the support of $\pi(x)$. We put,
\begin{align*}
W(x) = \frac{\pi(x)}{q(x)}
\end{align*}
$W(x)$ is called the weight function. Therefore, we write,
\begin{align*}
\bE_\pi(f(X)) &= \int f(x) \frac{\pi(x)}{q(x)} q(x) dx \\
&= \bE_q( f(X) W(X))
\end{align*}
Say we have $N$ samples $\{x^{(i)}\}_{i=1}^N$ such that $x^{(i)} \sim q(x)$. Then we can estimate the expectation as follows,
\begin{align*}
\bE_q( f(X) W(X)) \approx \frac{1}{N} \sum_{i=1}^N f(x^{(i)}) W(x^{(i)})
\end{align*}
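As a concrete illustration, here is a minimal NumPy sketch of this estimator. The target, proposal, and test function below are assumptions chosen for the example: the target $\pi$ is a standard normal, the proposal $q$ is a wider normal $\mathcal{N}(0, 2^2)$, and $f(x) = x^2$, so the true value $\bE_\pi(f(X)) = 1$ is known.

import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Target density pi(x): standard normal N(0, 1) (assumed for this example).
def pi_density(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

# Proposal density q(x): wider normal N(0, 2^2), easy to sample from.
sigma_q = 2.0
def q_density(x):
    return np.exp(-0.5 * (x / sigma_q)**2) / (sigma_q * np.sqrt(2.0 * np.pi))

# Draw samples from q and compute the weight function W(x) = pi(x) / q(x).
x = rng.normal(0.0, sigma_q, size=N)
W = pi_density(x) / q_density(x)

# Importance sampling estimate of E_pi[f(X)] with f(x) = x^2 (true value: 1).
estimate = np.mean(x**2 * W)
print(estimate)  # ~1.0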
Application to Bayesian statistics
In Bayesian statistics, a problem of interest is to estimate the normalising constant of an unnormalised probability distribution, and to compute expectations under it. Importance sampling is useful for both purposes.
Suppose we have,
\begin{align}
\pi(x) = \frac{1}{Z} \varphi(x)
\end{align}
then $Z = \int \varphi(x) dx$. Suppose again that we wish to estimate expectations of the form $\bE_\pi(f(X))$. Hence,
\begin{align*}
\bE_\pi(f(X)) &= \frac{1}{Z} \int f(x) \varphi(x) dx \\
&= \frac{1}{Z} \int f(x) \frac{\varphi(x)}{q(x)} q(x) dx \\
&= \frac{1}{Z} \bE_q(f(X) W(X))
\end{align*}
Note that here the weight function is computed with the unnormalised density, $W(x) = \frac{\varphi(x)}{q(x)}$, since $\pi(x)$ itself cannot be evaluated. We can also apply this trick to $Z$,
\begin{align*}
Z &= \int \varphi(x) dx \\
& = \int \frac{\varphi(x)}{q(x)} q(x) dx \\
& = \bE_q (W(X))
\end{align*}
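To make this concrete, here is a small sketch under assumptions chosen for illustration: we take the unnormalised density $\varphi(x) = e^{-x^2/2}$, whose true normalising constant is $Z = \sqrt{2\pi} \approx 2.5066$, and check that $\frac{1}{N}\sum_{i} W(x^{(i)})$ recovers it.

import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Unnormalised target phi(x) = exp(-x^2 / 2); its true normalising
# constant is sqrt(2*pi) (this choice of phi is an assumption for the demo).
def phi(x):
    return np.exp(-0.5 * x**2)

# Proposal density q(x): normal N(0, 2^2).
sigma_q = 2.0
def q_density(x):
    return np.exp(-0.5 * (x / sigma_q)**2) / (sigma_q * np.sqrt(2.0 * np.pi))

# Z = E_q[W(X)] with W(x) = phi(x) / q(x).
x = rng.normal(0.0, sigma_q, size=N)
W = phi(x) / q_density(x)
print(np.mean(W), np.sqrt(2.0 * np.pi))  # both ~2.5066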
Then the expectation $\bE_\pi(f(X))$ becomes,
\begin{align}
\bE_\pi(f(X)) = \frac{\bE_q(f(X) W(X))}{\bE_q (W(X))}
\end{align}
Since we can draw $N$ samples $\{x^{(i)}\}_{i=1}^N$ such that $x^{(i)} \sim q(x)$, we can also estimate this expectation as,
\begin{align}
\bE_\pi(f(X)) &\approx \frac{\frac{1}{N} \sum_{i=1}^N f(x^{(i)}) W(x^{(i)})}{\frac{1}{N} \sum_{j=1}^N W(x^{(j)})} \\
& = \frac{\sum_{i=1}^N f(x^{(i)}) W(x^{(i)})}{\sum_{j=1}^N W(x^{(j)})} \nonumber
\end{align}
The notion of normalised weights is heavily used in the literature. We noted that $W(x)$ is called the weight function; in this context, the values $W(x^{(i)})$ are called unnormalised weights. To normalise them, we rewrite the estimator above as,
\begin{align}
\bE_\pi(f(X)) &\approx \frac{f(x^{(1)}) W(x^{(1)}) + f(x^{(2)}) W(x^{(2)}) + \cdots + f(x^{(N)}) W(x^{(N)})}{\sum_{j=1}^N W(x^{(j)})} \nonumber \\
&= f(x^{(1)}) \frac{W(x^{(1)})}{\sum_{j=1}^N W(x^{(j)})} + f(x^{(2)}) \frac{W(x^{(2)})}{\sum_{j=1}^N W(x^{(j)})} + \cdots \nonumber \\
&+ f(x^{(N)}) \frac{W(x^{(N)})}{\sum_{j=1}^N W(x^{(j)})}
\end{align}
We put,
\begin{align*}
w^{(i)} = \frac{W(x^{(i)})}{\sum_{j=1}^N W(x^{(j)})}
\end{align*}
then the estimate of the expectation becomes,
\begin{align}
\bE_\pi(f(X)) &\approx \sum_{i = 1}^N w^{(i)} f(x^{(i)})
\end{align}
where the $w^{(i)}$ are called normalised importance weights.
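Putting the pieces together, here is a sketch of the full self-normalised estimator; the $\varphi$, $q$, and $f$ below are again illustrative assumptions. Note that $Z$ is never used: the normalised weights sum to one and absorb it.

import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Unnormalised target phi(x) = exp(-x^2 / 2), i.e. pi is N(0, 1)
# but we pretend its normalising constant Z is unknown.
def phi(x):
    return np.exp(-0.5 * x**2)

# Proposal density q(x): normal N(0, 2^2).
sigma_q = 2.0
def q_density(x):
    return np.exp(-0.5 * (x / sigma_q)**2) / (sigma_q * np.sqrt(2.0 * np.pi))

# Unnormalised weights W(x^{(i)}) = phi(x^{(i)}) / q(x^{(i)}).
x = rng.normal(0.0, sigma_q, size=N)
W = phi(x) / q_density(x)

# Normalised importance weights w^{(i)}; they sum to one.
w = W / np.sum(W)

# Self-normalised estimate of E_pi[f(X)] with f(x) = x^2 (true value: 1).
estimate = np.sum(w * x**2)
print(estimate)  # ~1.0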