The following Metropolis-Hastings exercise comes from a homework assignment in a course I assisted.

In this post, for tutorial purposes, we will design an independent MH sampler to localise a changepoint in a relatively short time series.

Suppose we are given a dataset consisting of $T = 50$ points, i.e. a time series. See the following figure.

There is a changepoint in the dataset at which the characteristics of the time series change. To make this concrete, let us define the changepoint $m \in \{1,\ldots,T\}$ and two intensities $\lambda_1, \lambda_2 \in \mathbb{R}_+$ for the two regimes. The generative model of the dataset is,
\begin{align*}
p(x_{1:T}|m,\lambda_1,\lambda_2) = \prod_{k=1}^T \mathcal{PO}(x_k; \lambda_1)^{[k < m]} \mathcal{PO}(x_k;\lambda_2)^{[k \geq m]}
\end{align*}
We can interpret the generative model as follows: for $k < m$, the datapoint is distributed as $X_k \sim \mathcal{PO}(\lambda_1)$; otherwise, $X_k \sim \mathcal{PO}(\lambda_2)$.
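To make the setup concrete, here is a minimal sketch of how such a dataset can be simulated in NumPy (the seed and variable names are our own; the parameter values $T = 50$, $\lambda_1 = 3$, $\lambda_2 = 7$, $m = 22$ are the ones used later in the post):

```python
import numpy as np

rng = np.random.default_rng(0)

# Values from the post: T = 50, lambda1 = 3, lambda2 = 7, changepoint m = 22
T, lam1, lam2, m = 50, 3.0, 7.0, 22

# x_k ~ PO(lambda1) for k < m and x_k ~ PO(lambda2) for k >= m (k is 1-indexed)
x = np.concatenate([rng.poisson(lam1, size=m - 1),
                    rng.poisson(lam2, size=T - m + 1)])
```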

## A Metropolis-Hastings sampler for inference

For simplicity, we will assume all parameters have flat priors. In other words, $m,\lambda_1,\lambda_2$ have flat priors which, in turn, allows us to make the following crucial observation:
\begin{align*}
p(\lambda_1,\lambda_2,m|x_{1:T}) \propto p(x_{1:T}|\lambda_1,\lambda_2,m)
\end{align*}
Let us set $\theta = (\lambda_1,\lambda_2,m)$. The MH sampler aims to draw samples $\theta^{(\tau)}$ from the posterior of the parameters. For now, denote the current sample by $\theta$ and the proposed sample by $\theta'$. To decide whether to accept or reject a proposal, we compute the acceptance probability,
\begin{align*}
\alpha(\theta \to \theta') = \min\left(1, \frac{p(\theta')}{p(\theta)} \frac{q(\theta|\theta')}{q(\theta'|\theta)}\right)
\end{align*}
What is $p(\theta)$? It is the posterior of the parameters. To compute the first ratio $p(\theta')/p(\theta)$, recall the observation we made using the flat priors: the posterior is proportional to the likelihood, so the ratio of posteriors is simply a ratio of likelihoods.
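A minimal sketch of this likelihood-ratio computation (function names are our own; we work in the log domain for numerical stability, since a product of 50 Poisson terms underflows easily):

```python
import numpy as np
from math import lgamma

def log_lik(x, m, lam1, lam2):
    """log p(x_{1:T} | m, lam1, lam2): Poisson(lam1) before the changepoint,
    Poisson(lam2) from k = m on (k is 1-indexed, so x[:m-1] is regime 1)."""
    def log_po(xs, lam):
        # log PO(x; lam) = x log(lam) - lam - log(x!)
        return float(sum(xk * np.log(lam) - lam - lgamma(xk + 1) for xk in xs))
    return log_po(x[:m - 1], lam1) + log_po(x[m - 1:], lam2)

# The ratio p(theta')/p(theta) is then
# exp(log_lik(x, m_new, l1_new, l2_new) - log_lik(x, m, l1, l2)).
```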

Once we have $\theta$, we have to propose $\theta'$. Here we'll design an independent MH sampler:
\begin{align*}
q(m',\lambda_1',\lambda_2' | m,\lambda_1,\lambda_2) = q_m(m'|m) q_1(\lambda_1') q_2(\lambda_2')
\end{align*}
We are free to choose $q_m$; a useful choice is,
\begin{align*}
q_m(m'|m) = \left\{
\begin{array}{ll}
1/2 & \mbox{if } m' = m+1\\
1/2 & \mbox{if } m' = m-1
\end{array}
\right.
\end{align*}
Since this proposal is symmetric (it is equally likely to move $m' \to m$ as $m \to m'$), it cancels out in the acceptance ratio. At the boundaries ($m = 1$ or $m = T = 50$) assume a circular structure: going backwards from $m = 1$ (to $m' = m - 1$) we arrive at $m' = 50$, and going forwards from $m = 50$ (to $m' = m + 1$) we arrive at $m' = 1$. For $\lambda_1$ and $\lambda_2$, use $\mathcal{E}(\rho)$ (the exponential distribution with parameter $\rho$) with $\rho = 5$. Note that, unlike $q_m$, these exponential proposals are drawn independently of the current state, so they do not cancel and their densities must be kept in the acceptance ratio.
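The proposal step can be sketched as follows (a sketch, not the homework's reference code; we read $\rho = 5$ as the *mean* of the exponential proposal, which NumPy calls the scale — if $\rho$ is meant as the rate, pass `1.0 / rho` instead):

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho = 50, 5.0

def propose(m):
    """One draw from q: a symmetric +/-1 step on m, circular on {1, ..., T},
    plus fresh exponential draws for lambda1, lambda2 (independent of the
    current values; rho is read as the proposal mean here)."""
    step = 1 if rng.random() < 0.5 else -1
    m_new = (m - 1 + step) % T + 1   # wraps 1 -> T and T -> 1
    return m_new, rng.exponential(rho), rng.exponential(rho)
```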

## Implementation

The above dataset was simulated with $\lambda_1 = 3$ and $\lambda_2 = 7$, and the true location of the changepoint was $m = 22$. In the following, we aim to estimate these parameters.

After warm-up iterations, we collect statistics from the sampled chain $\theta^{(k)} := \left(m^{(k)},\lambda_1^{(k)},\lambda_2^{(k)}\right)$. The quantities we want are:

1. *Posterior approximation of $p(m | x_{1:T})$:* to estimate the changepoint with full uncertainty. To this end, we collect the samples $m^{(k)}$ and compute their histogram. We obtain the following probability mass function for $m$.

2. *Sample means of $\lambda_1$, $\lambda_2$:* we collect the samples $\lambda_1^{(k)}$ and $\lambda_2^{(k)}$ and compute their means. The computed means are $\tilde{\lambda}_1 = 3.66$ and $\tilde{\lambda}_2 = 6.97$.
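Putting the pieces together, a self-contained sketch of the full sampler and the collected statistics might look like this (all names, the iteration counts, and the initialisation are our own choices; as before, $\rho = 5$ is read as the mean of the exponential proposal):

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(42)

def log_lik(x, m, lam1, lam2):
    """log p(x_{1:T} | m, lam1, lam2) for the Poisson changepoint model."""
    def log_po(xs, lam):
        return float(sum(xk * np.log(lam) - lam - lgamma(xk + 1) for xk in xs))
    return log_po(x[:m - 1], lam1) + log_po(x[m - 1:], lam2)

def mh_changepoint(x, n_iters=20_000, warmup=5_000, rho=5.0):
    T = len(x)
    m, l1, l2 = T // 2, 1.0, 1.0          # arbitrary initialisation
    ll = log_lik(x, m, l1, l2)
    chain = []
    for it in range(n_iters):
        # proposal: symmetric circular step on m, independent Exp draws for lambdas
        step = 1 if rng.random() < 0.5 else -1
        m_new = (m - 1 + step) % T + 1
        l1_new, l2_new = rng.exponential(rho), rng.exponential(rho)
        ll_new = log_lik(x, m_new, l1_new, l2_new)
        # flat priors: posterior ratio = likelihood ratio; the q_m part cancels,
        # but the independent exponential proposals do not
        log_q = ((l1_new + l2_new) - (l1 + l2)) / rho
        if np.log(rng.random()) < ll_new - ll + log_q:
            m, l1, l2, ll = m_new, l1_new, l2_new, ll_new
        if it >= warmup:
            chain.append((m, l1, l2))
    return np.array(chain)

# Simulate the post's dataset and run the sampler
T, m_true = 50, 22
x = np.concatenate([rng.poisson(3.0, m_true - 1), rng.poisson(7.0, T - m_true + 1)])
chain = mh_changepoint(x)

# 1. the histogram of the m-samples approximates p(m | x_{1:T})
m_samples = chain[:, 0].astype(int)
p_m = np.bincount(m_samples, minlength=T + 1)[1:] / len(m_samples)
# 2. sample means of the intensities
lam1_mean, lam2_mean = chain[:, 1].mean(), chain[:, 2].mean()
```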
