Bayes In the Real World: The Metropolis-Hastings Algorithm

It can be shown that, as the chain runs, the list of samples $[x_0, x_1, x_2, \dots]$ asymptotically matches samples drawn from $P^*(x)$: in the long run, the algorithm samples from exactly the distribution we are after.

Let me take a second to break this last step down and give some motivation for why this algorithm converges. Feel free to skip this part if the math seems hairy.

We always move to a proposed point that is more likely, and only occasionally move to one that is less likely. Imagine we have two points $x$ and $y$: over a long time, roughly how often will $x$ and $y$ appear in our list of samples relative to each other? For the sake of argument, we will assume that $P^*(x) > P^*(y)$.

The chance we go from $x$ to $y$ is $Q_x(y) \frac{P^*(y)}{P^*(x)}$. It's the chance we propose $y$ multiplied by the chance we then accept $y$, which is $\alpha = \frac{P^*(y)}{P^*(x)}$.
The chance we go from $y$ to $x$ is $Q_y(x) = Q_x(y)$, because $x$ is the more likely point, so once we propose $x$ we accept it with 100% probability. We're assuming that $Q$ is symmetric.
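
To see where these two chances lead, suppose the chain has settled down, so that it crosses from $x$ to $y$ about as often as it crosses back. If $N_x$ and $N_y$ are the number of times each point appears in the list, that balance reads $N_x \, Q_x(y) \frac{P^*(y)}{P^*(x)} \approx N_y \, Q_x(y)$, which rearranges to $\frac{N_x}{N_y} \approx \frac{P^*(x)}{P^*(y)}$. The small simulation below is only a sketch to check that ratio numerically. It assumes a toy world with just two points, a proposal that always suggests the other point (so $Q$ is trivially symmetric), and made-up values $P^*(x) = 3$ and $P^*(y) = 1$. The function name `metropolis_two_state` and all of these numbers are mine, chosen for illustration.

```python
import random


def metropolis_two_state(p_star, n_steps, seed=0):
    """Run a Metropolis chain on two states, "x" and "y".

    p_star maps each state to its unnormalized probability P*(state).
    The proposal always suggests the other state, so Q is symmetric.
    Returns how many times each state appears in the list of samples.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    current = "x"
    counts = {"x": 0, "y": 0}
    for _ in range(n_steps):
        proposal = "y" if current == "x" else "x"
        # Acceptance probability: 1 for a more likely point,
        # P*(proposal) / P*(current) for a less likely one.
        alpha = min(1.0, p_star[proposal] / p_star[current])
        if rng.random() < alpha:
            current = proposal
        # A rejected proposal still records the current state again.
        counts[current] += 1
    return counts


# Illustrative values only: x is three times as likely as y.
p_star = {"x": 3.0, "y": 1.0}
counts = metropolis_two_state(p_star, n_steps=200_000)
visit_ratio = counts["x"] / counts["y"]
print(visit_ratio)  # expect a value near P*(x) / P*(y) = 3, up to sampling noise
```

Note that a rejected move still adds the current point to the list again. Those repeats are what let the more likely point pile up the extra samples it is owed.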
