Notes on MCMC

MCMC (Markov Chain Monte Carlo) is a way to sample numerically from a posterior distribution of interest by constructing a Markov chain whose stationary distribution matches the desired posterior.
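To make "stationary distribution matches the posterior" concrete, here is the condition written out. The notation is an assumption for this sketch: $T(\theta' \mid \theta)$ is the transition kernel of the chain and $\pi$ is its stationary distribution.

$$\pi(\theta') = \int T(\theta' \mid \theta)\,\pi(\theta)\,d\theta, \qquad \pi(\theta) = p(\theta \mid x).$$

In words: if $\theta_t$ is already distributed according to the posterior, then so is $\theta_{t+1}$.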

The way I see it, a Markov chain is a search over the parameter space. The goal is to visit values of the parameter $\theta$ with frequency proportional to the desired posterior $p(\theta|x)$, rather than to find a single best $\theta$.

To have a concrete picture, it’s useful to treat the parameter space (or should I say distribution space, which emphasizes that the space is invariant to reparametrization?) as a discrete space.

Let’s say we partition the parameter space into countably many blocks, and we start from a certain block at time 0.
Suppose at time $t$ we are in block $x$ (here $x$ labels a block, not the data), and denote the probability of this event by $P(\theta_t \in x)$.

How should we move?

  • we should move toward regions of the parameter space with higher probability under the target distribution. We don’t know this distribution exactly, but we do know the likelihood and the prior. So although we cannot compute the posterior itself, we can still check whether the next point is more probable than the current one by evaluating $\frac{p(\theta_{t+1})p(x|\theta_{t+1})}{p(\theta_{t})p(x|\theta_{t})}$; the normalizing constant $p(x)$ cancels when you take the ratio.

  • we should avoid the regions with lower probability with respect to the target distribution.
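The ratio in the first bullet can be sketched in a few lines. Everything specific here is an assumption for illustration: a toy coin-flip model with a Beta(2, 2) prior, and the hypothetical helper names `log_prior`, `log_likelihood`, and `posterior_ratio`. Working in logs is just a way to avoid underflow.

```python
import math

# Toy data (assumed for illustration): 7 heads in 10 coin flips.
HEADS, FLIPS = 7, 10

def log_prior(theta, a=2.0, b=2.0):
    """Log of a Beta(a, b) prior density, without its normalizing constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)

def log_likelihood(theta, heads=HEADS, flips=FLIPS):
    """Log-likelihood of the flips; the binomial coefficient is dropped
    because it does not depend on theta and cancels in the ratio."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)

def posterior_ratio(theta_next, theta_curr):
    """Return p(theta_next | x) / p(theta_curr | x).

    Only prior * likelihood is evaluated; the evidence p(x) never appears.
    theta_curr is assumed to lie inside (0, 1).
    """
    log_num = log_prior(theta_next) + log_likelihood(theta_next)
    log_den = log_prior(theta_curr) + log_likelihood(theta_curr)
    return math.exp(log_num - log_den)

# A ratio above 1 means the candidate point is more probable than the current one.
ratio = posterior_ratio(0.7, 0.5)
```

A ratio above 1 says the candidate sits in a higher-probability region; a ratio below 1 says it sits in a lower one.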

Transition matrix $T(x|y)$ $\iff$ proposal distribution with pdf $g(x|x')$ defined for all $x, x' \in \mathcal{X}$.
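In the discrete-block picture, one way to see how a proposal turns into a transition matrix is the Metropolis-Hastings acceptance rule. These notes have not introduced that rule, so treat it as an assumption of this sketch, along with the three-block target and the uniform proposal. Here `T[y][x]` plays the role of $T(x|y)$ and `proposal[y][x]` the role of $g(x|y)$; the weights are unnormalized, as with prior times likelihood.

```python
def metropolis_transition_matrix(weights, proposal):
    """Build T[y][x], the probability of moving from block y to block x.

    weights[x]     : unnormalized target probability of block x (must be > 0)
    proposal[y][x] : probability of proposing block x when currently at y
    """
    n = len(weights)
    T = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            if x == y or proposal[y][x] == 0.0:
                continue
            # Acceptance ratio: only ratios of weights appear, so the
            # normalizing constant of the target is never needed.
            ratio = (weights[x] * proposal[x][y]) / (weights[y] * proposal[y][x])
            T[y][x] = proposal[y][x] * min(1.0, ratio)
        # Rejected proposals leave the chain where it is.
        T[y][y] = 1.0 - sum(T[y])
    return T

def step_distribution(dist, T):
    """Push a distribution over blocks through one step of the chain."""
    n = len(dist)
    return [sum(dist[y] * T[y][x] for y in range(n)) for x in range(n)]

# Hypothetical three-block example with a uniform proposal over the other blocks.
weights = [1.0, 3.0, 6.0]
proposal = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
T = metropolis_transition_matrix(weights, proposal)

# The normalized weights should be left unchanged by one step of the chain.
target = [w / sum(weights) for w in weights]
after_one_step = step_distribution(target, T)
```

Comparing `after_one_step` with `target` checks the stationarity property directly: starting from the target distribution, one step of the chain should return the same distribution.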

Reference