Sampling: Rejection and Importance

2021-04-25 | Modified: 2021-04-26 | Statistics

We ended the introduction to Monte Carlo simulations with its main limitation: slow convergence.

In this post we will cover Rejection and Importance sampling.

For a density function \(\pi(x)\), how can we sample from it?

Rejection Sampling

The idea is to sample from a proposal distribution \(q\) that envelopes \(\pi\), that is, \(\pi(x) \leq M \cdot q(x)\) for all \(x\) and some constant \(M\). The algorithm is a two-step process (a Python sketch follows the steps):

Rejection Sampling


  1. Sample \(X \sim q\) and sample \(U \sim U[0, 1]\).
  2. If \(U < \frac{\pi(X)}{Mq(X)}\)
    • accept \(X\) as a sample from \(\pi\)
    • otherwise reject \(X\) and repeat the sampling step.
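
A minimal sketch of this procedure in Python; the names `pi`, `q_sample`, `q_pdf`, and `M` are placeholders for whatever target density, proposal, and envelope constant you choose:

```python
import numpy as np

rng = np.random.default_rng(0)

def rejection_sample(pi, q_sample, q_pdf, M, n):
    """Draw n samples from the density pi by rejection, using a proposal q with pi <= M * q."""
    samples = []
    trials = 0
    while len(samples) < n:
        x = q_sample(rng)       # step 1: propose X ~ q
        u = rng.uniform()       #         and draw U ~ U[0, 1]
        trials += 1
        if u < pi(x) / (M * q_pdf(x)):   # step 2: accept X with probability pi(x) / (M q(x))
            samples.append(x)
        # otherwise reject and propose again
    return np.array(samples), trials
```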

To show this works we prove that for any \(a, b \in \mathbb R\) with \(a \leq b\),

$$ \mathbb P(X \in [a, b] \mid X \text{ accepted}) = \int_a^b \pi(x) dx $$

Proof:

$$ \begin{aligned} \mathbb P(X \in [a, b] \cap \text{ accepted}) &= \int_a^b \mathbb P(X \text{ accepted} \mid X = x)q(x) dx \\ &= \int_a^b \frac{\pi(x)}{Mq(x)}q(x) dx \\ &= \frac{1}{M} \int_a^b \pi(x) dx \end{aligned} $$

Letting \(a \to -\infty\) and \(b \to \infty\), and using that \(\pi\) integrates to one, gives \(\mathbb P(X \text{ accepted}) = \frac{1}{M}\).

Using the definition of conditional probability,

$$ \begin{aligned} \mathbb P(X \in [a, b] \mid X \text{ accepted}) &= \frac{\mathbb P(X \in [a, b] \cap \text{ accepted})}{\mathbb P(X \text{ accepted})} \\ &= \frac{\frac{1}{M} \int_a^b \pi(x) dx}{\frac{1}{M}} \\ &= \int_a^b \pi(x) dx \end{aligned} $$

An important consequence is that each proposal is accepted with probability \(\mathbb P(X \text{ accepted}) = \frac{1}{M}\), so the number of trials \(Y\) until an acceptance is geometrically distributed, \(Y \sim \text{Geometric}(\frac{1}{M})\). The mean number of trials until an acceptance is then \(\mathbb E[Y] = M\), which is why we want the envelope constant \(M\) to be as small as possible.
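
To see the \(\mathbb E[Y] = M\) result empirically, here is a usage example of the sketch above, assuming (purely for illustration) a Beta(2, 2) target with density \(\pi(x) = 6x(1-x)\) on \([0, 1]\) and a Uniform(0, 1) proposal, for which \(\pi(x) \leq M q(x)\) holds with \(M = 1.5\):

```python
# Target: Beta(2, 2), density pi(x) = 6 x (1 - x) on [0, 1] (maximum 1.5 at x = 0.5).
# Proposal: Uniform(0, 1) with density 1, so pi(x) <= M * q(x) holds with M = 1.5.
samples, trials = rejection_sample(pi=lambda x: 6 * x * (1 - x),
                                   q_sample=lambda rng: rng.uniform(),
                                   q_pdf=lambda x: 1.0,
                                   M=1.5,
                                   n=10_000)
print(trials / len(samples))  # average trials per accepted sample, close to M = 1.5
```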

Importance sampling

We want to evaluate integrals of the form \(\int \varphi(x)\pi(x) dx\), where \(\pi\) is a probability density and \(\varphi\) is a function. Sometimes \(\pi\) is difficult to sample from directly, or the integral is intractable. Importance sampling introduces a proposal distribution \(q\) such that

$$ \{x : \pi(x) > 0\} \subseteq \{x : q(x) > 0\} $$

or, more strongly, that there exists a constant \(M\) such that \(\pi(x) \leq M \cdot q(x)\) for all \(x\). This allows us to express our integral as:

$$\begin{aligned} \mathbb E_{\pi}[\varphi(X)] &= \int \varphi(x)\pi(x) dx \\ &= \int \varphi(x)\frac{\pi(x)}{q(x)}q(x) dx \\ &= \int \varphi(x)w(x)q(x) dx \\ &= \mathbb E_{q}[\varphi(X)w(X)], \end{aligned}$$

where \(w(x) = \frac{\pi(x)}{q(x)}\) is the importance weight.

Using the equivalence above, letting \(X_i \overset{\text{i.i.d.}}{\sim} q\), and appealing to the weak law of large numbers, we see

$$ \frac{1}{n} \sum_{i=1}^n \varphi(X_i)w(X_i) \overset{\text{p}}\rightarrow \mathbb E_{\pi}[\varphi(X)] $$

The algorithm is as follows (a Python sketch is given after the steps):

Importance Sampling


Choose \(q\) such that \(\{x : \pi(x) > 0\} \subseteq \{x : q(x) > 0\}\).

  1. For \(i = 1, 2, \cdots, n\):
    • Draw \(X_i \sim q\)
    • Calculate importance weights \(w(X_i) = \frac{\pi(X_i)}{q(X_i)}\)
  2. Calculate \(\bar{\varphi}_n = \frac{1}{n}\sum_{i=1}^n \varphi(X_i)w(X_i)\), the unbiased estimator of \(\mathbb E_\pi[\varphi(X)]\).
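
A minimal sketch of this estimator in Python, assuming (for illustration only) a standard normal target \(\pi\), the wider proposal \(q = N(0, 2^2)\), and \(\varphi(x) = x^2\), so the true value is \(\mathbb E_\pi[\varphi(X)] = 1\); the function and parameter names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def importance_estimate(phi, pi_pdf, q_sample, q_pdf, n):
    """Estimate E_pi[phi(X)] by importance sampling with proposal q."""
    x = q_sample(rng, n)          # step 1: draw X_i ~ q
    w = pi_pdf(x) / q_pdf(x)      #         importance weights w(X_i) = pi(X_i) / q(X_i)
    return np.mean(phi(x) * w)    # step 2: average phi(X_i) w(X_i)

# E_pi[X^2] = 1 for pi = N(0, 1); use the heavier N(0, 2^2) proposal.
est = importance_estimate(phi=lambda x: x ** 2,
                          pi_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
                          q_sample=lambda rng, n: rng.normal(0.0, 2.0, size=n),
                          q_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
                          n=100_000)
print(est)  # close to 1
```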