Sampling: Rejection and Importance

2021-04-25 | Modified: 2021-04-26 | Statistics

We ended the introduction to Monte Carlo simulations with its main limitation: slow convergence.

In this post we will cover Rejection and Importance sampling.

For a density function \(\pi(x)\), how can we sample from it?

Rejection Sampling

The idea is to sample from a proposal distribution \(q\) that envelopes \(\pi\), that is, \(\pi(x) \leq M \cdot q(x)\) for all \(x\) and some constant \(M\). The algorithm is a two-step process (a Python sketch follows the steps):

Rejection Sampling


  1. Sample \(X \sim q\) and sample \(U \sim U[0, 1]\).
  2. If \(U < \frac{\pi(X)}{Mq(X)}\)
    • accept \(X\) as a sample from \(\pi\)
    • otherwise reject \(X\) and repeat the sampling step.
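
A minimal sketch of this procedure in Python; the names `pi`, `q_sample`, `q_pdf`, and `M` are placeholders for whatever target density, proposal, and envelope constant you choose:

```python
import numpy as np

rng = np.random.default_rng(0)

def rejection_sample(pi, q_sample, q_pdf, M, n):
    """Draw n samples from the density pi by rejection, using a proposal q with pi <= M * q."""
    samples = []
    trials = 0
    while len(samples) < n:
        x = q_sample(rng)       # step 1: propose X ~ q
        u = rng.uniform()       #         and draw U ~ U[0, 1]
        trials += 1
        if u < pi(x) / (M * q_pdf(x)):   # step 2: accept X with probability pi(x) / (M q(x))
            samples.append(x)
        # otherwise reject and propose again
    return np.array(samples), trials
```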

To show this works we prove that for any \(a, b \in \mathbb R\) with \(a \leq b\),

$$ \mathbb P(X \in [a, b] \mid X \text{ accepted}) = \int_a^b \pi(x) dx $$

Proof:

$$ \begin{aligned} \mathbb P(X \in [a, b] \cap \text{ accepted}) &= \int_a^b \mathbb P(X \text{ accepted} \mid X = x)q(x) dx \\ &= \int_a^b \frac{\pi(x)}{Mq(x)}q(x) dx \\ &= \frac{1}{M} \int_a^b \pi(x) dx \end{aligned} $$

Letting \(a \to -\infty\) and \(b \to \infty\), and using that \(\pi\) integrates to one, gives \(\mathbb P(X \text{ accepted}) = \frac{1}{M}\).

Using the definition of conditional probability,

$$ \begin{aligned} \mathbb P(X \in [a, b] \mid X \text{ accepted}) &= \frac{\mathbb P(X \in [a, b] \cap \text{ accepted})}{\mathbb P(X \text{ accepted})} \\ &= \frac{\frac{1}{M} \int_a^b \pi(x) dx}{\frac{1}{M}} \\ &= \int_a^b \pi(x) dx \end{aligned} $$

An important consequence is that each proposal is accepted with probability \(\mathbb P(X \text{ accepted}) = \frac{1}{M}\), so the number of trials \(Y\) until an acceptance is geometrically distributed, \(Y \sim \text{Geometric}(\frac{1}{M})\). The mean number of trials until an acceptance is then \(\mathbb E[Y] = M\), which is why we want the envelope constant \(M\) to be as small as possible.
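
To see the \(\mathbb E[Y] = M\) result empirically, here is a usage example of the sketch above, assuming (purely for illustration) a Beta(2, 2) target with density \(\pi(x) = 6x(1-x)\) on \([0, 1]\) and a Uniform(0, 1) proposal, for which \(\pi(x) \leq M q(x)\) holds with \(M = 1.5\):

```python
# Target: Beta(2, 2), density pi(x) = 6 x (1 - x) on [0, 1] (maximum 1.5 at x = 0.5).
# Proposal: Uniform(0, 1) with density 1, so pi(x) <= M * q(x) holds with M = 1.5.
samples, trials = rejection_sample(pi=lambda x: 6 * x * (1 - x),
                                   q_sample=lambda rng: rng.uniform(),
                                   q_pdf=lambda x: 1.0,
                                   M=1.5,
                                   n=10_000)
print(trials / len(samples))  # average trials per accepted sample, close to M = 1.5
```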

Importance sampling

We want to evaluate integrals of the form \(\int \varphi(x)\pi(x) dx\), where \(\pi\) is a probability density and \(\varphi\) is a function. Sometimes \(\pi\) is difficult to sample from directly, or the integral is intractable. Importance sampling introduces a proposal distribution \(q\) such that

$$ \{x : \pi(x) > 0\} \subseteq \{x : q(x) > 0\} $$

or, more strongly, that there exists a constant \(M\) such that \(\pi(x) \leq M \cdot q(x)\) for all \(x\). This allows us to express our integral as:

$$\begin{aligned} \mathbb E_{\pi}[\varphi(X)] &= \int \varphi(x)\pi(x) dx \\ &= \int \varphi(x)\frac{\pi(x)}{q(x)}q(x) dx \\ &= \int \varphi(x)w(x)q(x) dx \\ &= \mathbb E_{q}[\varphi(X)w(X)], \end{aligned}$$

where \(w(x) = \frac{\pi(x)}{q(x)}\) is the importance weight.

Using the equivalence above, letting \(X_i \overset{\text{i.i.d.}}{\sim} q\), and appealing to the weak law of large numbers, we see

$$ \frac{1}{n} \sum_{i=1}^n \varphi(X_i)w(X_i) \overset{\text{p}}\rightarrow \mathbb E_{\pi}[\varphi(X)] $$

The algorithm is as follows (a Python sketch is given after the steps):

Importance Sampling


Choose \(q\) such that \(\{x : \pi(x) > 0\} \subseteq \{x : q(x) > 0\}\).

  1. For \(i = 1, 2, \cdots, n\):
    • Draw \(X_i \sim q\)
    • Calculate importance weights \(w(X_i) = \frac{\pi(X_i)}{q(X_i)}\)
  2. Calculate \(\bar{\varphi}_n = \frac{1}{n}\sum_{i=1}^n \varphi(X_i)w(X_i)\), the unbiased estimator of \(\mathbb E_\pi[\varphi(X)]\).
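
A minimal sketch of this estimator in Python, assuming (for illustration only) a standard normal target \(\pi\), the wider proposal \(q = N(0, 2^2)\), and \(\varphi(x) = x^2\), so the true value is \(\mathbb E_\pi[\varphi(X)] = 1\); the function and parameter names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def importance_estimate(phi, pi_pdf, q_sample, q_pdf, n):
    """Estimate E_pi[phi(X)] by importance sampling with proposal q."""
    x = q_sample(rng, n)          # step 1: draw X_i ~ q
    w = pi_pdf(x) / q_pdf(x)      #         importance weights w(X_i) = pi(X_i) / q(X_i)
    return np.mean(phi(x) * w)    # step 2: average phi(X_i) w(X_i)

# E_pi[X^2] = 1 for pi = N(0, 1); use the heavier N(0, 2^2) proposal.
est = importance_estimate(phi=lambda x: x ** 2,
                          pi_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
                          q_sample=lambda rng, n: rng.normal(0.0, 2.0, size=n),
                          q_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
                          n=100_000)
print(est)  # close to 1
```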