Monte Carlo

2021-04-24 | Modified: 2021-04-24 | Statistics

Monte Carlo methods are a class of algorithms that use random sampling to obtain numerical approximations. They rely on the weak law of large numbers:

The weak law of large numbers

A sequence of random variables \(X_1, X_2, \ldots\) converges in probability to a random variable \(X\) if, for all \(\delta > 0\),

$$ \lim_{n\rightarrow \infty} \mathbb P(|X_n - X | \geq \delta) = 0. $$

This is also written as \(X_n \overset{\text{p}}\rightarrow X\).

A consequence of this, taking the empirical mean \(\overline{X}_n = \frac{1}{n} \sum_{i=1}^n X_i\), is the following:

If \(X_1, X_2, \ldots\) are independent and identically distributed (i.i.d.) with \(\mathbb E(X_1) = \mu\) and finite variance \(\sigma^2 < \infty\), then \(\overline{X}_n \overset{\text{p}}\rightarrow \mu\) as \(n \rightarrow \infty\).
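
As a quick sanity check, here is a minimal sketch of this convergence, assuming NumPy and picking i.i.d. Exponential(1) draws (so \(\mu = 1\)) purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# i.i.d. Exponential(1) draws, so mu = E[X_1] = 1 and sigma^2 = 1 < infinity.
mu = 1.0
for n in [10, 1_000, 100_000]:
    x = rng.exponential(scale=1.0, size=n)
    # The empirical mean should settle toward mu as n grows.
    print(f"n = {n:>7}: empirical mean = {x.mean():.4f} (mu = {mu})")
```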

As an illustration, consider estimating the probability that \(X \in A\) for a set \(A \subset \mathbb R\), where \(X\) has probability density function \(\pi\).

Since \(\mathbb P(X \in A) = \mathbb E[\boldsymbol{1}_{X \in A}]\), we can approximate \(\mathbb P(X \in A)\) with the empirical mean over i.i.d. samples \(X_1, \ldots, X_n \sim \pi\), defining the estimator \(\hat{p} := \frac{1}{n} \sum_{i=1}^n \boldsymbol{1}_{X_i \in A}\)

where

$$\boldsymbol{1}_{X_i \in A} = \left\{ \begin{array}{ll} 1 & X_i \in A \\ 0 & \text{otherwise} \end{array} \right.$$

The estimator \(\hat{p}\) is the proportion of samples landing in \(A\). If we define \(Z_n\) as the number of samples landing in \(A\), then we have the relationship \(\hat{p} = \dfrac{Z_n}{n}\).

It follows that \(Z_n \sim \text{Bin}(n, p)\) with \(p = \mathbb P(X \in A)\), since \(Z_n\) is the sum of \(n\) independent Bernoulli(\(p\)) trials \(\boldsymbol{1}_{X_i \in A}\).
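
Putting this together, here is a minimal sketch of the estimator, assuming NumPy and SciPy and choosing \(\pi\) as the standard normal density with \(A = [1, 2]\) purely for illustration, so \(\hat{p}\) can be checked against the exact probability:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

n = 100_000
a, b = 1.0, 2.0                       # A = [1, 2], chosen for illustration

x = rng.standard_normal(n)            # i.i.d. samples from pi = N(0, 1)
indicators = (x >= a) & (x <= b)      # 1_{X_i in A}

z_n = indicators.sum()                # Z_n ~ Bin(n, p): number of hits in A
p_hat = z_n / n                       # hat{p} = Z_n / n, the empirical mean

p_exact = norm.cdf(b) - norm.cdf(a)   # exact P(X in [a, b]) for comparison
print(f"p_hat = {p_hat:.4f}")
print(f"p     = {p_exact:.4f}")
```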