import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib_inline
matplotlib_inline.backend_inline.set_matplotlib_formats('svg')
import seaborn as sns
sns.set_context("paper")
sns.set_style("ticks");
Visualizing Monte Carlo Uncertainty
In the last two lectures, we repeatedly used the law of large numbers to estimate expectations using samples. In particular, we studied the integral

\[
I = \mathbb{E}[g(X)] = \int g(x)\, p(x)\, dx,
\]

where \(X\sim p(x)\) and \(g(x)\) is a function of \(x\). The sampling-based approximation requires \(X_1, X_2, \dots\) to be independent copies of \(X\). Then the random variables \(Y_1 = g(X_1), Y_2 = g(X_2), \dots\) are also independent and identically distributed, and the law of large numbers states that their sample average converges to their mean:

\[
\bar{I}_N = \frac{1}{N}\sum_{i=1}^N g(X_i) \to \mathbb{E}[g(X)]
\quad\text{as } N\to\infty.
\]
This is the Monte Carlo way of estimating integrals. If you played with the hands-on activities, you noticed that for small \(N\) we could get very different answers from run to run. Here we will build some intuition about the epistemic uncertainty induced by a finite number of samples.
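In code, the Monte Carlo estimate is just a sample average. Here is a minimal generic sketch (the function mc_estimate and the helper sample_x are ours, introduced only for illustration):

# A minimal sketch of the Monte Carlo estimator described above.
# ``sample_x(N)`` stands for any routine that returns N i.i.d. samples of X.
def mc_estimate(g, sample_x, N):
    x_samples = sample_x(N)    # X_1, ..., X_N, independent copies of X
    y_samples = g(x_samples)   # Y_i = g(X_i)
    return np.mean(y_samples)  # the sample average approximates E[g(X)]

# For example, for X ~ U([0, 1]):
# mc_estimate(g, np.random.rand, 100)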
Example: 1D expectation
Let’s try it out with the same test function we used before (Example 3.4 of [Robert and Casella, 2004]). Assume that \(X\sim\mathcal{U}([0,1])\) and pick

\[
g(x) = \left(\cos(50x) + \sin(20x)\right)^2.
\]

The correct value for the expectation is

\[
\mathbb{E}[g(X)] = \int_0^1 \left(\cos(50x) + \sin(20x)\right)^2\, dx \approx 0.965.
\]
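As a quick sanity check, we can verify this value with numerical quadrature (a sketch, assuming scipy is available):

from scipy.integrate import quad
# Numerically integrate g over [0, 1]; ``limit`` is raised because the
# integrand oscillates rapidly.
g = lambda x: (np.cos(50 * x) + np.sin(20 * x)) ** 2
I_true, abs_err = quad(g, 0.0, 1.0, limit=200)
print(I_true)  # approximately 0.965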
Let’s calculate the Monte Carlo estimate a few times and visualize its uncertainty:
# The function of x whose expectation we want to estimate
g = lambda x: (np.cos(50 * x) + np.sin(20 * x)) ** 2
# How many times to repeat the Monte Carlo estimation
num_mc = 2
# Number of samples to take per run
N = 100
# A common plot for all the estimates
fig, ax = plt.subplots()
# Repeat the Monte Carlo estimation ``num_mc`` times:
for i in range(num_mc):
    # Generate samples from X ~ U([0, 1])
    x_samples = np.random.rand(N)
    # Get the corresponding Y's
    y_samples = g(x_samples)
    # Evaluate the running sample average for all sample sizes up to N
    I_running = np.cumsum(y_samples) / np.arange(1, N + 1)
    ax.plot(np.arange(1, N + 1), I_running, 'b', lw=0.5)
# The true value
ax.plot(np.arange(1, N + 1), [0.965] * N, color='r')
# And the labels
ax.set_xlabel('$N$')
ax.set_ylabel(r'$\bar{I}_N$')
sns.despine(trim=True);
Questions
- Run the code 2-3 times to observe that you get a slightly different answer every time.
- Set the number of Monte Carlo runs num_mc to 100 (or higher). Observe how the different MC runs envelop the correct answer. This is epistemic uncertainty. How can we quantify it without running the estimate repeatedly? (See the sketch after this list.)
- Now increase N to 10,000 and see how the uncertainty disappears.
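Regarding the second question: the central limit theorem gives one way to quantify the epistemic uncertainty from a single run. The standard error of \(\bar{I}_N\) is approximately \(\sqrt{V_N / N}\), where \(V_N\) is the sample variance of the \(Y_i\)'s. Here is a minimal sketch (it reuses g and N from the cells above; the two-standard-error band is our choice of an approximate 95% interval):

# Sketch: CLT-based error bars from a single Monte Carlo run.
x_samples = np.random.rand(N)
y_samples = g(x_samples)
n = np.arange(1, N + 1)
# Running sample average
I_running = np.cumsum(y_samples) / n
# Running sample variance of the Y's
V_running = np.cumsum(y_samples ** 2) / n - I_running ** 2
# CLT standard error of the running average
se = np.sqrt(np.maximum(V_running, 0.0) / n)
fig, ax = plt.subplots()
ax.plot(n, I_running, 'b', lw=0.5)
# Approximate 95% band: estimate +/- 2 standard errors
ax.fill_between(n, I_running - 2 * se, I_running + 2 * se, color='b', alpha=0.25)
ax.plot(n, [0.965] * N, color='r')
ax.set_xlabel('$N$')
ax.set_ylabel(r'$\bar{I}_N$')
sns.despine(trim=True);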