03: Monte Carlo Simulation

Monte Carlo Simulation: Introduction

Monte Carlo simulation is a way to test econometric theories numerically using simulated data.


How is it used in econometrics?

  • confirm econometric theory numerically
    • OLS estimators are unbiased if \(E[u|x]=0\), along with other conditions (theory)
    • I know the above theory is right, but let's check whether it holds numerically
  • explore cases where you suspect something in your data may cause problems, but there is no established econometric theory about what will happen (I have used MC simulation a lot for this purpose)
  • assist students in understanding econometric theories by providing actual numbers instead of a series of Greek letters

Suppose you are interested in checking what happens to the OLS estimators if \(E[u|x]=0\) (the error term and \(x\) are not correlated) is violated.

Question

Can you use real data to do this?



Answer: No, because you never observe the error term or the true values of the \(\beta\)s.

Instead, you generate the data yourself (you have control over how the data are generated):

  • You know the true parameters, unlike with the real data generating process
  • You can change only the part of the data generating process or econometric method that you want to change, holding everything else fixed

Generating data

Pseudo random number generators (Pseudo RNG)

Algorithms for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers


Examples

Draw from a uniform distribution:
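A minimal example in R (the number of draws and the interval bounds below are illustrative choices):

runif(5)                       # 5 draws from the uniform distribution on [0, 1]
runif(5, min = 0, max = 10)    # 5 draws from the uniform distribution on [0, 10]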


Numbers drawn using pseudo random number generators are not truly random:

  • The numbers you get are pre-determined by the algorithm
  • The numbers you get can be controlled by setting a seed

Demonstration
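A minimal sketch (the seed value 1234 and the three draws are arbitrary choices):

set.seed(1234)   # fix the seed
runif(3)         # three draws
set.seed(1234)   # reset the same seed
runif(3)         # the same three draws again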


Question

What benefits does setting a seed have?

\(x \sim N(0, 1)\)

\(x \sim N(2, 2)\)
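These can be drawn with rnorm(); note that rnorm() is parameterized by the standard deviation, so if the second parameter in \(N(2, 2)\) denotes the variance, the sd argument should be sqrt(2) (the sample size of 1000 is an arbitrary choice):

x1 <- rnorm(1000)                          # 1000 draws from N(0, 1)
x2 <- rnorm(1000, mean = 2, sd = sqrt(2))  # 1000 draws from N(2, 2), treating 2 as the variance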

R functions for often-used distributions

  • Normal
  • Uniform
  • Beta
  • Chi-square
  • F
  • Logistic
  • Log-normal
  • many others

For each distribution, R provides four kinds of functions (shown here for the Normal distribution):

  • dnorm: density function
  • pnorm: distribution function
  • qnorm: quantile function
  • rnorm: random draw

dnorm(x) gives you the height of the density function at \(x\).

dnorm(-1) and dnorm(2)
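Evaluating these for the standard Normal (the default for dnorm()):

dnorm(-1)   # 0.2419707: the height of the standard normal density at -1
dnorm(2)    # 0.05399097: the height of the standard normal density at 2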

pnorm(x) gives you the probability that a single random draw is less than \(x\).

What is the probability that a single random draw from a Normal distribution with mean = 1 and sd = 2 is less than 1?


Work here


Answer

pnorm(1, mean = 1, sd = 2) 
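This equals 0.5, since 1 is the mean of this distribution.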

qnorm(x), where \(0 < x < 1\), gives you the number \(\pi\) such that a single random draw is less than \(\pi\) with probability \(x\).

We call the output of qnorm(x) the \(x\) quantile (the \(100 \times x\)% quantile) of the standard Normal distribution (because the defaults for qnorm() are mean = 0 and sd = 1).

What is the 88% quantile of the Normal distribution with mean = 0 and sd = 9?

Code
qnorm(0.88, mean = 0, sd = 9)



Monte Carlo Simulation: Steps

  1. specify the data generating process
  2. generate data based on the data generating process
  3. get an estimate based on the generated data (e.g. OLS, mean)
  4. repeat steps 1-3 many, many times
  5. compare your estimates with the true parameter

Question

Why repeat steps 1-3 many, many times?

Monte Carlo Simulation: Example 1

Question

Is the sample mean really an unbiased estimator of the expected value?


That is, is \(E[\frac{1}{n}\sum_{i=1}^n x_i] = E[x]\), where \(x_i\) is an independent random draw from the same distribution?

  • repeat the above steps many times
  • We use a loop to do the same (or a similar) thing over and over again

R code
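A minimal sketch of the loop described verbally below (B is set to 1000 here):

B <- 1000           # number of iterations
for (i in 1:B) {
  print(i)          # print the current value of i
}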


Verbally

For each \(i\) in \(1:B\) \((1, 2, \dots, 1000)\), do print(i).

  • i takes the value of 1, and then print(1)
  • i takes the value of 2, and then print(2)
  • …
  • i takes the value of 999, and then print(999)
  • i takes the value of 1000, and then print(1000)

Compare your estimates with the true parameter
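Putting the steps together for Example 1, a sketch (the distribution \(N(5, 1)\), the sample size of 100, and B = 1000 are all illustrative choices):

B <- 1000                   # number of Monte Carlo iterations
n <- 100                    # sample size in each iteration
estimates <- rep(0, B)      # storage for the sample means

for (i in 1:B) {
  x <- rnorm(n, mean = 5)   # independent draws from N(5, 1), so E[x] = 5
  estimates[i] <- mean(x)   # the sample mean for this iteration
}

mean(estimates)             # should be very close to the true mean of 5

The average of the B sample means being close to 5 is the numerical counterpart of unbiasedness.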

Monte Carlo Simulation: Example 2

Question

What happens to \(\beta_1\) if \(E[u|x]\ne 0\) when estimating \(y=\beta_0+\beta_1 x + u\)?
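A sketch of one way to set this up (the DGP below, in which \(x\) and \(u\) share a common component \(v\) so that \(E[u|x] \ne 0\), and the true values \(\beta_0 = 1\), \(\beta_1 = 1\), are assumptions for illustration):

B <- 1000
n <- 100
beta1_hat <- rep(0, B)

for (i in 1:B) {
  v <- rnorm(n)                        # common component shared by x and u
  x <- rnorm(n) + v                    # x is correlated with u through v
  u <- rnorm(n) + v                    # so E[u|x] != 0
  y <- 1 + 1 * x + u                   # true beta_0 = 1, beta_1 = 1
  beta1_hat[i] <- coef(lm(y ~ x))[2]   # OLS estimate of beta_1
}

mean(beta1_hat)                        # well above 1: OLS is biased when E[u|x] != 0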

Monte Carlo Simulation: Example 3 (optional)

Model

\[y = \beta_0 + \beta_1 x + u\]
  • \(x\sim N(0,1)\)
  • \(u\sim N(0,1)\)
  • \(E[u|x]=0\)


Variance of the OLS estimator

True Variance of \(\hat{\beta_1}\): \(V(\hat{\beta_1}) = \frac{\sigma^2}{\sum_{i=1}^n (x_i-\bar{x})^2} = \frac{\sigma^2}{SST_X}\)

Its estimator: \(\widehat{V(\hat{\beta_1})} =\frac{\hat{\sigma}^2}{SST_X} = \frac{\sum_{i=1}^n \hat{u}_i^2}{n-2} \times \frac{1}{SST_X}\)


Question

Does the estimator really work? (Is it unbiased?)

True Variance

  • \(SST_X = 112.07\)
  • \(\sigma^2 = 4\)

\[V(\hat{\beta_1}) = 4/112.07 = 0.0357\]


Check

Your Estimates of Variance of \(\hat{\beta_1}\)?
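A sketch of how to check this (here \(x\) is drawn once and held fixed across iterations so that \(SST_X\) stays constant, and the errors are drawn with sd = 2 so that \(\sigma^2 = 4\) as above; the sample size, B, and the true values \(\beta_0 = 0\), \(\beta_1 = 1\) are illustrative assumptions):

B <- 1000
n <- 100
x <- rnorm(n)                    # fixed regressor, so SST_X is the same in every iteration
SST_X <- sum((x - mean(x))^2)

beta1_hat <- rep(0, B)           # OLS estimates of beta_1
V_hat <- rep(0, B)               # estimated variances of beta_1_hat

for (i in 1:B) {
  u <- rnorm(n, sd = 2)          # error term with sigma^2 = 4
  y <- 0 + 1 * x + u             # true beta_0 = 0, beta_1 = 1
  reg <- lm(y ~ x)
  beta1_hat[i] <- coef(reg)[2]
  V_hat[i] <- vcov(reg)[2, 2]    # estimated variance of the slope estimate
}

var(beta1_hat)                   # simulated variance of beta_1_hat: about 4 / SST_X
mean(V_hat)                      # average estimated variance: should be close to it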

Exercise (optional)

Using MC simulations, find out how the variation in \(x\) affects the OLS estimators (a sketch follows the model setup below).


Model setup

\[\begin{align} y = \beta_0 + \beta_1 x_1 + u_1 \\ y = \beta_0 + \beta_1 x_2 + u_2 \end{align}\]
  • \(x_1\sim N(0,1)\) and \(x_2\sim N(0,9)\)
  • \(u_1\sim N(0,1)\) and \(u_2\sim N(0,1)\)
  • \(E[u_1|x_1]=0\) and \(E[u_2|x_2]=0\)
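A possible starting point for this exercise (a sketch; \(N(0, 9)\) is treated as having variance 9, so sd = 3, and the true values \(\beta_0 = 0\), \(\beta_1 = 1\) are illustrative):

B <- 1000
n <- 100
beta1_lowvar <- rep(0, B)     # estimates using x1 ~ N(0, 1)
beta1_highvar <- rep(0, B)    # estimates using x2 ~ N(0, 9)

for (i in 1:B) {
  x1 <- rnorm(n)                            # regressor with low variation
  x2 <- rnorm(n, sd = 3)                    # regressor with high variation (variance 9)
  u1 <- rnorm(n)
  u2 <- rnorm(n)
  y1 <- 0 + 1 * x1 + u1
  y2 <- 0 + 1 * x2 + u2
  beta1_lowvar[i] <- coef(lm(y1 ~ x1))[2]
  beta1_highvar[i] <- coef(lm(y2 ~ x2))[2]
}

var(beta1_lowvar)    # larger: less variation in x gives a less precise OLS estimator
var(beta1_highvar)   # smaller: more variation in x gives a more precise OLS estimator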