Estimates are sometimes our only option.

We almost never have the whole population at our disposal, so we can almost never poll it or compute quantities from it directly. Sometimes the logistical effort would be too big (imagine polling all 300 million people in the US every time you run a poll), and sometimes we simply don't have access to the whole population, e.g. we can't watch every atom to see how long it takes to decay. But with a bit of math we can estimate such quantities quite accurately, and these estimates become more accurate as the number of data points grows. One tool for this is the Law of Large Numbers, which we discussed in the last blog post. Today's blog post is about the Central Limit Theorem, another tool for estimating things.

Standardisation

To understand the Central Limit Theorem, one first has to understand standardisation.

The standardisation of a random variable X with mean $\mu$ and standard deviation $\sigma$ is:

$Z=\frac{ X-\mu }{ \sigma }$

Z is a new random variable with mean $\mu_{ Z }=0$ and variance $\sigma_{ Z }^{ 2 }=1$.

If X is normally distributed, then Z is standard normal.
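As a quick sanity check, standardisation can be verified numerically. Below is a minimal sketch using NumPy; the normal distribution with mean 10 and standard deviation 2 is just an arbitrary example, and the seed is only for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw samples from an example distribution: X ~ N(10, 2^2)
x = rng.normal(loc=10.0, scale=2.0, size=100_000)

# Standardise: subtract the (sample) mean, divide by the (sample) standard deviation
z = (x - x.mean()) / x.std()

print(round(z.mean(), 6))  # 0.0  (mean of Z is 0 by construction)
print(round(z.var(), 6))   # 1.0  (variance of Z is 1 by construction)
```

Note that standardising with the sample mean and sample standard deviation makes the result exactly mean 0 and variance 1; with the true $\mu$ and $\sigma$ it would only be approximately so.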

Central Limit Theorem (CLT)

Let us now come to the actual Central Limit Theorem. Suppose $X_{ 1 }, X_{ 2 }, X_{ 3 }, ..., X_{ n }$ are i.i.d. random variables with mean $\mu$ and standard deviation $\sigma$. Then:

$S_{ n }=X_{ 1 } + X_{ 2 } + X_{ 3 } + ... + X_{ n }=\sum_{ i=1 }^{ n }{ X_{ i } }$

$\overline { X }_{ n }=\frac{ S_{ n } }{ n }$

Furthermore, for any n:

$E(S_{ n })=n\mu$ $Var(S_{ n })=n\sigma^{ 2 }$

$E(\overline { X }_{ n })=\mu$ $Var(\overline { X }_{ n })=\frac{ \sigma^{ 2 } }{ n }$

The latter two are closely related to the Law of Large Numbers: as n grows, the sample mean converges to the population mean $\mu$, and since $Var(\overline { X }_{ n })=\frac{ \sigma^{ 2 } }{ n }$, the variance of the sample mean converges to 0.

The Central Limit Theorem then says, for large n:

$\overline{ X }_{ n }\approx N(\mu, \frac{ \sigma^{ 2 } }{ n })$

$S_{ n }\approx N(n\mu,n\sigma^{ 2 })$

$Z\approx N(0,1)$

The standardisation of $\overline{ X }_{ n }$ and $S_{ n }$ is therefore approximately standard normal.
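The remarkable part is that this works even when the $X_{ i }$ themselves are far from normal. The sketch below (assuming NumPy; the exponential distribution, sample size, and trial count are arbitrary illustration choices) draws sample means of a skewed distribution, standardises them, and checks that roughly 68% land within one standard deviation, as a standard normal would.

```python
import numpy as np

rng = np.random.default_rng(0)

# X_i ~ Exponential(1): mu = 1, sigma = 1 -- deliberately non-normal (skewed)
n = 200          # sample size per experiment
trials = 50_000  # number of sample means we collect

samples = rng.exponential(scale=1.0, size=(trials, n))
xbar = samples.mean(axis=1)              # one sample mean per trial
z = (xbar - 1.0) / (1.0 / np.sqrt(n))    # standardise with mu and sigma/sqrt(n)

# If the CLT holds, roughly 68% of the z-values fall inside (-1, 1)
frac = np.mean(np.abs(z) < 1)
print(round(frac, 3))
```

The printed fraction comes out close to 0.68, even though a single exponential sample looks nothing like a bell curve.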

Applications of the Central Limit Theorem

Before we can have a look at some applications for the Central Limit Theorem, we have to introduce a few rules of thumb:

$P(|Z|<1)\approx 0.68\Rightarrow P(Z<1)\approx 0.84$

$P(|Z|<2)\approx 0.95\Rightarrow P(Z<2)\approx 0.977$

$P(|Z|<3)\approx 0.997\Rightarrow P(Z<3)\approx 0.999$

The above rules of thumb tell us the probability that a standard normal random variable Z falls within one, two, or three standard deviations of the mean.
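These values need no table lookup: for a standard normal, $P(|Z|<k)=\operatorname{erf}(k/\sqrt{ 2 })$, which Python's standard library exposes as `math.erf`. A small sketch to reproduce the rules of thumb:

```python
import math

def p_abs_z_less(k: float) -> float:
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))

def p_z_less(k: float) -> float:
    """P(Z < k) = 0.5 * (1 + P(|Z| < k)), by symmetry of the normal."""
    return 0.5 * (1 + p_abs_z_less(k))

for k in (1, 2, 3):
    print(k, round(p_abs_z_less(k), 3), round(p_z_less(k), 3))
# 1 0.683 0.841
# 2 0.954 0.977
# 3 0.997 0.999
```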

Example: Suppose we toss a fair coin 100 times. What is the probability of more than 55 heads?

Answer: We answer this using the Central Limit Theorem. Since a single toss of a fair coin follows a Bernoulli distribution, the mean is $\mu = p$ and the standard deviation is $\sigma = \sqrt{ p(1-p) }$. For a fair coin with probability 0.5 for heads, the mean is 0.5 and the variance is $\sigma^{ 2 } = 0.5\cdot 0.5 = 0.25$ (so $\sigma = 0.5$). Following the rules mentioned above:

$E(S_{ n })=n\mu=100\cdot 0.5 = 50$

$Var(S_{ n })=n\sigma^{ 2 }=100\cdot 0.25 = 25 \Rightarrow \sigma = 5$

We can then calculate the probability of more than 55 heads as follows:

$P(\frac{ S_{ n } - 50 }{ 5 }>\frac{ 55 - 50 }{ 5 })\approx P(Z>1)\approx 0.16$
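We can check how good this approximation is by comparing it against the exact binomial tail. The sketch below uses only the standard library; `math.erf` gives the normal tail and `math.comb` the exact binomial probabilities.

```python
import math

n, p = 100, 0.5
mu = n * p                          # 50
sigma = math.sqrt(n * p * (1 - p))  # 5

# Normal approximation: P(S_n > 55) = P(Z > (55 - 50)/5) = P(Z > 1)
z = (55 - mu) / sigma
p_approx = 0.5 * (1 - math.erf(z / math.sqrt(2)))

# Exact tail: P(S_n > 55) = sum over k = 56..100 of C(100, k) * 0.5^100
p_exact = sum(math.comb(n, k) for k in range(56, n + 1)) * 0.5 ** n

print(round(p_approx, 3))  # 0.159
print(round(p_exact, 3))
```

The approximation (about 0.159) overshoots the exact tail somewhat; applying a continuity correction (using 55.5 instead of 55) would bring the two closer together.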

Example: Another application of the Central Limit Theorem is the margin of error. Suppose you're polling n people on whether they support A or B. The probability that a randomly chosen person supports A is $p_{ 0 }$. Since a single answer is also Bernoulli distributed, $\mu = p_{ 0 }$ and $\sigma = \sqrt{ p_{ 0 }(1-p_{ 0 }) }$, so $\overline{ X }$ approximately follows $N(p_{ 0 }, \frac{ \sigma^{ 2 } }{ n })$. 95% of the probability in a normal distribution lies within $2\sigma$ of the mean, which here means within $\frac{ 2\sigma }{ \sqrt{ n } }$ of $p_{ 0 }$. We can turn this into a rule of thumb: since the maximum value of $\sigma = \sqrt{ p_{ 0 }(1-p_{ 0 }) }$ is 0.5 (reached at $p_{ 0 } = 0.5$), we have $\frac{ 2\sigma }{ \sqrt{ n } } \le \frac{ 2\cdot 0.5 }{ \sqrt{ n } } = \frac{ 1 }{ \sqrt{ n } }$. Therefore, the margin of error for a 95% confidence interval is approximately: $\overline{ X } \pm\frac{ 1 }{ \sqrt{ n } }$.
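This rule of thumb can be tested by simulation: repeatedly run a poll, form the interval $\overline{ X } \pm\frac{ 1 }{ \sqrt{ n } }$, and count how often it contains the true support. Below is a sketch assuming NumPy; the true support of 0.55, the poll size, and the number of simulated polls are all arbitrary illustration values.

```python
import numpy as np

rng = np.random.default_rng(1)

p0 = 0.55      # hypothetical true support for A
n = 1000       # people polled per survey
polls = 20_000 # number of simulated polls

# Each poll yields n Bernoulli(p0) answers; xbar is the observed fraction for A
xbar = rng.binomial(n, p0, size=polls) / n

# Rule of thumb: xbar +/- 1/sqrt(n) should cover p0 at least ~95% of the time
margin = 1 / np.sqrt(n)
coverage = np.mean(np.abs(xbar - p0) < margin)
print(round(coverage, 3))
```

The observed coverage comes out slightly above 95%, as expected: $\frac{ 1 }{ \sqrt{ n } }$ is the margin for the worst case $p_{ 0 }=0.5$, so for any other $p_{ 0 }$ the interval is a bit conservative.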

The Central Limit Theorem comes in handy in many situations; it is therefore good to understand at least its basic concept.