Explain in Common Man’s Terms #1 — Classical Central Limit Theorem and its Verification

Common Man
3 min read · Mar 4, 2021

This is a new series in which I try to explain something I have just learnt. The explanation will not be comprehensive, but it will reflect my understanding of the concept. These posts will be revisited and updated as my understanding changes.

Classical Central Limit Theorem and its verification

In essence, the central limit theorem establishes three things:

  1. Regardless of the shape of the population distribution, the distribution of the sample mean is approximately normal once the sample size is large enough (provided the population has a finite variance)
  2. The average of the sample means will be close to the population mean
  3. The standard deviation of the sample mean distribution equals the standard deviation of the population divided by the square root of the sample size (a.k.a. the standard error).
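To make the third point concrete, here is a minimal sketch of the standard error arithmetic. The function name `standard_error` and the numbers are my own illustrative choices, not something from the notebook. Note that the divisor is the square root of the sample size, so quadrupling the sample size only halves the standard error.

```python
import math


def standard_error(population_std, sample_size):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return population_std / math.sqrt(sample_size)


# Illustrative numbers: a population standard deviation of 10.
se_100 = standard_error(10.0, 100)  # 10 / sqrt(100)
se_400 = standard_error(10.0, 400)  # 10 / sqrt(400)
print(se_100, se_400)  # prints: 1.0 0.5
```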

Of course, there are a few assumptions and conditions¹:

  1. Samples must be picked randomly
  2. Sample size must be sufficiently large (a sample size of 30 is usually considered sufficient)
  3. Sample size should be no more than 10% of the population when the sample is drawn without replacement
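The two size-related conditions can be checked mechanically. Below is a small sketch of such a check. The helper `check_clt_conditions` and its parameter names are hypothetical, and the thresholds are just the rules of thumb listed above. Random selection cannot be verified this way, so it is left out.

```python
def check_clt_conditions(sample_size, population_size,
                         with_replacement=False, min_sample_size=30):
    """Apply the rule-of-thumb size checks (hypothetical helper)."""
    large_enough = sample_size >= min_sample_size
    # The 10% rule only matters when drawing without replacement.
    within_10_percent = with_replacement or sample_size <= 0.10 * population_size
    return {"large_enough": large_enough, "within_10_percent": within_10_percent}


# The setup used later in this post: samples of 100 from a population of 10000.
checks = check_clt_conditions(sample_size=100, population_size=10000)
print(checks)  # prints: {'large_enough': True, 'within_10_percent': True}
```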

Below is the verification:

I have chosen three continuous distributions:

  • Uniform Distribution
  • Normal Distribution
  • Exponential Distribution

And two discrete distributions:

  • Poisson Distribution
  • Bernoulli Distribution

First, 10,000 random numbers will be generated from each distribution using the Python SciPy package. These will be treated as the population.
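As a rough sketch, generating the five populations with `scipy.stats` could look like the block below. The distribution parameters (`loc`, `scale`, `mu`, `p`) and the seed are my assumptions for illustration; the values used in the notebook may differ.

```python
from scipy import stats

POPULATION_SIZE = 10000
SEED = 42  # assumed fixed seed so the run is repeatable

# Parameters are illustrative choices, not necessarily the notebook's.
# Every call reuses the same seed, which is fine here because the
# populations are only ever looked at one at a time.
populations = {
    "uniform": stats.uniform.rvs(loc=0, scale=1, size=POPULATION_SIZE, random_state=SEED),
    "normal": stats.norm.rvs(loc=0, scale=1, size=POPULATION_SIZE, random_state=SEED),
    "exponential": stats.expon.rvs(scale=1, size=POPULATION_SIZE, random_state=SEED),
    "poisson": stats.poisson.rvs(mu=3, size=POPULATION_SIZE, random_state=SEED),
    "bernoulli": stats.bernoulli.rvs(p=0.3, size=POPULATION_SIZE, random_state=SEED),
}

for name, values in populations.items():
    print(f"{name:12s} mean={values.mean():.3f} std={values.std():.3f}")
```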

The distribution of the population will be plotted. The population's mean and its expected standard error (the population standard deviation divided by the square root of the sample size) will be printed for comparison later.
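Here is a minimal sketch of that step, assuming an exponential population generated with NumPy as a stand-in and a sample size of 100. It uses `np.histogram` to get the bar heights instead of drawing a figure; passing the same data to matplotlib's `plt.hist` would draw the plot itself.

```python
import math

import numpy as np

rng = np.random.default_rng(0)  # assumed seed, for repeatability
population = rng.exponential(scale=1.0, size=10000)  # stand-in population
SAMPLE_SIZE = 100  # the sample size the comparison will use later

population_mean = population.mean()
population_std = population.std()  # ddof=0: we hold the whole population
expected_standard_error = population_std / math.sqrt(SAMPLE_SIZE)

# Bar heights of the population histogram (20 bins is an arbitrary choice).
counts, bin_edges = np.histogram(population, bins=20)

print(f"population mean:         {population_mean:.4f}")
print(f"population std:          {population_std:.4f}")
print(f"expected standard error: {expected_standard_error:.4f}")
```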

Second, a function will be used to randomly draw a sample of a given size (e.g. 100) from the population and calculate its average. A given number of iterations determines how many times this is repeated.
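A function along these lines could do the sampling. This is a sketch of the idea rather than the notebook's code: the name `sample_means`, its signature, and the choice to draw without replacement are my assumptions.

```python
import numpy as np


def sample_means(population, sample_size, iterations, rng):
    """Draw `iterations` random samples of `sample_size` values and return their means."""
    means = np.empty(iterations)
    for i in range(iterations):
        # Without replacement: no population value is used twice in one sample.
        sample = rng.choice(population, size=sample_size, replace=False)
        means[i] = sample.mean()
    return means


rng = np.random.default_rng(1)  # assumed seed
population = rng.uniform(0.0, 1.0, size=10000)  # stand-in population
means = sample_means(population, sample_size=100, iterations=1000, rng=rng)
print(means.shape, round(means.mean(), 4))
```

With a sample size of 100 and a population of 10,000, each sample is 1% of the population, which stays well inside the 10% condition.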

The distribution of the sample means will be plotted. The average and standard deviation of the sample means will be printed.
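The comparison itself can be sketched as below, again with an assumed exponential stand-in population. Beyond the mean and the standard deviation, it adds one rough normality check of my own: in a normal distribution about 68% of values fall within one standard deviation of the mean.

```python
import numpy as np

rng = np.random.default_rng(2)  # assumed seed
population = rng.exponential(scale=1.0, size=10000)  # stand-in population
sample_size, iterations = 100, 1000

# One sample mean per iteration, drawn without replacement.
means = np.array([
    rng.choice(population, size=sample_size, replace=False).mean()
    for _ in range(iterations)
])

mean_of_means = means.mean()
std_of_means = means.std()
expected_se = population.std() / np.sqrt(sample_size)

# Share of sample means within one standard deviation of their average.
within_one_sd = np.mean(np.abs(means - mean_of_means) <= std_of_means)

print(f"population mean {population.mean():.4f} vs mean of sample means {mean_of_means:.4f}")
print(f"expected standard error {expected_se:.4f} vs std of sample means {std_of_means:.4f}")
print(f"share within one std: {within_one_sd:.3f}")
```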

And below are the results:

Population Distribution (Left) and its Sample Mean Distribution (Right) for Uniform Distribution (Given sample number of 100 and number of iterations of 1000)
Population Distribution (Left) and its Sample Mean Distribution (Right) for Normal Distribution (Given sample number of 100 and number of iterations of 1000)
Population Distribution (Left) and its Sample Mean Distribution (Right) for Exponential Distribution (Given sample number of 100 and number of iterations of 1000)
Population Distribution (Left) and its Sample Mean Distribution (Right) for Poisson Distribution (Given sample number of 100 and number of iterations of 1000)
Population Distribution (Left) and its Sample Mean Distribution (Right) for Bernoulli Distribution (Given sample number of 100 and number of iterations of 1000)

As shown above, the central limit theorem holds for at least these five distributions. The normal shape may not be obvious for some of them, but it becomes clearer once you increase the number of iterations.

Please click the link here to see the Python code in Jupyter notebook format. You can tweak the sample number and the number of iterations to see what changes. Basically, you do not need a large sample number or a large number of iterations to estimate the average of the population.
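To illustrate that last point, here is a small sketch with deliberately modest settings: a sample size of 30 and only 50 iterations. The population, the seed, and both numbers are my assumptions, picked to show that the estimate already lands near the population mean.

```python
import numpy as np

rng = np.random.default_rng(3)  # assumed seed
population = rng.exponential(scale=1.0, size=10000)  # stand-in population

small_sample_size, few_iterations = 30, 50
small_run_means = np.array([
    rng.choice(population, size=small_sample_size, replace=False).mean()
    for _ in range(few_iterations)
])

estimate = small_run_means.mean()
print(f"population mean: {population.mean():.4f}")
print(f"estimate from {few_iterations} samples of {small_sample_size}: {estimate:.4f}")
```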

Reference:

[1] Central Limit Theorem: Assumptions and Conditions (cnx.org)

Resources that helped me understand this concept better:

[1] The Central Limit Theorem — YouTube

[2] Real-world application of the Central Limit Theorem (CLT) — YouTube
