Understanding Confidence Intervals with an Intuitive Example

Confidence intervals (CIs) are widely used in data science. So let us learn the concept, with confidence, through an intuitive example!

Imagine you are waiting for the bus at a bus stop. Usually, the bus arrives at 9.30 am. But the arrival time varies.

Another person arrives at the bus stop to catch the same bus and asks you, "Based on your experience, what percentage of the time has the bus arrived between 9.25 am and 9.35 am?"

You think and answer, "90% of the time".

He asks again, "And what about between 9.20 am and 9.40 am?"

You answer, "95% of the time".

This is the main logic behind confidence intervals. 
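To make the conversation concrete, here is a minimal Python sketch that counts how often the bus arrived inside each time window. The arrival times and the helper name `share_within` are hypothetical, made up only so that the shares match the answers above. Times are in minutes past 9.00 am. Strictly speaking, this counts individual arrivals, so it illustrates the intuition rather than a formal confidence interval calculation.

```python
# Hypothetical arrival times in minutes past 9.00 am (made-up data).
arrivals = [30, 28, 31, 33, 29, 30, 27, 34, 30, 32,
            26, 31, 29, 35, 30, 25, 31, 28, 38, 43]


def share_within(times, start, end):
    """Fraction of times that fall inside the window [start, end]."""
    inside = sum(1 for t in times if start <= t <= end)
    return inside / len(times)


print(f"9.25 am to 9.35 am: {share_within(arrivals, 25, 35):.0%}")
print(f"9.20 am to 9.40 am: {share_within(arrivals, 20, 40):.0%}")
```

The wider window captures a larger share of the arrivals, which is why a higher level of confidence goes together with a wider range.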

A confidence interval gives an estimated range of values for a chosen confidence level. Here, 90% and 95% are the confidence levels. The 95% level is the most common, while 90% and 99% are also used, though less often.

A) 95% Confidence Interval 

The 95% Confidence Interval = [9.20 am - 9.40 am] = 9.30 am ± 10 minutes.

Just for easier understanding, I have plotted the same below. (Note that confidence intervals and prediction intervals are different. The following graphics are only for understanding.)

95% confidence interval - wider than 90% CI

B) 90% Confidence Interval 

The 90% Confidence Interval = [9.25 am - 9.35 am] = 9.30 am ± 5 minutes.

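90% confidence interval - narrower

Formula for Confidence Interval

Confidence Interval = sample mean ± z × (standard deviation / √(sample size))

In our 95% Confidence Interval of 9.30 am ± 10 minutes, 10 minutes is the margin of error, i.e. the part after the ± sign.

The margin of error is derived from three values: 1) the z value, which is set by the confidence level (about 1.96 for 95%), 2) the standard deviation and 3) the sample size. Hence, the width of the confidence interval is directly proportional to the standard deviation and inversely proportional to the square root of the sample size.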
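As a rough sketch of this calculation, the Python snippet below computes a z-based confidence interval for a mean using only the standard library. The function name `confidence_interval` and the sample numbers (30 observed arrivals with a standard deviation of 5 minutes) are assumptions chosen for illustration and are not taken from the bus example. Times are in minutes past 9.00 am.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(mean, std_dev, n, confidence=0.95):
    """Return (lower, upper) for a z-based confidence interval of a mean."""
    # Two-sided z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error = z * standard deviation / sqrt(sample size).
    margin_of_error = z * std_dev / sqrt(n)
    return mean - margin_of_error, mean + margin_of_error


# Hypothetical sample: 30 arrivals, mean of 30 minutes past 9 am (9.30 am),
# standard deviation of 5 minutes.
lower, upper = confidence_interval(mean=30, std_dev=5, n=30, confidence=0.95)
print(f"95% CI: {lower:.2f} to {upper:.2f} minutes past 9 am")
```

For small samples where the population standard deviation is unknown, a t value is commonly used in place of the z value; the sketch keeps the z value to match the formula above.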

Points to note
  • A higher confidence level produces a wider confidence interval, as shown above
    • The 90% CI is the narrowest and the 95% CI is wider; a 99% CI would be the widest
  • The higher the variability in the sample, the wider the confidence interval
  • Keeping other variables constant, a larger sample leads to a narrower confidence interval
  • Understanding how to calculate and interpret confidence intervals is an important skill for any data scientist
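
The points above can be checked with a small Python sketch that varies one input at a time and prints the resulting interval width. The helper name `ci_width` and all input values are hypothetical, and a z-based interval for a mean is assumed.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(std_dev, n, confidence):
    """Full width (2 x margin of error) of a z-based CI for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * std_dev / sqrt(n)


# Higher confidence level -> wider interval (std_dev=5, n=30 held fixed).
widths_by_level = {level: ci_width(5, 30, level) for level in (0.90, 0.95, 0.99)}

# Higher variability -> wider interval (n=30, 95% confidence held fixed).
widths_by_std = {sd: ci_width(sd, 30, 0.95) for sd in (2, 5, 10)}

# Larger sample -> narrower interval (std_dev=5, 95% confidence held fixed).
widths_by_n = {n: ci_width(5, n, 0.95) for n in (10, 30, 100)}

results = [
    ("confidence level", widths_by_level),
    ("standard deviation", widths_by_std),
    ("sample size", widths_by_n),
]
for label, widths in results:
    for value, width in widths.items():
        print(f"{label} = {value}: width = {width:.2f} minutes")
```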