Introduction to Normal Distribution
The Normal Distribution, also known as the Gaussian Distribution, is a family of bell-shaped curves. It is arguably the most important distribution in statistics. A Normal Curve often describes how the outcomes are distributed when an experiment is repeated a large number of times.
As the name suggests, the normal distribution is so common that we can find it, at least approximately, almost anywhere: in the sizes of micro-organisms, in measurements of giant planets, in student grades. Not every quantity follows it exactly (monetary income, for example, is usually right-skewed), but when we take the same kind of measurement a large number of times, the results often form a bell-like, symmetric curve.
The average of many independent observations of a random variable with finite mean and variance is itself a random variable. As the number of observations grows, the distribution of this average, suitably standardized, converges to a Normal Distribution. This result is known as the Central Limit Theorem.
Note that every Normal Distribution has two parameters, μ and σ, which we will discuss. Together, these parameters define the location and shape of the distribution.
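The convergence described above can be checked with a small simulation. This is only a sketch: the Uniform(0, 1) source distribution, the sample size of 50, the number of trials, and the helper name `sample_means` are all illustrative assumptions, not part of the original discussion.

```python
import random
import statistics


def sample_means(draw, n, trials, seed=0):
    """Return `trials` averages, each taken over `n` independent draws."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]


# Assumed source distribution: Uniform(0, 1), with mean 1/2 and variance 1/12.
n = 50
means = sample_means(lambda rng: rng.random(), n=n, trials=5000)

mu_hat = statistics.fmean(means)
sd_hat = statistics.stdev(means)
expected_sd = (1 / 12) ** 0.5 / n ** 0.5  # sigma / sqrt(n)

# For a bell-shaped result, roughly 68% of the averages should fall
# within one standard deviation of their own mean.
within_one_sd = sum(abs(m - mu_hat) <= sd_hat for m in means) / len(means)

print(f"mean of averages: {mu_hat:.4f}")
print(f"spread of averages: {sd_hat:.4f} (sigma/sqrt(n) = {expected_sd:.4f})")
print(f"share within one standard deviation: {within_one_sd:.3f}")
```

Although the individual draws are flat rather than bell-shaped, the averages should cluster around 0.5 with a spread close to σ/√n.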
Properties of Normal Curve
- Unimodal
As the name suggests, a unimodal distribution has only one mode. In other words, any distribution with a single peak is a Unimodal Distribution.
- Symmetric
A symmetric distribution is one where the shape of the curve to the left of the peak mirrors the shape of the curve to the right of the peak. Skewness is generally used as a measure of asymmetry, and for a normal distribution it is zero.
Such a distribution is special because the mean, median, and mode all lie at a single point.
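A quick way to see the symmetry, and the mean, median, and mode coinciding, is with `statistics.NormalDist` from the Python standard library (Python 3.8+). The parameter values μ=100 and σ=15 below are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical parameters, used only for this illustration.
dist = NormalDist(mu=100.0, sigma=15.0)

# The density at equal distances left and right of the mean should match.
offsets = [1.0, 7.5, 30.0]
mirror_ok = all(
    math.isclose(dist.pdf(dist.mean - d), dist.pdf(dist.mean + d))
    for d in offsets
)

# Mean, median, and mode all sit at the same point.
centers = (dist.mean, dist.median, dist.mode)

# Half of the total probability lies below the mean.
half_below = dist.cdf(dist.mean)

print(mirror_ok, centers, half_below)
```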
- Asymptotic Tail
On both sides, the tails of a normal curve may seem to meet the x-axis, that is, to reach y=0 at some value of x. In reality, they keep approaching the x-axis but never touch it.
Kurtosis is a measure that describes how heavy a distribution’s tails are relative to its center.
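The asymptotic behaviour can be seen by evaluating the density far from the mean. This sketch uses the standard normal curve and a handful of arbitrarily chosen x values.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Arbitrary points moving away from the mean.
points = (0, 3, 6, 10)
densities = [std_normal.pdf(x) for x in points]

for x, y in zip(points, densities):
    print(f"x = {x:>2}: density = {y:.3e}")
```

The values shrink very quickly but stay positive. For extremely large x, floating-point arithmetic eventually rounds the result to zero; that is a limit of the arithmetic, not of the curve.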
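As a rough illustration, kurtosis can be estimated from a sample using the fourth central moment. The helper `excess_kurtosis` below is a hypothetical name for a simple moment-based estimator (without bias correction); it subtracts 3 so that a normal distribution scores close to 0. The sample sizes and the uniform comparison are assumptions for the demo.

```python
import random
from statistics import fmean


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no bias correction)."""
    m = fmean(data)
    m2 = fmean((x - m) ** 2 for x in data)
    m4 = fmean((x - m) ** 4 for x in data)
    return m4 / m2 ** 2 - 3


rng = random.Random(1)
normal_sample = [rng.gauss(0, 1) for _ in range(20000)]
uniform_sample = [rng.random() for _ in range(20000)]

k_normal = excess_kurtosis(normal_sample)    # should be near 0
k_uniform = excess_kurtosis(uniform_sample)  # lighter tails, so negative

print(f"normal: {k_normal:.2f}, uniform: {k_uniform:.2f}")
```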
Types of Normal Curves
The values μ and σ are the mean and standard deviation of the distribution. The mean is a measure of central tendency: it is the point at which the distribution balances. The standard deviation is a measure of spread, which decides how flat the distribution is. Normal Distributions are grouped into two types, based on the values that μ and σ take.
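To make the roles of μ and σ concrete, here is a minimal sketch of the normal density written out by hand. The function name `normal_pdf` and the σ values tried are illustrative choices; the result is cross-checked against the standard library's `NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# The peak sits at x = mu; a larger sigma gives a lower, flatter curve.
sigmas = (0.5, 1.0, 2.0)
peaks = [normal_pdf(0.0, mu=0.0, sigma=s) for s in sigmas]

for s, p in zip(sigmas, peaks):
    print(f"sigma = {s}: peak height = {p:.4f}")

# Cross-check against the standard library at an arbitrary point.
matches_stdlib = math.isclose(
    normal_pdf(1.3, mu=2.0, sigma=0.7), NormalDist(2.0, 0.7).pdf(1.3)
)
```

Changing μ only slides the curve left or right, while doubling σ halves the peak height.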
- Standard Normal Distribution
It is the simplest and most commonly used case. A standard normal distribution has μ=0 and σ=1.
- General Normal Distribution
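Every general normal distribution is a stretched and shifted version of the standard normal distribution: the standard curve is stretched by a factor of σ and shifted by μ. Equivalently, if Z follows the standard normal distribution, then X = μ + σZ follows a normal distribution with mean μ and standard deviation σ.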
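The stretch-and-shift relationship can be sketched in a few lines. The values μ=70, σ=8, and x=82 are hypothetical; the point is that a probability for the general curve equals the probability for the standard curve at the standardized value z = (x − μ) / σ.

```python
import math
from statistics import NormalDist

# Hypothetical parameters and query point.
mu, sigma = 70.0, 8.0
x = 82.0

general = NormalDist(mu, sigma)
standard = NormalDist()  # mu = 0, sigma = 1

# Standardize: how many standard deviations x lies from the mean.
z = (x - mu) / sigma

p_general = general.cdf(x)
p_standard = standard.cdf(z)

# NormalDist also supports scaling and shifting by constants directly.
rebuilt = standard * sigma + mu

print(z, p_general, p_standard, rebuilt)
```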
The Empirical Rule
The Empirical Rule is also known as the “68-95-99.7 rule” or the “three-sigma rule”. It states that for any normal distribution, almost all the data lies within 3 standard deviations of the mean.
According to the empirical rule, 68.27% of the data can be observed within the range (μ ± σ). In other words, the area under the curve in pink color is 68.27% of the total area under the curve. Similarly, 95.45% and 99.73% are the areas under the curve within the limits (μ ± 2*σ) and (μ ± 3*σ) respectively.
We can use the empirical rule as a rough check of whether a given distribution is normal, by comparing the proportion of the data that lies within 1, 2, and 3 standard deviations of the mean, that is, within (μ ± σ), (μ ± 2*σ), and (μ ± 3*σ), against these percentages.
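These percentages can be reproduced from the cumulative distribution function. The helper name `coverage` below is an illustrative choice; it returns the area under the curve between μ − kσ and μ + kσ.

```python
from statistics import NormalDist


def coverage(k, mu=0.0, sigma=1.0):
    """Area under the normal curve between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


areas = {k: coverage(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} sigma: {100 * area:.2f}%")

# The result does not depend on the particular mu and sigma chosen.
same_for_any_params = abs(coverage(2, mu=10.0, sigma=3.0) - coverage(2)) < 1e-12
```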
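A rough version of this check might look like the sketch below. The helper `fraction_within`, the simulated samples, and the seed are all assumptions made for the demo. Matching the 68-95-99.7 pattern does not prove that data is normal; it is only a quick sanity check.

```python
import random
import statistics


def fraction_within(data, k):
    """Share of the data lying within k sample standard deviations of the sample mean."""
    m = statistics.fmean(data)
    s = statistics.stdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)


rng = random.Random(42)
normal_sample = [rng.gauss(50, 10) for _ in range(10000)]       # bell-shaped
skewed_sample = [rng.expovariate(1.0) for _ in range(10000)]    # right-skewed

for name, sample in (("normal", normal_sample), ("skewed", skewed_sample)):
    shares = [fraction_within(sample, k) for k in (1, 2, 3)]
    print(name, [f"{100 * s:.1f}%" for s in shares])
```

The normal sample should land close to 68%, 95%, and 99.7%, while the skewed sample should put noticeably more than 68% of its values within one standard deviation.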
The idea of a Normal Distribution is so important, yet so simple, that it underlies many statistical methods. Some commonly used examples are Regression analysis, Analysis of variance, and many Parameter estimation methods.
The Central Limit Theorem is one of the main reasons the Normal Distribution is so widely applicable. The Normal Distribution also plays an essential part in Hypothesis Testing, where the assumption is often made that the data follows a normal distribution.