Today I will teach you about the normal distribution.
Normal Distributions
In mathematics, curves can be represented by equations. For example, the equation of the circle shown in Figure 6–2 is x² + y² = r², where r is the radius. A circle can be used to represent many physical objects, such as a wheel or a gear. Even though it is not possible to manufacture a wheel that is perfectly round, the equation and the properties of a circle can be used to study many aspects of the wheel, such as area, velocity, and acceleration. In a similar manner, the theoretical curve, called a normal distribution curve, can be used to study many variables that are not perfectly normally distributed but are nevertheless approximately normal.
FIGURE 6–2: Graph of a Circle and an Application
If a random variable has a probability distribution whose graph is continuous, bell-shaped, and symmetric, it is called a normal distribution. The graph is called a normal distribution curve.
The mathematical equation for a normal distribution is
$$y = \frac{e^{-(X - \mu)^2 / (2\sigma^2)}}{\sigma\sqrt{2\pi}}$$
where
e ≈ 2.718 (≈ means “is approximately equal to”)
π ≈ 3.14
μ = population mean
σ = population standard deviation
This equation may look formidable, but in applied statistics, tables or technology are used for specific problems instead of the equation.
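As an illustration of what the technology computes, the standard normal density formula can be evaluated directly. The following is a minimal Python sketch, not part of the text; the function name normal_pdf and its default arguments are chosen here for illustration only.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height y of the normal curve at x, for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))

# Height of the curve at its center when mu = 0 and sigma = 1
print(round(normal_pdf(0), 4))  # 0.3989
```

The value at the center depends only on σ, which is one reason the y values themselves are rarely needed in applied problems.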
Another important consideration in applied statistics is that the area under a normal distribution curve is used more often than the values on the y axis. Therefore, when a normal distribution is pictured, the y axis is sometimes omitted.
Circles can be different sizes, depending on their diameters (or radii), and can be used to represent wheels of different sizes. Likewise, normal curves have different shapes and can be used to represent different variables.
The shape and position of a normal distribution curve depend on two parameters, the mean and the standard deviation. Each normally distributed variable has its own normal distribution curve, which depends on the values of the variable’s mean and standard deviation.
Suppose one normally distributed variable has μ = 0 and σ = 1, and another normally distributed variable has μ = 0 and σ = 2. As you can see in Figure 6–3(a), when the value of the standard deviation increases, the curve spreads out. If one normally distributed variable has μ = 0 and σ = 2 and another normally distributed variable has μ = 2 and σ = 2, then the shapes of the curves are the same, but the curve with μ = 2 is shifted 2 units to the right. See Figure 6–3(b).
FIGURE 6–3: Shapes of Normal Distributions
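The effect of the two parameters can also be checked numerically. The sketch below assumes Python 3.8 or later, whose standard library provides statistics.NormalDist; the dictionary name curves and its labels are illustrative only. It compares the location and height of the peak for the three curves discussed above.

```python
from statistics import NormalDist

# The three normally distributed variables used in the discussion
curves = {
    "mu=0, sigma=1": NormalDist(mu=0, sigma=1),
    "mu=0, sigma=2": NormalDist(mu=0, sigma=2),
    "mu=2, sigma=2": NormalDist(mu=2, sigma=2),
}

for label, dist in curves.items():
    peak_height = dist.pdf(dist.mean)  # the curve is highest at its mean
    print(f"{label}: peak at x = {dist.mean}, height = {peak_height:.4f}")
```

Doubling σ halves the peak height, because the total area must stay the same while the curve spreads out. Changing μ moves the peak without changing its height.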
Summary of the Properties of the Theoretical Normal Distribution
- A normal distribution curve is bell-shaped.
- The mean, median, and mode are equal and are located at the center of the distribution.
- A normal distribution curve is unimodal (i.e., it has only one mode).
- The curve is symmetric about the mean, which is equivalent to saying that its shape is the same on both sides of a vertical line passing through the center.
- The curve is continuous; that is, there are no gaps or holes. For each value of X, there is a corresponding value of Y.
- The curve never touches the x axis. Theoretically, no matter how far in either direction the curve extends, it never meets the x axis—but it gets increasingly close.
- The total area under a normal distribution curve is equal to 1.00, or 100%. This fact may seem unusual, since the curve never touches the x axis, but one can prove it mathematically by using calculus. (The proof is beyond the scope of this text.)
- The area under the part of a normal curve that lies within 1 standard deviation of the mean is approximately 0.68, or 68%; within 2 standard deviations, about 0.95, or 95%; and within 3 standard deviations, about 0.997, or 99.7%. See Figure 6–4, which also shows the area in each region.
FIGURE 6–4: Areas Under a Normal Distribution Curve
The values given in the last item of the summary follow the empirical rule for data given in Section 3–2.
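The 68%, 95%, and 99.7% figures can be reproduced with technology. The following is a small sketch, assuming Python 3.8 or later for statistics.NormalDist; the helper name area_within is introduced here for illustration. Because the area within k standard deviations of the mean is the same for every normal curve, the curve with mean 0 and standard deviation 1 is used.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```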
You must know these properties in order to solve problems involving distributions that are approximately normal.
Recall from Chapter 2 that the graphs of distributions can have many shapes. When the data values are evenly distributed about the mean, a distribution is said to be a symmetric distribution. (A normal distribution is symmetric.) Figure 6–5(a) shows a symmetric distribution. When the majority of the data values fall to the left or right of the mean, the distribution is said to be skewed. When the majority of the data values fall to the right of the mean, the distribution is said to be a negatively or left-skewed distribution. The mean is to the left of the median, and the mean and the median are to the left of the mode. See Figure 6–5(b). When the majority of the data values fall to the left of the mean, a distribution is said to be a positively or right-skewed distribution. The mean falls to the right of the median, and both the mean and the median fall to the right of the mode. See Figure 6–5(c).
FIGURE 6–5: Normal and Skewed Distributions
The “tail” of the curve indicates the direction of skewness (right is positive, left is negative). These distributions can be compared with the ones shown in Figure 3–1. Both types follow the same principles.
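The ordering of the mean, median, and mode in skewed distributions can be seen in a small example. The two data sets below are hypothetical, made up only for this sketch; the second is a mirror image of the first.

```python
from statistics import mean, median, mode

# Hypothetical data with a long tail to the right (positively skewed)
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data: long tail to the left (negatively skewed)
left_skewed = [16 - x for x in right_skewed]

for name, data in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    print(name, "mean =", mean(data), "median =", median(data), "mode =", mode(data))
```

For the right-skewed data the mean exceeds the median, which exceeds the mode; for the left-skewed data the order is reversed.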
The Standard Normal Distribution
Since each normally distributed variable has its own mean and standard deviation, as stated earlier, the shape and location of these curves will vary. In practical applications, then, you would have to have a table of areas under the curve for each variable. To simplify this situation, statisticians use what is called the standard normal distribution.
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.
The standard normal distribution is shown in Figure 6–6.
FIGURE 6–6: Standard Normal Distribution
The values under the curve indicate the proportion of area in each section. For example, the area between the mean and 1 standard deviation above or below the mean is about 0.3413, or 34.13%.
The formula for the standard normal distribution is
$$y = \frac{e^{-z^2/2}}{\sqrt{2\pi}}$$
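All normally distributed variables can be transformed into the standard normally distributed variable by using the formula for the standard score:
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad \text{or} \quad z = \frac{X - \mu}{\sigma}$$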
This is the same formula used in Section 3–3. The use of this formula will be explained in Section 6–3.
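The standard score subtracts the mean from a value and divides the result by the standard deviation. A minimal Python sketch follows; the function name z_score and the numbers in the example (a mean of 100 and a standard deviation of 15) are hypothetical and serve only to illustrate the arithmetic.

```python
def z_score(x, mu, sigma):
    """Standard score: (value - mean) / standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical variable with mean 100 and standard deviation 15
print(z_score(130, 100, 15))  # 2.0
print(z_score(85, 100, 15))   # -1.0
```

A positive result means the value lies above the mean, and a negative result means it lies below.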
As stated earlier, the area under a normal distribution curve is used to solve practical application problems, such as finding the percentage of adult women whose height is between 5 feet 4 inches and 5 feet 7 inches, or finding the probability that a new battery will last longer than 4 years. Hence, the major emphasis of this section will be to show the procedure for finding the area under the standard normal distribution curve for any z value. The applications will be shown in Section 6–2. Once the X values are transformed by using the preceding formula, they are called z values. The z value or z score is actually the number of standard deviations that a particular X value is away from the mean. Table E in Appendix A gives the area (to four decimal places) under the standard normal curve for any z value from −3.49 to 3.49.
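Areas like those in a table of the standard normal distribution can also be obtained with technology. The sketch below assumes Python 3.8 or later for statistics.NormalDist, and it assumes the table lists the cumulative area to the left of each z value; the helper name area_left_of is introduced here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_left_of(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return standard_normal.cdf(z)

# Area to the left of z = 1.00, rounded to four decimal places
print(round(area_left_of(1.0), 4))                      # 0.8413
# Area between the mean (z = 0) and z = 1.00
print(round(area_left_of(1.0) - area_left_of(0.0), 4))  # 0.3413
# Area to the right of z = 1.00
print(round(1 - area_left_of(1.0), 4))                  # 0.1587
```

An area between two z values is found by subtracting the two left-hand areas, and an area to the right of a z value is 1 minus the left-hand area.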