Confidence Intervals for the Mean When σ Is Unknown
When σ is known and either the sample size is 30 or more or the population is normally distributed (when the sample size is less than 30), the confidence interval for the mean can be found by using the z distribution, as shown in Section 7–1. However, most of the time the value of σ is not known, so it must be estimated by s, the standard deviation of the sample. When s is used, especially when the sample size is small, critical values greater than the values for zα/2 are used in confidence intervals in order to keep the interval at a given confidence level, such as 95%. These values are taken from the Student’s t distribution, most often called the t distribution.
To use this method, the samples must be simple random samples, and the population from which the samples were taken must be normally or approximately normally distributed, or the sample size must be 30 or more.
Some important characteristics of the t distribution are described now.
Characteristics of the t Distribution
The t distribution shares some characteristics of the standard normal distribution and differs from it in others. The t distribution is similar to the standard normal distribution in these ways:
- It is bell-shaped.
- It is symmetric about the mean.
- The mean, median, and mode are equal to 0 and are located at the center of the distribution.
- The curve approaches but never touches the x axis.
The t distribution differs from the standard normal distribution in the following ways:
- The variance is greater than 1.
- The t distribution is actually a family of curves based on the concept of degrees of freedom, which is related to sample size.
- As the sample size increases, the t distribution approaches the standard normal distribution. See Figure 7–6.
FIGURE 7–6 The t Family of Curves
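The first and third differences can be illustrated numerically. The sketch below assumes the standard result, not derived in this section, that a t distribution with more than 2 degrees of freedom has variance d.f./(d.f. − 2). The function name `t_variance` is only illustrative.

```python
def t_variance(df):
    """Variance of the t distribution; finite only for df > 2 (standard result)."""
    if df <= 2:
        raise ValueError("variance is finite only for more than 2 degrees of freedom")
    return df / (df - 2)

# The variance always exceeds 1 and shrinks toward 1 as d.f. grows,
# which is one way to see the t curve approaching the standard normal curve.
for df in (3, 5, 10, 30, 100, 1000):
    print(f"d.f. = {df:>4}: variance = {t_variance(df):.4f}")
```

The printed values stay above 1 for every d.f. and move toward 1, the variance of the standard normal distribution.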
Many statistical distributions use the concept of degrees of freedom, and the formulas for finding the degrees of freedom vary for different statistical tests. The degrees of freedom are the number of values that are free to vary after a sample statistic has been computed, and they tell the researcher which specific curve to use when a distribution consists of a family of curves.
For example, if the mean of 5 values is 10, then 4 of the 5 values are free to vary. But once 4 values are selected, the fifth value must be a specific number to get a sum of 50, since 50 ÷ 5 = 10. Hence, the degrees of freedom are 5 − 1 = 4, and this value tells the researcher which t curve to use.
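The same idea can be sketched in a few lines of Python. The four "free" values below are arbitrary and hypothetical. Once they are chosen, the fifth value is forced by the requirement that the mean equal 10.

```python
def forced_last_value(free_values, target_mean):
    """Return the single value that makes the full sample hit target_mean."""
    n = len(free_values) + 1              # sample size including the forced value
    return target_mean * n - sum(free_values)

free_values = [7, 12, 9, 15]              # any 4 numbers may be chosen (hypothetical)
last = forced_last_value(free_values, target_mean=10)
sample = free_values + [last]

print(last)                               # 7, because the sum must be 50
print(sum(sample) / len(sample))          # 10.0
```

Changing any of the four free values changes the forced value, but there is never any freedom left for the fifth one, which is why d.f. = 5 − 1 = 4.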
The symbol d.f. will be used for degrees of freedom. The degrees of freedom for a confidence interval for the mean are found by subtracting 1 from the sample size. That is, d.f. = n – 1. Note: For some statistical tests used later in this book, the degrees of freedom are not equal to n − 1.
The formula for finding the confidence interval using the t distribution has a critical value tα/2.
The values for tα/2 are found in Table F in Appendix A. The top row of Table F, labeled Confidence Intervals, is used to get these values. The other two rows, labeled One tail and Two tails, will be explained in Chapter 8 and should not be used here.
EXAMPLE 7–5
Find the tα/2 value for a 95% confidence interval when the sample size is 22.
SOLUTION
The d.f. = 22 − 1, or 21. Find 21 in the left column and 95% in the row labeled Confidence Intervals. The intersection where the two meet gives the value for tα/2, which is 2.080. See Figure 7–7.
FIGURE 7–7 Finding tα/2 for Example 7–5
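The table value can be cross-checked without Table F. The following is a minimal sketch, not the method used in this text: it integrates the t density numerically with Simpson's rule and uses bisection to find the point where the central area equals the confidence level. The function names (`t_pdf`, `central_area`, `t_critical`) are illustrative. In practice a statistics library such as SciPy (`scipy.stats.t.ppf`) would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(t, df, steps=2000):
    """Area under the t curve between -t and +t (Simpson's rule, steps even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3              # doubled because the curve is symmetric

def t_critical(confidence, df):
    """Find t_{alpha/2} so that the central area equals the confidence level."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < confidence:   # widen until the target is bracketed
        hi *= 2
    for _ in range(60):                        # bisection
        mid = (lo + hi) / 2
        if central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example 7-5: sample size 22, so d.f. = 21, at 95% confidence.
print(f"{t_critical(0.95, 21):.3f}")
```

Rounded to three decimal places, the result should agree with the Table F entry of 2.080.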
When d.f. is greater than 30, it may fall between two table values. For example, if d.f. = 68, it falls between 65 and 70. Many texts say to use the closest value (68 is closer to 70 than to 65); however, in this text a conservative approach is used: always round down to the nearest table value. In this case, 68 rounds down to 65.
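The round-down rule amounts to picking the largest tabulated d.f. that does not exceed the actual d.f. A minimal sketch follows. The list `TABLE_DF` is a hypothetical excerpt of the d.f. column, not a copy of Table F, and the function name is illustrative.

```python
from bisect import bisect_right

# Hypothetical excerpt of the d.f. column; the real Table F has more rows.
TABLE_DF = [50, 55, 60, 65, 70, 75, 80]

def table_df(df, rows=TABLE_DF):
    """Largest tabulated d.f. that does not exceed df (the round-down rule)."""
    i = bisect_right(rows, df)            # number of tabulated values <= df
    if i == 0:
        raise ValueError("df is below the smallest tabulated value")
    return rows[i - 1]

print(table_df(68))   # 65
print(table_df(70))   # 70
```

Rounding down is conservative because fewer degrees of freedom give a larger tα/2 and therefore a slightly wider interval.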
Note: At the bottom of Table F where d.f. is large or ∞, the zα/2 values can be found for specific confidence intervals. The reason is that as the degrees of freedom increase, the t distribution approaches the standard normal distribution.
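The formula for finding a confidence interval for the mean by using the t distribution is given next.
Formula for a Specific Confidence Interval for the Mean When σ Is Unknown
$$\bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$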
The degrees of freedom are n – 1.
The assumptions for finding a confidence interval for a mean when σ is unknown are given next.
Assumptions for Finding a Confidence Interval for a Mean When σ Is Unknown
- The sample is a random sample.
- Either n ≥ 30 or the population is normally distributed when n < 30.
In this text, the assumptions will be stated in the exercises; however, when encountering statistics in other situations, you must check to see that these assumptions have been met before proceeding.
EXAMPLE 7–6 Temperature on Thanksgiving Day
A random sample of high temperatures for 12 recent Thanksgiving Days had an average of 42°F. Assume the variable is normally distributed and the standard deviation of the sample temperatures was 8°F. Find the 95% confidence interval of the population mean for the temperatures.
SOLUTION
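Since σ is unknown and s must replace it, the t distribution (Table F) must be used for the confidence interval. Hence, with 11 degrees of freedom, tα/2 = 2.201. The 95% confidence interval can be found by substituting in the formula:
$$42 - 2.201\left(\frac{8}{\sqrt{12}}\right) < \mu < 42 + 2.201\left(\frac{8}{\sqrt{12}}\right)$$
$$42 - 5.08 < \mu < 42 + 5.08$$
$$36.92 < \mu < 47.08$$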
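As a numerical check of this substitution, the interval can be computed with a short sketch. The function name `t_interval` is illustrative, and the critical value 2.201 is passed in as read from Table F rather than computed.

```python
import math

def t_interval(mean, s, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)    # margin of error
    return mean - margin, mean + margin

# Example 7-6: n = 12, sample mean 42, s = 8, t_{alpha/2} = 2.201 (d.f. = 11).
lower, upper = t_interval(mean=42, s=8, n=12, t_crit=2.201)
print(f"{lower:.1f} < mu < {upper:.1f}")   # 36.9 < mu < 47.1
```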
Therefore, one can be 95% confident that the population mean for the temperatures on Thanksgiving Days is between 36.9°F and 47.1°F.
EXAMPLE 7–7 Home Fires Started by Candles
The data represent a random sample of the number of home fires started by candles for the past several years. (Data are from the National Fire Protection Association.) Find the 99% confidence interval for the mean number of home fires started by candles each year.
SOLUTION
Step 1
Find the mean and standard deviation for the data. Use the formulas in Chapter 3 or your calculator. The mean X̄ = 7041.4. The standard deviation s = 1610.3.
Step 2
Find tα/2 in Table F. Use the 99% confidence interval with d.f. = 6. It is 3.707.
Step 3
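Substitute in the formula and solve.
$$7041.4 - 3.707\left(\frac{1610.3}{\sqrt{7}}\right) < \mu < 7041.4 + 3.707\left(\frac{1610.3}{\sqrt{7}}\right)$$
$$7041.4 - 2256.2 < \mu < 7041.4 + 2256.2$$
$$4785.2 < \mu < 9297.6$$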
One can be 99% confident that the population mean number of home fires started by candles each year is between 4785.2 and 9297.6, based on a sample of home fires occurring over a period of 7 years.
Students sometimes have difficulty deciding whether to use zα/2 or tα/2 values when finding confidence intervals for the mean. As stated previously, when σ is known, zα/2 values can be used no matter what the sample size is, as long as the variable is normally distributed or n ≥ 30. When σ is unknown and n ≥ 30, then s can be used in the formula and tα/2 values can be used. Finally, when σ is unknown and n < 30, s is used in the formula and tα/2 values are used, as long as the variable is approximately normally distributed. These rules are summarized in Figure 7–8.
FIGURE 7–8 When to Use the z or t Distribution
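The decision rules can also be written as a small function. This is a sketch of the rules as stated in this section. The function name and the choice to return `None` when neither rule applies (a small sample from a population that is not approximately normal) are assumptions made for illustration.

```python
def choose_distribution(sigma_known, n, approx_normal):
    """Return 'z' or 't' per the rules above, or None if neither rule applies."""
    if n < 30 and not approx_normal:
        return None                       # assumptions of both methods are violated
    return "z" if sigma_known else "t"

print(choose_distribution(sigma_known=True, n=50, approx_normal=False))   # z
print(choose_distribution(sigma_known=False, n=12, approx_normal=True))   # t
print(choose_distribution(sigma_known=False, n=10, approx_normal=False))  # None
```

The check on sample size and normality comes first because it applies to both methods. After that, the only remaining question is whether σ is known.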