Confidence Intervals for the Mean When σ Is Known

Today I will teach you how to calculate confidence intervals for the mean when σ is known.

In order to compute the confidence interval for the mean when σ is known, there are two assumptions that must be met.

Assumptions for Finding a Confidence Interval for a Mean When σ Is Known

The sample is a random sample.
Either n ≥ 30 or the population is normally distributed when n < 30.

In this book, the assumptions will be stated in the exercises; however, when encountering statistics in other situations, you must check to see that these assumptions have been met before proceeding.

Some statistical techniques are called robust. This means that the distribution of the variable can depart somewhat from normality, and valid conclusions can still be obtained.

When you estimate a population mean, you should use the sample mean.

You might ask why other measures of central tendency, such as the median and mode, are not used to estimate the population mean. The reason is that the means of samples vary less than other statistics (such as medians and modes) when many samples are selected from the same population. Therefore, the sample mean is the best estimate of the population mean.
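The claim that sample means vary less than sample medians can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with μ = 50 and σ = 10, and the sample size, number of trials, and seed are all arbitrary choices, not values from the text. For populations that are far from normal, the comparison can come out differently.

```python
import random
import statistics


def spread_of_estimators(n=30, trials=2000, mu=50.0, sigma=10.0, seed=1):
    """Return (sd of sample means, sd of sample medians) over repeated samples.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        # Draw one random sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)


sd_means, sd_medians = spread_of_estimators()
print(f"standard deviation of the sample means:   {sd_means:.3f}")
print(f"standard deviation of the sample medians: {sd_medians:.3f}")
```

Under these assumptions the sample means should cluster more tightly around μ than the sample medians do.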

Formula for the Confidence Interval of the Mean for a Specific α When σ Is Known

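$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

For a 90% confidence interval, z_α/2 = 1.65; for a 95% confidence interval, z_α/2 = 1.96; and for a 99% confidence interval, z_α/2 = 2.58.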

The term $z_{\alpha/2}(\sigma/\sqrt{n})$ is called the margin of error (also called the maximum error of the estimate). For a specific value, say, α = 0.05, 95% of the sample means will fall within this error value on either side of the population mean, as previously explained.

Rounding Rule for a Confidence Interval for a Mean

When you are computing a confidence interval for a population mean by using raw data, round off to one more decimal place than the number of decimal places in the original data. When you are computing a confidence interval for a population mean by using a sample mean and a standard deviation, round off to the same number of decimal places as given for the mean.

EXAMPLE 7–1 Days It Takes to Sell an Aveo

A researcher wishes to estimate the number of days it takes an automobile dealer to sell a Chevrolet Aveo. A random sample of 50 cars had a mean time on the dealer’s lot of 54 days. Assume the population standard deviation to be 6.0 days. Find the best point estimate of the population mean and the 95% confidence interval of the population mean.

Source: Based on information obtained from Power Information Network.

SOLUTION

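The best point estimate of the population mean is 54 days. For the 95% confidence interval use z = 1.96.

$$54 - 1.96\left(\frac{6.0}{\sqrt{50}}\right) < \mu < 54 + 1.96\left(\frac{6.0}{\sqrt{50}}\right)$$

$$54 - 1.7 < \mu < 54 + 1.7$$

$$52.3 < \mu < 55.7$$

Hence, one can say with 95% confidence that the interval between 52.3 and 55.7 days contains the population mean, based on a sample of 50 automobiles.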
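The arithmetic in Example 7–1 can be reproduced with a few lines of Python. This is a minimal sketch, assuming the interval has the form sample mean ± z(σ/√n); the function name `z_interval` is a hypothetical helper introduced here for illustration.

```python
import math


def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for the mean when sigma is known.

    Returns (lower, upper) = sample_mean -/+ z * sigma / sqrt(n).
    """
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from Example 7-1: n = 50, mean = 54 days, sigma = 6.0, z = 1.96.
lower, upper = z_interval(54, 6.0, 50, 1.96)
print(round(lower, 1), round(upper, 1))  # prints: 52.3 55.7
```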

EXAMPLE 7–2 Number of Patients

A large urgent care center with 4 doctors found that they can see an average of 18 patients per hour. Assume the standard deviation is 3.2. A random sample of 42 hours was selected. Find the 99% confidence interval of the mean.

SOLUTION

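The best point estimate of the population mean is 18. The 99% confidence interval is

$$18 - 2.58\left(\frac{3.2}{\sqrt{42}}\right) < \mu < 18 + 2.58\left(\frac{3.2}{\sqrt{42}}\right)$$

$$18 - 1.3 < \mu < 18 + 1.3$$

$$16.7 < \mu < 19.3$$

$$17 < \mu < 19 \quad \text{(rounded)}$$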

Hence, one can be 99% confident (rounded values) that the mean number of patients that the center can care for in 1 hour is between 17 and 19.
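To see how the choice of confidence level changes the width of the interval, the data of Example 7–2 can be run through the three table values of z given earlier (1.65, 1.96, and 2.58). The sketch below assumes the interval form sample mean ± z(σ/√n); the variable names are illustrative.

```python
import math

# Table critical values quoted in the text, keyed by confidence level (%).
TABLE_Z = {90: 1.65, 95: 1.96, 99: 2.58}

# Values from Example 7-2.
sample_mean, sigma, n = 18, 3.2, 42
standard_error = sigma / math.sqrt(n)

intervals = {}
for level, z in TABLE_Z.items():
    margin = z * standard_error
    intervals[level] = (sample_mean - margin, sample_mean + margin)
    lower, upper = intervals[level]
    print(f"{level}%: {lower:.1f} < mu < {upper:.1f}  (margin {margin:.2f})")
```

The 99% interval is the widest of the three: more confidence is bought with a larger margin of error.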

According to the central limit theorem, approximately 95% of the sample means fall within 1.96 standard deviations of the population mean if the sample size is 30 or more, or if σ is known when n is less than 30 and the population is normally distributed. If it were possible to build a confidence interval about each sample mean, as was done in Examples 7–1 and 7–2 for μ, then 95% of these intervals would contain the population mean, as shown in Figure 7–3. Hence, you can be 95% confident that an interval built around a specific sample mean would contain the population mean. If you desire to be 99% confident, you must enlarge the confidence intervals so that 99 out of every 100 intervals contain the population mean.

FIGURE 7–3

95% Interval for Sample Means
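The idea that about 95 out of every 100 such intervals contain the population mean can be illustrated by simulation. The following sketch is not part of the original discussion: it assumes a normal population with μ = 100 and σ = 15, samples of size 30, and a fixed random seed, all chosen arbitrarily for the demonstration.

```python
import math
import random


def coverage(mu=100.0, sigma=15.0, n=30, z=1.96, trials=2000, seed=7):
    """Fraction of simulated intervals that contain the true mean mu.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Build one sample mean, then one interval around it.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - margin < mu < sample_mean + margin:
            hits += 1
    return hits / trials


rate = coverage()
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should be close to 0.95, although it will not equal 0.95 exactly in any single run.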

Since other confidence intervals (besides 90, 95, and 99%) are sometimes used in statistics, an explanation of how to find the values for z_α/2 is necessary. As stated previously, the Greek letter α represents the total of the areas in both tails of the normal distribution. The value for α is found by subtracting the decimal equivalent for the desired confidence level from 1. For example, if you wanted to find the 98% confidence interval, you would change 98% to 0.98 and find α = 1 − 0.98, or 0.02. Then α/2 is obtained by dividing α by 2. So α/2 is 0.02/2, or 0.01. Finally, z_0.01 is the z value that will give an area of 0.01 in the right tail of the standard normal distribution curve. See Figure 7–4.

FIGURE 7–4

Finding α/2 for a 98% Confidence Interval

Once α/2 is determined, the corresponding z_α/2 value can be found by using the procedure shown in Chapter 6, which is reviewed here. To get the z_α/2 value for a 98% confidence interval, subtract 0.01 from 1.0000 to get 0.9900. Next, locate the area that is closest to 0.9900 (in this case, 0.9901) in Table E, and then find the corresponding z value. In this example, it is 2.33. See Figure 7–5.

FIGURE 7–5

Finding z_α/2 for a 98% Confidence Interval

For confidence intervals, only the positive z value is used in the formula.
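The table lookup just described can also be done in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `critical_z` is hypothetical. It finds the z value with area 1 − α/2 to its left, which is the same as area α/2 in the right tail.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Positive z value that leaves alpha/2 in the right tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {critical_z(level):.3f}")
```

The values printed carry more decimal places than Table E, so they can differ slightly from the rounded table values used in this text.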

When the original variable is normally distributed and σ is known, the standard normal distribution can be used to find confidence intervals regardless of the size of the sample. When n ≥ 30, the distribution of means will be approximately normal even if the original distribution of the variable departs from normality.

EXAMPLE 7–3 Credit Union Assets

The following data represent a random sample of the assets (in millions of dollars) of 30 credit unions in southwestern Pennsylvania. Assume the population standard deviation is 14.405. Find the 90% confidence interval of the mean.

12.23 16.56 4.39

2.89 1.24 2.17

13.19 9.16 1.42

73.25 1.91 14.64

11.59 6.69 1.06

8.74 3.17 18.13

7.92 4.78 16.85

40.22 2.42 21.58

5.01 1.47 12.24

2.27 12.77 2.76

Source: Pittsburgh Post Gazette.

SOLUTION

Step 1

Find the mean for the data. Use the formula shown in Chapter 3 or your calculator. The mean is $\bar{X} = 11.091$. Assume the standard deviation of the population is 14.405.

Step 2

Find α/2. Since the 90% confidence interval is to be used, α = 1 − 0.90 = 0.10, and

$$\frac{\alpha}{2} = \frac{0.10}{2} = 0.05$$

Step 3

Find z_α/2. Subtract 0.05 from 1.000 to get 0.9500. The corresponding z value obtained from Table E is 1.65. (Note: This value is found by using the z value for an area between 0.9495 and 0.9505. A more precise z value obtained mathematically is 1.645 and is sometimes used; however, 1.65 will be used in this text.)

Step 4

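Substitute in the formula

$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

$$11.091 - 1.65\left(\frac{14.405}{\sqrt{30}}\right) < \mu < 11.091 + 1.65\left(\frac{14.405}{\sqrt{30}}\right)$$

$$11.091 - 4.339 < \mu < 11.091 + 4.339$$

$$6.752 < \mu < 15.430$$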

Hence, one can be 90% confident that the population mean of the assets of all credit unions is between $6.752 million and $15.430 million, based on a sample of 30 credit unions.
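The four steps of Example 7–3 can be carried out directly from the raw data. This is a minimal sketch, assuming the interval form sample mean ± z(σ/√n) and the table value z = 1.65 used in the text. Because the program does not round the mean or the margin of error in the intermediate steps, the last decimal place of an endpoint may differ slightly from the hand computation.

```python
import math
import statistics

# Assets (in millions of dollars) of the 30 credit unions in Example 7-3.
assets = [
    12.23, 16.56, 4.39, 2.89, 1.24, 2.17, 13.19, 9.16, 1.42, 73.25,
    1.91, 14.64, 11.59, 6.69, 1.06, 8.74, 3.17, 18.13, 7.92, 4.78,
    16.85, 40.22, 2.42, 21.58, 5.01, 1.47, 12.24, 2.27, 12.77, 2.76,
]

sigma = 14.405   # population standard deviation, given in the example
z = 1.65         # table value for 90% confidence

mean = statistics.mean(assets)                  # Step 1
margin = z * sigma / math.sqrt(len(assets))     # Steps 2-4
lower, upper = mean - margin, mean + margin

print(f"mean = {mean:.3f}")
print(f"{lower:.3f} < mu < {upper:.3f}")
```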

When σ is unknown, s can be used as an estimate of σ, but a different distribution is used for the critical values. This method is explained in Section 7–3.

Comment to Computer and Statistical Calculator Users

This chapter and subsequent chapters include examples using raw data. If you are using computer or calculator programs to find the solutions, the answers you get may vary somewhat from the ones given in the text. This is so because computers and calculators do not round the answers in the intermediate steps and can use 12 or more decimal places for computation. Also, they use more-exact critical values than those given in the tables in the back of this book.

When you are calculating other statistics, such as the z, t, χ², or F values (shown in this chapter and later chapters), it is permissible to carry out the values of means, variances, and standard deviations to more decimal places than specified by the rounding rules in Chapter 3. This will give answers that are closer to the calculator or computer values. These small discrepancies are part of statistics.
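The size of these discrepancies can be seen by computing one interval twice, once with the table value of z and once with a more exact value. The sketch below reuses the summary figures of Example 7–3 (mean 11.091, σ = 14.405, n = 30) and takes the more exact critical value from `statistics.NormalDist` in the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

sample_mean, sigma, n = 11.091, 14.405, 30
standard_error = sigma / math.sqrt(n)

z_table = 1.65                         # value used in this text
z_exact = NormalDist().inv_cdf(0.95)   # more exact value for 90% confidence

for label, z in (("table", z_table), ("exact", z_exact)):
    margin = z * standard_error
    print(f"{label}: z = {z:.4f}, "
          f"{sample_mean - margin:.3f} < mu < {sample_mean + margin:.3f}")
```

The two intervals differ only slightly, which is the kind of small discrepancy to expect between a hand computation and software output.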

Sample size determination is closely related to statistical estimation. Quite often you ask, How large a sample is necessary to make an accurate estimate? The answer is not simple, since it depends on three things: the margin of error, the population standard deviation, and the degree of confidence. For example, how close to the true mean do you want to be (2 units, 5 units, etc.), and how confident do you wish to be (90, 95, 99%, etc.)? For the purposes of this chapter, it will be assumed that the population standard deviation of the variable is known or has been estimated from a previous study.

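The formula for sample size is derived from the margin of error formula

$$E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

and this formula is solved for n as follows:

$$E\sqrt{n} = z_{\alpha/2} \cdot \sigma$$

$$\sqrt{n} = \frac{z_{\alpha/2} \cdot \sigma}{E}$$

Hence,

$$n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$$

Formula for the Minimum Sample Size Needed for an Interval Estimate of the Population Mean

$$n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$$

where E is the margin of error. If necessary, round the answer up to obtain a whole number. That is, if there is any fraction or decimal portion in the answer, use the next whole number for sample size n.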

EXAMPLE 7–4 Automobile Thefts

A sociologist wishes to estimate the average number of automobile thefts in a large city per day within 2 automobiles. He wishes to be 99% confident, and from a previous study the standard deviation was found to be 4.2. How many days should he select to survey?

SOLUTION

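Since α = 0.01 (or 1 − 0.99), z_α/2 = 2.58 and E = 2. Substitute in the formula

$$n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2 = \left[\frac{(2.58)(4.2)}{2}\right]^2 = 29.35$$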

Round the value up to 30. Therefore, to be 99% confident that the estimate is within 2 automobiles of the true mean, the sociologist needs to sample the thefts for at least 30 days.
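The sample size computation, including the round-up step, can be sketched in Python as follows. The function name `minimum_sample_size` is a hypothetical helper, and the sketch assumes the sample size is the square of z·σ/E rounded up to the next whole number.

```python
import math


def minimum_sample_size(z, sigma, margin_of_error):
    """Smallest whole n with z * sigma / sqrt(n) <= margin_of_error."""
    # math.ceil always rounds up, matching the rule for sample sizes.
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Values from Example 7-4: z = 2.58, sigma = 4.2, E = 2.
n = minimum_sample_size(2.58, 4.2, 2)
print(n)  # prints: 30
```

Using `math.ceil` rather than `round` matters here: a computed value such as 29.35 would round off to 29, which is too small to guarantee the desired margin of error.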

In most cases in statistics, we round off; however, when determining sample size, we always round up to the next whole number.

Notice that when you are finding the sample size, the size of the population is irrelevant when the population is large or infinite or when sampling is done with replacement. In other cases, an adjustment is made in the formula for computing sample size. This adjustment is beyond the scope of this book.

The formula for determining sample size requires the use of the population standard deviation. What happens when σ is unknown? In this case, an attempt is made to estimate σ. One such way is to use the standard deviation s obtained from a sample taken previously as an estimate for σ. The standard deviation can also be estimated by dividing the range by 4.
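As a rough illustration of the range-divided-by-4 estimate, the sketch below plugs it into the sample size computation. The smallest and largest values (10 and 50), the margin of error (3), and the helper name `sigma_from_range` are all hypothetical, chosen only to show the arithmetic; the sample size is again assumed to be the square of z·σ/E, rounded up.

```python
import math


def sigma_from_range(smallest, largest):
    """Rough estimate of the standard deviation: range divided by 4."""
    return (largest - smallest) / 4


# Hypothetical figures: values are believed to run from about 10 to 50.
sigma_estimate = sigma_from_range(10, 50)

# Sample size for 95% confidence (z = 1.96) and a margin of error of 3.
n = math.ceil((1.96 * sigma_estimate / 3) ** 2)
print(f"estimated sigma = {sigma_estimate}, sample size = {n}")
```

Because the estimate of σ is only approximate, the resulting sample size should be read as a planning figure rather than an exact requirement.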

Sometimes, interval estimates rather than point estimates are reported. For instance, you may read a statement: “On the basis of a sample of 200 families, the survey estimates that an American family of two spends an average of $84 per week for groceries. One can be 95% confident that this estimate is accurate within $3 of the true mean.” This statement means that the 95% confidence interval of the true mean is

$$\$84 - \$3 < \mu < \$84 + \$3$$

$$\$81 < \mu < \$87$$

The algebraic derivation of the formula for a confidence interval is shown next. As explained in Chapter 6, the sampling distribution of the mean is approximately normal when large samples (n ≥ 30) are taken from a population. Also,