Confidence Intervals and Sample Size

Today I will teach you about confidence intervals and sample size.

The main objective of this section is to explain the basics of estimating a parameter such as a mean, proportion, or variance.

Sample measures (i.e., statistics) are used to estimate population measures (i.e., parameters). For example, a sample mean is used to estimate a population mean. A sample proportion is used to estimate a population proportion. These statistics are called estimators.

Suppose a college president wishes to estimate the average age of students attending classes this semester. The president could select a random sample of 100 students and find the average age of these students, say, 22.3 years. From the sample mean, the president could infer that the average age of all the students is 22.3 years. This type of estimate is called a point estimate.

Another example might be that a restaurant owner wishes to see what proportion of Americans purchase take-out food every day. A random sample of 1,000 individuals shows that 0.11 or 11% of the people in the sample purchase take-out food every day. The same is true of other statistics. These types of estimates are called point estimates.

A point estimate is a specific numerical value estimate of a parameter. The best point estimate of the population mean μ is the sample mean X̄.

A good estimator should satisfy the three properties described next.

Three Properties of a Good Estimator

  1. The estimator should be an unbiased estimator. That is, the expected value or the mean of the estimates obtained from samples of a given size is equal to the parameter being estimated.
  2. The estimator should be consistent. For a consistent estimator, as sample size increases, the value of the estimator approaches the value of the parameter estimated.
  3. The estimator should be a relatively efficient estimator. That is, of all the statistics that can be used to estimate a parameter, the relatively efficient estimator has the smallest variance.
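The three properties can be illustrated with a small simulation. The sketch below is not from the original text: it assumes a hypothetical population of 10,000 student ages drawn from a normal distribution (mean 22.3, standard deviation 4.0), and the names `simulate` and `results` are made up for illustration. It draws repeated random samples and compares the sample mean with the sample median as estimators of the population mean.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population of student ages (an assumption for illustration only).
population = [rng.gauss(22.3, 4.0) for _ in range(10_000)]
mu = statistics.fmean(population)  # the parameter being estimated


def simulate(n, repeats=1000):
    """Draw `repeats` random samples of size n; return their means and medians."""
    means, medians = [], []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return means, medians


results = {}
for n in (10, 100, 1000):
    means, medians = simulate(n)
    results[n] = {
        "avg_of_means": statistics.fmean(means),      # unbiasedness check
        "sd_of_means": statistics.stdev(means),       # consistency check
        "sd_of_medians": statistics.stdev(medians),   # efficiency comparison
    }
    print(n, {key: round(value, 3) for key, value in results[n].items()})

print("population mean:", round(mu, 3))
```

In a run of this sketch, the average of the sample means should sit very close to the population mean (unbiasedness), the spread of the sample means should shrink as n grows (consistency), and the sample mean should vary less than the sample median at the same n (relative efficiency, here for normally distributed data).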

As stated in Chapter 6, the sample mean will be, for the most part, somewhat different from the population mean due to sampling error. Therefore, you might ask a second question: How good is a point estimate? The answer is that there is no way of knowing how close a particular point estimate is to the population mean.

This answer creates some doubt about the accuracy of point estimates. For this reason, statisticians prefer another type of estimate, called an interval estimate.

An interval estimate of a parameter is an interval or a range of values used to estimate the parameter. This estimate may or may not contain the value of the parameter being estimated.

In an interval estimate, the parameter is specified as being between two values. For example, an interval estimate for the average age of all students might be 21.9 < μ < 22.7, or 22.3 ± 0.4 years.

Similarly, an interval estimate for a proportion might state that the proportion of adults who purchase take-out food every day is between 0.107 and 0.113, that is, between 10.7% and 11.3%, or 11% ± 0.3%.

Either the interval contains the parameter or it does not. A degree of confidence (usually a percent) must be assigned before an interval estimate is made. For instance, you may wish to be 95% confident that the interval contains the true population mean. Another question then arises. Why 95%? Why not 99 or 99.5%?

If you desire to be more confident, such as 99 or 99.5% confident, then you must make the interval larger. For example, a 99% confidence interval for the mean age of college students might be 21.7 < μ < 22.9, or 22.3 ± 0.6. Hence, a tradeoff occurs. To be more confident that the interval contains the true population mean, you must make the interval wider.

The confidence level of an interval estimate of a parameter is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and that the estimation process on the same parameter is repeated.

A confidence interval is a specific interval estimate of a parameter determined by using data obtained from a sample and by using the specific confidence level of the estimate.

Intervals constructed in this way are called confidence intervals. Three common confidence intervals are used: the 90%, the 95%, and the 99% confidence intervals.

A population has a fixed value for its mean or proportion, and a confidence interval constructed from a sample either includes that parameter or does not. If it were possible to construct 95% confidence intervals for a parameter such as a mean or proportion from all possible samples of a specific size, 95% of them would contain the population parameter. See Figure 7–1.

FIGURE 7–1 95% Confidence Intervals for Each Sample Mean
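The repeated-sampling idea can be checked numerically. The following is a minimal sketch, not the author's method: it assumes a normally distributed population with a hypothetical mean of 22.3 and a known standard deviation of 4.0, samples of size 100, and the usual normal-based interval for a mean. It draws 2,000 samples rather than all possible samples, so the observed share is only approximately 95%.

```python
import math
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical population parameters and sample size (assumptions).
MU, SIGMA, N = 22.3, 4.0, 100

# Critical value that leaves 2.5% in each tail of the standard normal curve.
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * SIGMA / math.sqrt(N)  # same for every sample because sigma is known

trials = 2000
covered = 0
for _ in range(trials):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.fmean(sample)
    # Count the intervals that actually contain the population mean.
    if xbar - margin < MU < xbar + margin:
        covered += 1

coverage = covered / trials
print(f"{coverage:.1%} of the {trials} intervals contain the population mean")
```

Any single interval either contains the population mean or misses it; the 95% describes how often the procedure succeeds over many samples.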

You can be 95% confident that the population mean is contained within such an interval when the values of the variable are normally distributed in the population.

The value used for the 95% confidence interval, 1.96, is obtained from Table E in Appendix A. For a 99% confidence interval, the value 2.58 is used instead of 1.96 in the formula. This value is also obtained from Table E and is based on the standard normal distribution. Since other confidence intervals are used in statistics, the symbol zα/2 (read “zee sub alpha over two”) is used in the general formula for confidence intervals. The Greek letter α (alpha) represents the total area in both tails of the standard normal distribution curve, and α/2 represents the area in each one of the tails. The value zα/2 is called a critical value.
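For reference, the confidence interval for a mean that these critical values plug into is commonly written as shown below. This is the standard textbook form rather than something stated in this section; it assumes the population standard deviation σ is known and that n denotes the sample size, neither of which is defined above.

```latex
\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;<\; \mu \;<\; \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
```

With z_{α/2} = 1.96 this gives a 95% interval, and with z_{α/2} = 2.58 a 99% interval.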

The relationship between α and the confidence level is that the stated confidence level is the percentage equivalent to the decimal value of 1 − α, and vice versa. When the 95% confidence interval is to be found, α = 0.05, since 1 − 0.05 = 0.95, or 95%. When α = 0.01, then 1 − α = 1 − 0.01 = 0.99, and the 99% confidence interval is being calculated.
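The link between the confidence level, α, and the critical value can be computed directly instead of being read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """Return z_(alpha/2) for a confidence level given as a decimal, e.g. 0.95."""
    alpha = 1 - confidence_level  # total area in both tails
    # The area to the left of the critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    print(f"{level:.0%} confidence: alpha = {alpha:.2f}, z = {critical_value(level):.3f}")
```

The computed value for 99% is about 2.576, which tables commonly round to 2.58. The critical value grows with the confidence level, which is why a more confident interval has to be wider.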

For a specific value, say α = 0.05, 95% of the sample means or proportions will fall within this error value on either side of the population mean or proportion. See Figure 7–2.

FIGURE 7–2 zα/2 values for a 95% Confidence Interval

The margin of error, also called the maximum error of the estimate, is the maximum likely difference between the point estimate of a parameter and the actual value of the parameter.

In the age of students example, where the confidence interval is 21.9 < μ < 22.7 or 22.3 ± 0.4, the margin of error is 0.4. In the proportion example, where the confidence interval is 10.7% < p < 11.3% or 11% ± 0.3%, the margin of error is 0.003 or 0.3%. The formulas for confidence intervals and margins of error are shown in the next section.
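The arithmetic connecting a point estimate, its margin of error, and the interval endpoints can be sketched in a few lines. The function names here are hypothetical; the numbers are the ones from the two examples above.

```python
def interval_from_estimate(point_estimate, margin_of_error):
    """Return (lower, upper) for point_estimate ± margin_of_error."""
    return point_estimate - margin_of_error, point_estimate + margin_of_error


def estimate_from_interval(lower, upper):
    """Recover (point_estimate, margin_of_error) from the interval endpoints."""
    point_estimate = (lower + upper) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the interval width
    return point_estimate, margin_of_error


# Age example: 22.3 ± 0.4 years.
age_low, age_high = interval_from_estimate(22.3, 0.4)
print(f"{age_low:.1f} < mu < {age_high:.1f}")

# Proportion example: endpoints 0.107 and 0.113.
p_hat, p_margin = estimate_from_interval(0.107, 0.113)
print(f"p = {p_hat:.3f} ± {p_margin:.3f}")
```

The margin of error is always half the width of the interval, and the point estimate sits at its midpoint.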
