Chi-Square Goodness of Fit Test | Formula, Guide & Examples
A chi-square (Χ²) goodness of fit test is a type of Pearson’s chi-square test. You can use it to test whether the observed distribution of a categorical variable differs from your expectations.
The chi-square goodness of fit test tells you how well a statistical model fits a set of observations. It’s often used to analyze genetic crosses.
What is the chi-square goodness of fit test?
A chi-square (Χ²) goodness of fit test assesses goodness of fit for a categorical variable. Goodness of fit is a measure of how well a statistical model fits a set of observations.
- When goodness of fit is high, the values expected based on the model are close to the observed values.
- When goodness of fit is low, the values expected based on the model are far from the observed values.
The statistical models that are analyzed by chi-square goodness of fit tests are distributions. They can be any distribution, from as simple as equal probability for all groups, to as complex as a probability distribution with many parameters.
Hypothesis testing
The chi-square goodness of fit test is a hypothesis test. It allows you to draw conclusions about the distribution of a population based on a sample. Using the chi-square goodness of fit test, you can test whether the goodness of fit is “good enough” to conclude that the population follows the distribution.
With the chi-square goodness of fit test, you can ask questions such as: Was this sample drawn from a population that has…
- Equal proportions of male and female turtles?
- Equal proportions of red, blue, yellow, green, and purple jelly beans?
- 90% right-handed and 10% left-handed people?
- Offspring with an equal probability of inheriting all possible genotypic combinations (i.e., unlinked genes)?
- A Poisson distribution of floods per year?
- A normal distribution of bread prices?
After weeks of hard work, your dog food experiment is complete and you compile your data in a table:
Flavor | Observed | Expected |
---|---|---|
Garlic Blast | 22 | 25 |
Blueberry Delight | 30 | 25 |
Minty Munch | 23 | 25 |
To help visualize the differences between your observed and expected frequencies, you also create a bar graph of the two sets of counts for each flavor.
The president of the dog food company looks at your graph and declares that they should eliminate the Garlic Blast and Minty Munch flavors to focus on Blueberry Delight. “Not so fast!” you tell him.
You explain that your observations were a bit different from what you expected, but the differences aren’t dramatic. They could be the result of a real flavor preference or they could be due to chance.
To put it another way: You have a sample of 75 dogs, but what you really want to understand is the population of all dogs. Was this sample drawn from a population of dogs that choose the three flavors equally often?
Chi-square goodness of fit test hypotheses
Like all hypothesis tests, a chi-square goodness of fit test evaluates two hypotheses: the null and alternative hypotheses. They’re two competing answers to the question “Was the sample drawn from a population that follows the specified distribution?”
- Null hypothesis (H0): The population follows the specified distribution.
- Alternative hypothesis (Ha): The population does not follow the specified distribution.
These are general hypotheses that apply to all chi-square goodness of fit tests. You should make your hypotheses more specific by describing the “specified distribution.” You can name the probability distribution (e.g., Poisson distribution) or give the expected proportions of each group.
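For the dog food example, where the question is whether dogs choose the three flavors equally often, the specific hypotheses could be written roughly as follows. The symbols p_G, p_B, and p_M are notation introduced here (not in the original example) for the population proportions of dogs choosing Garlic Blast, Blueberry Delight, and Minty Munch.

```latex
H_0:\; p_{G} = p_{B} = p_{M} = \tfrac{1}{3}
\qquad
H_a:\; \text{at least one of } p_{G},\, p_{B},\, p_{M} \text{ differs from } \tfrac{1}{3}
```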
When to use the chi-square goodness of fit test
The following conditions are necessary if you want to perform a chi-square goodness of fit test:
- You want to test a hypothesis about the distribution of one categorical variable. If your variable is continuous, you can convert it to a categorical variable by separating the observations into intervals. This process is known as data binning.
- The sample was randomly selected from the population.
- There are a minimum of five observations expected in each group.
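The last condition is easy to check once you have the expected frequencies. The snippet below is a minimal sketch; the function name `meets_minimum_expected` and the handedness sample size of 30 are assumptions made for illustration.

```python
def meets_minimum_expected(expected, minimum=5):
    """Return True if every group's expected frequency is at least `minimum`."""
    return all(e >= minimum for e in expected)


# Dog food example: 25 dogs expected per flavor.
dog_food_ok = meets_minimum_expected([25, 25, 25])

# Hypothetical: 30 people tested against a 90% / 10% handedness split,
# which gives only 3 expected left-handed people.
handedness_ok = meets_minimum_expected([27, 3])

print(dog_food_ok, handedness_ok)  # True False
```

When the check fails, the usual remedies are to collect a larger sample or to combine small groups.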
How to calculate the test statistic (formula)
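The test statistic for the chi-square (Χ²) goodness of fit test is Pearson’s chi-square:
Formula | Explanation |
---|---|
$\chi^2 = \sum \frac{(O - E)^2}{E}$ | Χ² is the chi-square test statistic; Σ is the summation operator (“take the sum of”); O is the observed frequency; E is the expected frequency |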
The larger the difference between the observations and the expectations (O − E in the equation), the bigger the chi-square will be.
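To see this behavior in code, here is a small sketch of the formula as a Python function. The name `chi_square_statistic` and both sets of observed counts are hypothetical, chosen only to contrast a close fit with a poor one.

```python
def chi_square_statistic(observed, expected):
    """Return Pearson's chi-square: the sum of (O - E)^2 / E over all groups."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one value per group")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, not data from the article.
expected = [25, 25, 25]
close_fit = chi_square_statistic([24, 26, 25], expected)  # small differences
poor_fit = chi_square_statistic([10, 40, 25], expected)   # large differences

print(close_fit, poor_fit)
```

The second set of counts sits much further from the expected values, so it produces a much larger chi-square.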
To use the formula, follow these five steps:
Step 1: Create a table
Create a table with the observed and expected frequencies in two columns.
Flavor | Observed | Expected |
---|---|---|
Garlic Blast | 22 | 25 |
Blueberry Delight | 30 | 25 |
Minty Munch | 23 | 25 |
Step 2: Calculate O − E
Add a new column called “O − E”. Subtract each expected frequency from the corresponding observed frequency.
Flavor | Observed | Expected | O − E |
---|---|---|---|
Garlic Blast | 22 | 25 | 22 − 25 = −3 |
Blueberry Delight | 30 | 25 | 5 |
Minty Munch | 23 | 25 | −2 |
Step 3: Calculate (O − E)²
Add a new column called “(O − E)²”. Square the values in the previous column.
Flavor | Observed | Expected | O − E | (O − E)² |
---|---|---|---|---|
Garlic Blast | 22 | 25 | −3 | (−3)² = 9 |
Blueberry Delight | 30 | 25 | 5 | 25 |
Minty Munch | 23 | 25 | −2 | 4 |
Step 4: Calculate (O − E)² / E
Add a final column called “(O − E)² / E”. Divide the previous column by the expected frequencies.
Flavor | Observed | Expected | O − E | (O − E)² | (O − E)² / E |
---|---|---|---|---|---|
Garlic Blast | 22 | 25 | −3 | 9 | 9/25 = 0.36 |
Blueberry Delight | 30 | 25 | 5 | 25 | 1 |
Minty Munch | 23 | 25 | −2 | 4 | 0.16 |
Step 5: Calculate Χ²
Add up the values of the previous column. This is the chi-square test statistic (Χ²).
Flavor | Observed | Expected | O − E | (O − E)² | (O − E)² / E |
---|---|---|---|---|---|
Garlic Blast | 22 | 25 | −3 | 9 | 9/25 = 0.36 |
Blueberry Delight | 30 | 25 | 5 | 25 | 1 |
Minty Munch | 23 | 25 | −2 | 4 | 0.16 |
Χ² = 0.36 + 1 + 0.16 = 1.52
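The five steps can also be reproduced in a few lines of Python. This is only a sketch of the table-building process using the dog food counts; the variable names are illustrative.

```python
flavors = ["Garlic Blast", "Blueberry Delight", "Minty Munch"]
observed = [22, 30, 23]   # Step 1: observed frequencies
expected = [25, 25, 25]   # Step 1: expected frequencies

rows = []
for flavor, o, e in zip(flavors, observed, expected):
    diff = o - e                  # Step 2: O - E
    squared = diff ** 2           # Step 3: (O - E)^2
    contribution = squared / e    # Step 4: (O - E)^2 / E
    rows.append((flavor, o, e, diff, squared, contribution))

# Step 5: add up the last column to get the chi-square statistic.
chi_square = sum(row[5] for row in rows)

for row in rows:
    print("{:<18} {:>3} {:>3} {:>3} {:>3} {:>5.2f}".format(*row))
print(f"chi-square = {chi_square:.2f}")
```

Each tuple in `rows` corresponds to one row of the final table, and the last line prints the same total as the hand calculation.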
How to perform the chi-square goodness of fit test
The chi-square statistic is a measure of goodness of fit, but on its own it doesn’t tell you much. For example, is Χ² = 1.52 a low or high goodness of fit?
To interpret the chi-square goodness of fit, you need to compare it to something. That’s what a chi-square test is: comparing the chi-square value to the appropriate chi-square distribution to decide whether to reject the null hypothesis.
To perform a chi-square goodness of fit test, follow these five steps (the first two steps have already been completed for the dog food example):
Step 1: Calculate the expected frequencies
Sometimes, calculating the expected frequencies is the most difficult step. Think carefully about which expected values are most appropriate for your null hypothesis.
In general, you’ll need to multiply each group’s expected proportion by the total number of observations to get the expected frequencies.
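As a sketch of this multiplication, the snippet below derives the dog food expected frequencies from the null hypothesis that dogs choose the three flavors equally often. The helper name `expected_frequencies` is an assumption, and `fractions.Fraction` is used here only so that the proportions sum to exactly 1.

```python
from fractions import Fraction


def expected_frequencies(proportions, total):
    """Multiply each group's expected proportion by the number of observations."""
    if sum(proportions) != 1:
        raise ValueError("expected proportions must sum to 1")
    return [p * total for p in proportions]


# Null hypothesis: each of the three flavors is chosen with probability 1/3.
proportions = [Fraction(1, 3)] * 3
expected = expected_frequencies(proportions, 75)  # 75 dogs in the sample

print([float(e) for e in expected])  # [25.0, 25.0, 25.0]
```

With ordinary floats, a strict sum-to-1 check can fail because of rounding, so a tolerance would be needed in that case.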
Step 2: Calculate chi-square
Calculate the chi-square value from your observed and expected frequencies using the chi-square formula.
Step 3: Find the critical chi-square value
Find the critical chi-square value in a chi-square critical value table or using statistical software. The critical value is calculated from a chi-square distribution. To find the critical chi-square value, you’ll need to know two things:
- The degrees of freedom (df): For chi-square goodness of fit tests, the df is the number of groups minus one.
- Significance level (α): By convention, the significance level is usually .05.
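If you don’t have a critical value table at hand, the value can be computed numerically. The sketch below uses only the Python standard library: it evaluates the chi-square cumulative distribution through a series for the regularized incomplete gamma function and then searches for the critical value by bisection. The function names are assumptions for this example. In practice, statistical software does this for you (for instance, SciPy provides `scipy.stats.chi2.ppf`).

```python
import math


def chi_square_cdf(x, df):
    """P(X <= x) for a chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = total = 1.0
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return math.exp(a * math.log(z) - z - math.lgamma(a + 1)) * total


def critical_value(alpha, df):
    """Find x such that P(X > x) = alpha, by bisection on the CDF."""
    target = 1 - alpha
    low, high = 0.0, 1.0
    while chi_square_cdf(high, df) < target:  # widen until the target is bracketed
        high *= 2
    for _ in range(100):
        mid = (low + high) / 2
        if chi_square_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


groups = 3        # three dog food flavors
df = groups - 1   # degrees of freedom = number of groups minus one
print(round(critical_value(0.05, df), 3))  # 5.991
```

For the dog food example (df = 2, α = .05), this agrees with the value of about 5.99 found in standard chi-square tables.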
Step 4: Compare the chi-square value to the critical value
Compare the chi-square value to the critical value to determine which is larger.
Step 5: Decide whether to reject the null hypothesis
- If the Χ² value is greater than the critical value, then the difference between the observed and expected distributions is statistically significant (p < α).
  - The data allows you to reject the null hypothesis and provides support for the alternative hypothesis.
- If the Χ² value is less than the critical value, then the difference between the observed and expected distributions is not statistically significant (p > α).
  - The data doesn’t allow you to reject the null hypothesis and doesn’t provide support for the alternative hypothesis.
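Applied to the dog food example, the comparison might look like the sketch below. It relies on a shortcut that holds only for df = 2, where the chi-square distribution satisfies P(X > x) = exp(−x / 2). For other degrees of freedom you would need a table or statistical software.

```python
import math

chi_square = 1.52   # test statistic from the dog food table
alpha = 0.05        # conventional significance level
df = 3 - 1          # three flavors minus one

# Shortcut valid for df = 2 only: P(X > x) = exp(-x / 2).
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi_square / 2)

reject_null = chi_square > critical_value
print(f"critical value = {critical_value:.2f}, p = {p_value:.3f}")
print("reject H0" if reject_null else "fail to reject H0")  # fail to reject H0
```

Because 1.52 is smaller than the critical value of about 5.99, this sample does not give grounds to reject the hypothesis that dogs choose the three flavors equally often.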