How to Find Degrees of Freedom | Definition & Formula

Degrees of freedom, often represented by ν (nu) or df, are the number of independent pieces of information used to calculate a statistic. They are calculated as the sample size minus the number of restrictions.

Degrees of freedom are normally reported in parentheses next to the test statistic, alongside the results of the statistical test.

Example: Degrees of freedom

Suppose you randomly sample 10 American adults and measure their daily calcium intake. You use a one-sample t test to determine whether the mean daily intake of American adults is equal to the recommended amount of 1000 mg.

The test statistic, t, has 9 degrees of freedom:

df = n − 1

df = 10 − 1

df = 9

You calculate a t value of 1.41 for the sample, which corresponds to a p value of .19. You report your results:

“The participants’ mean daily calcium intake did not differ from the recommended amount of 1000 mg, t(9) = 1.41, p = 0.19.”
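As a rough sketch, the degrees of freedom and the t statistic for a one-sample t test can be computed by hand in Python. The intake values below are hypothetical, not the data from the example above; a library such as SciPy would also supply the p value.

```python
import math
from statistics import mean, stdev

# Hypothetical daily calcium intakes (mg) for 10 sampled adults --
# illustrative values only, not the data behind the t = 1.41 result above.
sample = [1120, 940, 1010, 870, 1200, 1050, 980, 1130, 900, 1075]

mu0 = 1000    # recommended daily intake (mg), the null-hypothesis mean
n = len(sample)
df = n - 1    # degrees of freedom: sample size minus one restriction

# One-sample t statistic: (sample mean - hypothesized mean) / standard error
t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))

print(f"t({df}) = {t:.2f}")
```

With 10 observations, the estimate of the mean imposes one restriction, leaving 9 degrees of freedom for the test.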

What are degrees of freedom?

In inferential statistics, you estimate a parameter of a population by calculating a statistic of a sample. The number of independent pieces of information used to calculate the statistic is called the degrees of freedom. The degrees of freedom of a statistic depend on the sample size:

  • When the sample size is small, there are only a few independent pieces of information, and therefore only a few degrees of freedom.
  • When the sample size is large, there are many independent pieces of information, and therefore many degrees of freedom.
Note

Although degrees of freedom are closely related to sample size, they’re not the same thing. There are always fewer degrees of freedom than the sample size.

When you estimate a parameter, you need to introduce restrictions in how values are related to each other. As a result, the pieces of information are not all independent. To put it another way, the values in the sample are not all free to vary.

The following analogy and example show you what it means for a value to be free to vary and how it’s affected by restrictions.


Free to vary: Dessert analogy

Example: Dessert analogy

Imagine your roommate has a sweet tooth, so she’s thrilled to discover that your college cafeteria offers seven dessert options. One week, she decides that she wants to have a different dessert every day.

By deciding to have a different dessert every day, your roommate is imposing a restriction on her dessert choices.

On Monday, she can choose any of the seven desserts. On Tuesday, she can choose any of the six remaining dessert options. On Wednesday, she can choose any of the five remaining options, and so on.

By Sunday, she’s had all the dessert options except one. She doesn’t have any choice to make on Sunday since there’s only one option remaining.

Due to her restriction, your roommate could only choose her dessert on six of the seven days. Her dessert choice was free to vary on these six days. In contrast, her dessert choice on the last day wasn’t free to vary; it depended on her dessert choices of the previous six days.
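The analogy can be sketched in a few lines of Python: under the different-dessert-each-day rule, the number of remaining options shrinks by one each day, and only the days with more than one option left involve a genuine choice (a toy illustration, not part of the statistics):

```python
# Seven dessert options, one eaten per day, no repeats allowed.
options_per_day = [7 - day for day in range(7)]   # [7, 6, 5, 4, 3, 2, 1]

# A day offers a real choice only when more than one option remains.
free_days = sum(1 for options in options_per_day if options > 1)

print(options_per_day)  # [7, 6, 5, 4, 3, 2, 1]
print(free_days)        # 6 of the 7 days are free to vary
```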

Free to vary: Sum example

Example: Sum

Suppose I ask you to pick five integers that sum to 100.

The requirement of summing to 100 is a restriction on your number choices.

For the first number, you can choose any integer you want. Whatever your choice, the sum of the five numbers can still be 100. This is also true of the second, third, and fourth numbers.

You have no choice for the final number; it has only one possible value and it isn’t free to vary. For example, imagine you chose 15, 27, 42, and 3 as your first four numbers. For the numbers to sum to 100, the final number needs to be 13.

Due to the restriction, you could only choose four of the five numbers. The first four numbers were free to vary. In contrast, the fifth number wasn’t free to vary; it depended on the other four numbers.
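In code, the restriction makes the last value a pure consequence of the others. Using the numbers from the example:

```python
first_four = [15, 27, 42, 3]   # freely chosen integers from the example
target = 100                   # the restriction: all five must sum to 100

# The restriction fixes the final number completely; it is not free to vary.
fifth = target - sum(first_four)

print(fifth)  # 13 -- the only value that satisfies the restriction
```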

Correction for autocorrelation

To correct for autocorrelation, use the ‘prais’ command in place of the ‘regress’ command (the syntax is otherwise the same: the dependent variable followed by the explanatory variables), and add the ‘corc’ option at the end, after the variable names.

Below is the command for correcting autocorrelation.

prais gdp gfcf pfce, corc

The following results will appear.

Figure 3: Regression results with correction of autocorrelation in STATA

Finally, at the end of the results, calculate the original and new Durbin–Watson statistics as follows.

Figure 4: Calculation of original and new Durbin–Watson statistics for autocorrelation in STATA

The new Durbin–Watson statistic is 2.0578, which lies between dU and 4 − dU, implying that there is no longer autocorrelation. The problem has thus been corrected.
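Stata’s ‘prais …, corc’ implements the Cochrane–Orcutt procedure. The same idea can be sketched in pure Python on simulated data with a single regressor; the variable names below are hypothetical, not the gdp/gfcf/pfce series, and Stata handles the iteration and inference for you.

```python
import random

random.seed(42)
n, rho_true = 200, 0.8

# Simulate a regression whose errors follow an AR(1) process,
# which produces the autocorrelation problem.
x = [random.gauss(0, 1) for _ in range(n)]
e = [0.0] * n
for t in range(1, n):
    e[t] = rho_true * e[t - 1] + random.gauss(0, 1)
y = [2.0 + 1.5 * x[t] + e[t] for t in range(n)]

def ols(x, y):
    """Simple-regression OLS; returns (intercept, slope, residuals)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, resid

def durbin_watson(resid):
    """DW = sum of squared successive residual differences / sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(r ** 2 for r in resid)

# Step 1: ordinary OLS -- residuals are autocorrelated, DW well below 2.
_, _, resid = ols(x, y)
dw_original = durbin_watson(resid)

# Step 2: estimate rho by regressing residuals on their own lag.
rho = sum(resid[t] * resid[t - 1] for t in range(1, n)) / \
      sum(resid[t - 1] ** 2 for t in range(1, n))

# Step 3: quasi-difference the data and re-fit (Cochrane-Orcutt
# drops the first observation).
x_star = [x[t] - rho * x[t - 1] for t in range(1, n)]
y_star = [y[t] - rho * y[t - 1] for t in range(1, n)]
_, _, resid_star = ols(x_star, y_star)
dw_new = durbin_watson(resid_star)

print(f"original DW = {dw_original:.3f}, corrected DW = {dw_new:.3f}")
```

After the transformation, the Durbin–Watson statistic moves from well below 2 to near 2, mirroring the correction reported in Figure 4.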

Furthermore, the next article discusses the issue of multicollinearity. Multicollinearity arises when two or more explanatory variables in the regression model are highly correlated with each other.
