4. The Quality of the Regression Equation (3/3)

The Regression Degrees of Freedom (RDF) is the number of independent variables, which we have seen is 1 for simple linear regression, whilst N is the number of observations in the sample used to compute the regression equation. We have seen that (N - 1) is used to derive the variance of a sample, as this provides an unbiased estimator of the population variance. We do this because one parameter, the mean, had to be computed from the data before the variance could be computed. With simple linear regression we have to derive two parameters (b0 and b1), and so we subtract 2 from N in the denominator, although we put it in the form above for more general applicability.
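As a rough illustration of how the degrees of freedom enter the calculation, the sketch below computes the F statistic for a simple linear regression in plain Python. It assumes the F statistic has the usual form (SSR / RDF) / (SSE / (N - RDF - 1)), where SSR is the regression sum of squares and SSE is the residual sum of squares. The function name `regression_f` and the five data points are hypothetical and are not taken from the Winter Wheat or Spring Barley data.

```python
def regression_f(x, y):
    """Return (F statistic, coefficient of determination) for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares estimates of the two parameters b1 (slope) and b0 (intercept)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x

    fitted = [b0 + b1 * xi for xi in x]
    ssr = sum((fi - mean_y) ** 2 for fi in fitted)            # explained by the regression
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # left in the residuals

    rdf = 1                    # one independent variable
    residual_df = n - rdf - 1  # N - 2 for simple linear regression

    f_stat = (ssr / rdf) / (sse / residual_df)
    r_squared = ssr / (ssr + sse)
    return f_stat, r_squared


# Made-up illustrative data, not the crop data used in the text
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

f_stat, r_squared = regression_f(x, y)
print(f"F = {f_stat:.2f}, R^2 = {r_squared:.4f}")
```

With one independent variable, the same F value can also be obtained from the Coefficient of Determination as R^2 (N - 2) / (1 - R^2), which is a convenient cross-check on a spreadsheet calculation.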

So, you have calculated the F statistic. Where is the boundary value between accepting and rejecting the null hypothesis? We can get this boundary from a set of tables, two of which are given here. Select the confidence that you want to have in the decision; Tables 1 and 2 have levels of significance of 2.5% and 1% respectively, which correspond to confidence levels of 97.5% and 99%. If the F statistic is below the value given in the table, the null hypothesis is accepted; if it is above this value, the null hypothesis is rejected at that level of confidence. In our example, we have F statistic values of 367.15 for the Winter Wheat linear regression and 196.87 for the Spring Barley linear regression. If we want to be 99% confident of the result, then we use the 0.01 table, taking column 1, for one degree of freedom in the numerator, and row 60 for Winter Wheat and row 51 for Spring Barley, for the degrees of freedom in the denominator. Our table only goes down to 30 degrees of freedom, but the tabulated value falls as the degrees of freedom increase, and even at this smaller number the value in the table is much less than the values we have. As a consequence we can confidently reject the null hypothesis and accept the regression as being statistically significant.
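The comparison described above can be sketched in a few lines of Python. The critical values below are for 1 degree of freedom in the numerator and 30 in the denominator, and are quoted from standard F tables rather than copied from Tables 1 and 2, so treat them as an assumption and check them against the tables given here. The names `CRITICAL_F_1_30` and `is_significant` are hypothetical.

```python
# Upper-tail critical values of F for (1, 30) degrees of freedom,
# as found in standard F tables (assumed to match Tables 1 and 2).
CRITICAL_F_1_30 = {
    0.025: 5.57,  # 97.5% confidence
    0.01: 7.56,   # 99% confidence
}


def is_significant(f_stat, alpha):
    """True if the F statistic exceeds the tabulated value, so the null hypothesis is rejected."""
    return f_stat > CRITICAL_F_1_30[alpha]


# F statistics quoted in the text
f_values = {"Winter Wheat": 367.15, "Spring Barley": 196.87}

for crop, f_stat in f_values.items():
    for alpha in (0.025, 0.01):
        verdict = "reject" if is_significant(f_stat, alpha) else "do not reject"
        print(f"{crop}: F = {f_stat}, alpha = {alpha}: {verdict} the null hypothesis")
```

Using the row for 30 degrees of freedom is a cautious choice, because the rows for 51 and 60 degrees of freedom hold smaller critical values; any F statistic that passes the test at 30 also passes it at the larger values.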

Exercises

  1. In your spreadsheet, find the Coefficient of Determination and the F statistic for the Winter Wheat and Spring Barley regressions. Use the F statistic tables to check the results given here in the text, at both the 97.5% and 99% confidence levels.