Rule of Thumb for multivariable linear regression analysis

Green (1991) (https://doi.org/10.1207/s15327906mbr2603_7) suggested the following rule of thumb: "Some support was obtained for a rule-of-thumb that $N \ge 50 + 8m$ for the multiple correlation and $N \ge 104 + m$ for the partial correlation." I would like to determine the upper limit on the number of covariates $(m)$ under Green's rule of thumb for an analysis using a general linear model with a fixed sample size $N$. The purpose is to test the statistical significance of the regression coefficient of one predictor after controlling for the effects of several covariates. In this case, which formula is appropriate for the model: $N \ge 50 + 8m$ or $N \ge 104 + m$?

asked Aug 31, 2021 at 10:25

1 Answer


The "multiple correlation" is the positive square root of the multiple regression model's $R^2$. The "partial correlation" refers to a specific coefficient within that model. Since you want to test a pre-specified coefficient, you want the latter (i.e., $N \ge 104 + m$).
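For a fixed $N$, either rule can be rearranged to give the largest $m$ it permits: $m \le (N - 50)/8$ or $m \le N - 104$. The following is a minimal sketch of that arithmetic; the function name `max_m_green` is hypothetical, and it assumes $m$ is counted the way Green counts it (I read it as all predictors in the model, so the focal predictor would be included in the count).

```python
import math


def max_m_green(n):
    """Largest m allowed by each of Green's (1991) rules for sample size n.

    Assumption: m is the number of predictors as counted in the rule.
    Returns None for a rule that n cannot satisfy even with m = 0.
    """
    # N >= 50 + 8m  <=>  m <= (N - 50) / 8
    m_multiple = math.floor((n - 50) / 8)
    # N >= 104 + m  <=>  m <= N - 104
    m_partial = n - 104
    return {
        "multiple_correlation": m_multiple if m_multiple >= 0 else None,
        "partial_correlation": m_partial if m_partial >= 0 else None,
    }


if __name__ == "__main__":
    for n in (100, 150, 300):
        print(n, max_m_green(n))
```

Note how differently the two rules scale: the partial-correlation rule adds only one case per extra predictor, so it is the more permissive of the two once $N$ is comfortably above 104.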

However, these rules of thumb pertain to minimum sample sizes to ensure the model isn't 'approaching saturation', which isn't necessarily your primary concern. As someone once said, 'the best rule of thumb is to be wary of rules of thumb'.

A better approach would be to conduct a power analysis. Specifically, you want either a sensitivity or a post-hoc power analysis. That is, given your sample size, either what is the smallest correlation you would have your preferred level of power (often 80%) to detect (sensitivity), or what would your power be to detect a correlation of the size you care about (post hoc)? First, subtract $1$ from your $N$ for every degree of freedom your covariates will consume, set alpha at, say, $.05$, and then solve for the correlation by stipulating a level of power, or solve for the power by stipulating a correlation. It is possible that your analysis would not be worth pursuing even if your $N$ exceeds the rule of thumb, or that you are likely to be OK even if your $N$ does not.
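Both kinds of power analysis can be sketched in a few lines with SciPy. This is one possible implementation, not the only way to do it. It assumes the focal coefficient is tested with an $F(1,\, N - m - 2)$ test (one df each for the intercept and the focal predictor, one per covariate), converts the partial correlation $r$ to Cohen's $f^2 = r^2 / (1 - r^2)$, and uses the noncentrality convention $\lambda = f^2 N$. The function names and the example values $N = 120$, $m = 5$ are hypothetical.

```python
from scipy import stats
from scipy.optimize import brentq


def power_partial_r(r, n, m, alpha=0.05):
    """Post-hoc power to detect a partial correlation r.

    n: sample size; m: number of covariates, assumed to use one df each.
    Assumes an F(1, n - m - 2) test with noncentrality f^2 * n.
    """
    df2 = n - m - 2  # residual df: intercept + focal predictor + m covariates
    if df2 < 1:
        raise ValueError("not enough residual degrees of freedom")
    f2 = r**2 / (1 - r**2)  # Cohen's f^2 for the focal predictor
    f_crit = stats.f.ppf(1 - alpha, 1, df2)  # critical value under H0
    return stats.ncf.sf(f_crit, 1, df2, f2 * n)  # P(F > f_crit) under H1


def min_detectable_r(n, m, power=0.80, alpha=0.05):
    """Sensitivity analysis: smallest |r| detectable with the given power.

    The search is restricted to 0.001 <= r <= 0.9 (an arbitrary choice here).
    """
    return brentq(lambda r: power_partial_r(r, n, m, alpha) - power, 1e-3, 0.9)


if __name__ == "__main__":
    n, m = 120, 5
    print("power to detect r = 0.20:", round(power_partial_r(0.20, n, m), 3))
    print("smallest r at 80% power:", round(min_detectable_r(n, m), 3))
```

If the smallest detectable correlation is larger than any effect you would plausibly expect, the analysis is underpowered regardless of what the rule of thumb says.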