Order matters in summary.aov

People are often disconcerted by the ANOVA table produced by summary.aov in analysis of covariance. Compare the tables produced for these two models:

summary.aov(lm(weight~sex*age))
            Df    Sum Sq   Mean Sq    F value     Pr(>F)
sex          1    90.492    90.492    107.498  1.657e-08 ***
age          1   269.705   269.705    320.389  5.257e-12 ***
sex:age      1    13.150    13.150     15.621   0.001141 **
Residuals   16    13.469     0.842

summary.aov(lm(weight~age*sex))
            Df    Sum Sq   Mean Sq    F value     Pr(>F)
age          1   269.705   269.705    320.389  5.257e-12 ***
sex          1    90.492    90.492    107.498  1.657e-08 ***
age:sex      1    13.150    13.150     15.621   0.001141 **
Residuals   16    13.469     0.842

Exactly the same sums of squares and p values. No problem. But look at these two models from the plant compensation example analysed in detail earlier (p. 490):

summary.aov(lm(Fruit~Grazing*Root))
            Df    Sum Sq   Mean Sq    F value     Pr(>F)
Grazing      1    2910.4    2910.4    62.3795  2.262e-09 ***
Root         1   19148.9   19148.9   410.4201  < 2.2e-16 ***
Grazing:Root 1       4.8       4.8     0.1031       0.75
Residuals   36    1679.6      46.7

summary.aov(lm(Fruit~Root*Grazing))
            Df    Sum Sq   Mean Sq    F value     Pr(>F)
Root         1   16795.0   16795.0   359.9681  < 2.2e-16 ***
Grazing      1    5264.4    5264.4   112.8316  1.209e-12 ***
Root:Grazing 1       4.8       4.8     0.1031       0.75
Residuals   36    1679.6      46.7

In this case the order of variables within the model formula has a huge effect: it changes the sums of squares associated with the two main effects (root size is continuous and grazing is categorical, grazed or ungrazed) and alters their p values. The sum of squares for Grazing rises from 2910.4 to 5264.4 when it is fitted second, while that for Root falls from 19148.9 to 16795.0. The interaction term, the residual ...
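The order dependence can be reproduced with a small simulation. The sketch below is illustrative only: it does not use the compensation data, and the variable names (grazing, root, fruit), sample size and coefficients are all assumptions. It relies on the fact that the tables from anova and summary.aov are sequential, so each term is assessed after the terms that precede it in the formula. Root size is deliberately simulated to differ between grazing treatments, so that the two explanatory variables are correlated.

```r
# Simulated data: hypothetical values, NOT the book's compensation data set
set.seed(1)
n <- 40
grazing <- factor(rep(c("Grazed", "Ungrazed"), each = n / 2))

# Root size depends on grazing, so the two predictors are correlated
root <- rnorm(n, mean = ifelse(grazing == "Grazed", 8, 6), sd = 1)

# Response with effects of both predictors plus noise (assumed coefficients)
fruit <- 20 * root - 30 * (grazing == "Grazed") + rnorm(n, sd = 5)

# The same model, fitted with the terms in the two possible orders
tab_gr <- anova(lm(fruit ~ grazing * root))
tab_rg <- anova(lm(fruit ~ root * grazing))
print(tab_gr)
print(tab_rg)

# Extract the sums of squares, named by term
ss_gr <- setNames(tab_gr[["Sum Sq"]], rownames(tab_gr))
ss_rg <- setNames(tab_rg[["Sum Sq"]], rownames(tab_rg))

# Main-effect sums of squares depend on the order of fitting ...
print(rbind(grazing_first = ss_gr[c("grazing", "root")],
            root_first    = ss_rg[c("grazing", "root")]))

# ... but the interaction and residual sums of squares do not
print(all.equal(ss_gr[["grazing:root"]], ss_rg[["root:grazing"]]))
print(all.equal(ss_gr[["Residuals"]], ss_rg[["Residuals"]]))
```

In each order the two main-effect sums of squares add up to the same total; only the way that total is divided between the terms changes. The same is true of the tables above, where 2910.4 + 19148.9 and 16795.0 + 5264.4 agree to within rounding.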
