Model Simplification in Hierarchical Sampling

We need to know whether all of the random effects are required in the model. The key point to grasp here is that you will need to recode the factor levels if you want to leave out a random effect from a larger spatial scale. Suppose we want to test the effect of leaving out the identity of the towns. Because the districts were originally coded with the same names within each town,

levels(district)

[1]  "d1"  "d2"  "d3"

we shall need to create 15 new, unique district names. Much the simplest way to do this is to use paste to combine the town names and the district names:

newdistrict<-factor(paste(town,district,sep=""))
levels(newdistrict)
[1]  "Ad1"  "Ad2"  "Ad3"  "Bd1"  "Bd2"  "Bd3"  "Cd1"  "Cd2"  "Cd3"  "Dd1"
[11] "Dd2"  "Dd3"  "Ed1"  "Ed2"  "Ed3"

In model2 we leave out the random effect for towns and include the new factor for districts:

model2<-lme(subject~1,random=~1|newdistrict/street/family/gender)
anova(model1,model2)

1       Model df        AIC      BIC     logLik   Test  L.Ratio  p-value
model1     1  7  3351.294  3383.339  -1668.647
model2     2  6  3350.524  3377.991  -1669.262 1 vs 2 1.229803   0.2674

Evidently there is no significant effect attributable to differences between towns (p = 0.2674).

The next question concerns differences between the districts. Because the streets within districts were all coded in the same way in the original dataframe, we need to create 60 unique codes for the different streets:

newstreet<-factor(paste(newdistrict,street,sep=""))
levels(newstreet) ...

Get The R Book now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.