Hierarchical Sampling and Variance Components Analysis

Hierarchical data are often encountered in observational studies where information is collected at a range of different spatial scales. Consider an epidemiological study of childhood diseases in which blood samples were taken from individual children, who are grouped within households, streets, districts, towns, regions, and countries. All of these categorical variables are random effects, and the spatial scale increases with each step up the hierarchy. The interest lies in discovering where most of the variation originates: is it between children within households, or between districts within the same town? When it comes to testing hypotheses at larger spatial scales (such as towns or regions), such data sets contain huge amounts of pseudoreplication, because the many measurements taken within one town are not independent replicates of that town.

The following example has a slightly simpler spatial structure than this: infection is measured for two replicates of each gender (male and female) within each of three families within four streets within three districts within five towns (720 measurements in all). We want to carry out a variance components analysis. Here are the data:

hierarchy<-read.table("c:\\temp\\hre.txt",header=TRUE)
attach(hierarchy)
names(hierarchy)

[1]  "subject"  "town"      "district"    "street"    "family"
[6]  "gender"   "replicate"
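
The total of 720 measurements follows directly from the nesting described above. As a quick check of the arithmetic, the following minimal sketch builds a hypothetical design grid with the same column names as the data frame; the object names design and units are illustrative assumptions, and the grid reproduces only the layout of the study, not the contents of hre.txt.

```r
# Hypothetical design grid: one row per measurement implied by the nesting.
# Only the layout is reproduced here, not the measurements in hre.txt.
design <- expand.grid(
  replicate = 1:2,
  gender    = c("female", "male"),
  family    = 1:3,
  street    = 1:4,
  district  = 1:3,
  town      = 1:5
)
nrow(design)   # total number of measurements in the design

# Number of distinct units at each level, working down the hierarchy
units <- cumprod(c(town = 5, district = 3, street = 4,
                   family = 3, gender = 2, replicate = 2))
units
```

The cumulative products give 5 towns, 15 districts, 60 streets, 180 families, 360 gender-within-family groups and 720 individual measurements, which shows how little replication there is at the largest spatial scales.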

library(nlme)
library(lattice)
model1<-lme(subject~1,random=~1|town/district/street/family/gender)
summary(model1)

Linear mixed-effects model fit by REML
 Data: NULL
       AIC      BIC    logLik
  3351.294 3383.339 -1668.647

Random effects:
 Formula: ...
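
To see where most of the variation originates, the variance estimated at each level can be expressed as a percentage of the total. The sketch below is an illustration under stated assumptions rather than the analysis of hre.txt: it fits the same model structure to simulated data, where the data frame sim, the helper functions id and eff, and the standard deviations used in the simulation are all invented for the example. The variance estimates are then pulled out of VarCorr, which for a multilevel lme fit returns a character matrix with a header row for each grouping level.

```r
library(nlme)
set.seed(1)

# Simulated stand-in for the survey: same nesting, invented effect sizes
sim <- expand.grid(replicate = 1:2, gender = 1:2, family = 1:3,
                   street = 1:4, district = 1:3, town = 1:5)
sim[] <- lapply(sim, factor)

# id(): integer label for each distinct combination of grouping columns
id  <- function(...) as.integer(factor(paste(...)))
# eff(): one normal random effect per group, expanded to one value per row
eff <- function(g, sd) rnorm(max(g), sd = sd)[g]

sim$subject <- with(sim, 10 +
  eff(id(town), 3) +
  eff(id(town, district), 2) +
  eff(id(town, district, street), 2) +
  eff(id(town, district, street, family), 2) +
  eff(id(town, district, street, family, gender), 2) +
  rnorm(nrow(sim), sd = 1))

# Same random-effects structure as model1, but with an explicit data argument
fit <- lme(subject ~ 1,
           random = ~ 1 | town/district/street/family/gender,
           data = sim)

# Header rows of the VarCorr matrix are non-numeric and become NA; drop them
vc   <- VarCorr(fit)
vars <- suppressWarnings(as.numeric(vc[, "Variance"]))
vars <- vars[!is.na(vars)]
names(vars) <- c("town", "district", "street", "family", "gender", "residual")

# Percentage of the total variance attributable to each level
pct <- 100 * vars / sum(vars)
round(pct, 1)
```

Because the data are simulated, the percentages say nothing about the real survey; the point is only the mechanics of turning the fitted variance components into shares of the total.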
