Chapter 1: The need for more than one random-effect term when fitting a regression line

1.1 A data set with several observations of variable Y at each value of variable X

One of the commonest, and simplest, uses of statistical analysis is the fitting of a straight line, known for historical reasons as a regression line, to describe the relationship between an explanatory variable, X, and a response variable, Y. The departure of the values of Y from this line is called the residual variation, and is regarded as random. It is natural to ask whether the part of the variation in Y that is explained by the relationship with X is more than could reasonably be expected by chance or, more formally, whether it is significant relative to the residual variation. This is a simple regression analysis, and for many data sets it is all that is required. However, in some cases, several observations of Y are taken at each value of X. The data then form natural groups, and it may no longer be appropriate to analyse them as though every observation were independent: observations of Y at the same value of X may lie at a similar distance from the line. We may then be able to recognize two sources of random variation, namely

  • variation among groups
  • variation among observations within each group.
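
In symbols, one possible way to write these two sources is as two separate random terms added to the line. The notation below is an assumption made for illustration and is not taken from the text: i indexes the values of X (the groups), j indexes observations within a group, and the normal distributions are a conventional choice rather than a requirement.

```latex
y_{ij} = \beta_0 + \beta_1 x_i + g_i + e_{ij}, \qquad
g_i \sim N(0, \sigma_g^2), \quad
e_{ij} \sim N(0, \sigma_e^2)
```

Here g_i is shared by every observation taken at the same value of X, which is why such observations may lie at a similar distance from the line, while e_{ij} varies from one observation to the next.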

This is one of the simplest situations in which it is necessary to consider the possibility that there may be more than a single stratum of random variation—or, in the language of mixed modelling, that a model ...
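
The idea of two strata of random variation can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are simulated rather than taken from the book, and the function names (`simulate_grouped_data`, `residual_strata`) and all parameter values are hypothetical. It fits an ordinary straight line, then splits the residual sum of squares into a part among groups (group means about the line) and a part within groups.

```python
import numpy as np


def simulate_grouped_data(n_groups=8, n_per_group=5, intercept=2.0, slope=0.5,
                          sd_group=1.0, sd_within=0.5, seed=0):
    """Simulate several observations of Y at each value of X.

    All values here are made up for illustration. Every observation in a
    group shares one random group effect, plus its own within-group noise.
    """
    rng = np.random.default_rng(seed)
    x_levels = np.arange(1, n_groups + 1, dtype=float)
    group = np.repeat(np.arange(n_groups), n_per_group)
    x = x_levels[group]
    group_effect = rng.normal(0.0, sd_group, size=n_groups)
    within = rng.normal(0.0, sd_within, size=n_groups * n_per_group)
    y = intercept + slope * x + group_effect[group] + within
    return x, y, group


def residual_strata(x, y, group):
    """Fit a straight line and split its residual variation into two parts."""
    slope, intercept = np.polyfit(x, y, 1)          # ordinary least squares
    resid = y - (intercept + slope * x)

    n_groups = group.max() + 1
    counts = np.bincount(group, minlength=n_groups)
    group_mean = np.bincount(group, weights=resid, minlength=n_groups) / counts

    # Among groups: how far each group's mean residual lies from the line.
    ss_among = np.sum(counts * group_mean ** 2)
    # Within groups: scatter of observations about their own group mean.
    ss_within = np.sum((resid - group_mean[group]) ** 2)

    df_among = n_groups - 2          # the fitted line uses two parameters
    df_within = len(y) - n_groups
    return {
        "slope": slope,
        "intercept": intercept,
        "ss_residual": np.sum(resid ** 2),
        "ss_among": ss_among,
        "ss_within": ss_within,
        "df_among": df_among,
        "df_within": df_within,
        "ms_among": ss_among / df_among,
        "ms_within": ss_within / df_within,
    }


if __name__ == "__main__":
    x, y, group = simulate_grouped_data()
    for name, value in residual_strata(x, y, group).items():
        print(f"{name:12s} {value:.3f}")
```

When the among-groups mean square is clearly larger than the within-groups mean square, as the simulation is set up to produce, pooling all residuals into a single stratum treats the observations as more independent than they are. A mixed-model fit would estimate the two variance components directly; this decomposition is only meant to make the two sources visible.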

