Deviance: Measuring the Goodness of Fit of a GLM

The fitted values produced by the model are most unlikely to match the values of the data perfectly. The size of the discrepancy between the model and the data is a measure of the inadequacy of the model; a small discrepancy may be tolerable, but a large one will not be. In a GLM, the measure of discrepancy used to assess the goodness of fit of the model to the data is called the deviance. Deviance is defined as −2 times the difference in log-likelihood between the current model and a saturated model (i.e. a model that fits the data perfectly). Because the log-likelihood of the saturated model does not depend on the parameters of the current model, minimizing the deviance is the same as maximizing the likelihood.
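The definition can be checked numerically. The following is a minimal sketch for Poisson errors, assuming a small invented data set (the vectors x and y are hypothetical and not from the book). It computes the log-likelihood of the fitted model and of the saturated model, in which each fitted value equals the observed value, and compares −2 times their difference with the value reported by deviance().

```r
# Hypothetical count data, invented purely for illustration
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2, 3, 6, 7, 8, 9, 10, 12)

# Current model: Poisson errors with the default log link
model <- glm(y ~ x, family = poisson)

# Log-likelihood of the current model, evaluated at its fitted values
ll_model <- sum(dpois(y, lambda = fitted(model), log = TRUE))

# Log-likelihood of the saturated model, whose fitted values equal the data
ll_saturated <- sum(dpois(y, lambda = y, log = TRUE))

# Deviance as -2 times the difference in log-likelihood
dev_by_hand <- -2 * (ll_model - ll_saturated)

# Compare with the residual deviance that glm reports
all.equal(dev_by_hand, deviance(model))
```

Since ll_saturated is fixed by the data alone, any change in the parameters that raises ll_model lowers dev_by_hand by exactly twice that amount.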

Deviance is calculated in different ways for different families within glm (Table 13.1). Numerical examples of the calculation of deviance for different glm families are given in Chapter 14 (Poisson errors), Chapter 15 (binomial errors), and Chapter 24 (gamma errors). Where there is grouping structure in the data, leading to spatial or temporal pseudoreplication, you will want to use generalized linear mixed models (lmer) with one of these error families (p. 590).
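To illustrate how the calculation differs between families, here is a hedged sketch that writes out the standard textbook deviance formulae for Poisson, gamma and binomial errors and checks each against the dev.resids function stored in the corresponding R family object. The observed values, fitted values and binomial denominators are all hypothetical numbers chosen for illustration; in practice the fitted values would come from a model fitted by glm.

```r
# Hypothetical observed values, fitted values and unit prior weights
y  <- c(2, 3, 6, 7, 8)
mu <- c(2.5, 3.5, 5, 7.5, 7.5)
wt <- rep(1, length(y))

# Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))
poisson_dev <- 2 * sum(y * log(y / mu) - (y - mu))

# Gamma deviance: 2 * sum((y - mu) / mu - log(y / mu))
gamma_dev <- 2 * sum((y - mu) / mu - log(y / mu))

# Binomial deviance, with hypothetical success counts s out of n trials
# and hypothetical fitted counts m:
# 2 * sum(s * log(s / m) + (n - s) * log((n - s) / (n - m)))
n <- c(10, 10, 10, 10, 10)
s <- c(2, 4, 5, 7, 9)
m <- c(2.5, 3.5, 5.5, 7.0, 8.5)
binomial_dev <- 2 * sum(s * log(s / m) + (n - s) * log((n - s) / (n - m)))

# Each family object carries a dev.resids(y, mu, wt) function that
# returns the per-observation contributions to the deviance
all.equal(poisson_dev, sum(poisson()$dev.resids(y, mu, wt)))
all.equal(gamma_dev, sum(Gamma()$dev.resids(y, mu, wt)))
# For the binomial family, y and mu are proportions and wt is n
all.equal(binomial_dev, sum(binomial()$dev.resids(s / n, m / n, n)))
```

The by-hand formulae assume strictly positive responses (and, for the binomial case, counts strictly between 0 and n); zero counts need the usual convention that 0 × log(0) is taken as 0.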

Table 13.1. Deviance formulae for different GLM families, where y is the observed data, ȳ is the mean value of y, μ are the fitted values of y from the maximum likelihood model, and n is the binomial denominator in a binomial ...
