Akaike's Information Criterion

Akaike's information criterion (AIC) is known in the statistics trade as a penalized log-likelihood. If you have a model for which a log-likelihood value can be obtained, then

AIC = −2 × log-likelihood + 2(p + 1),

where p is the number of parameters in the model, and 1 is added for the estimated variance (you could call this another parameter if you wanted to). To demystify AIC, let's calculate it by hand. We revisit the regression data for which we calculated the log-likelihood by hand on p. 217.

attach(regression)
names(regression)

[1] "growth" "tannin"

growth

[1] 12 10 8 11 6 7 2 3 3

There are nine values of the response variable, growth, and we calculated the log-likelihood earlier as −23.98941. Only one parameter was estimated from the data for these calculations (the mean value of y), so p = 1. This means that AIC should be

AIC = −2 × (−23.98941) + 2 × (1 + 1) = 51.97882.
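As a quick check, the arithmetic is easy to reproduce at the R prompt, using the log-likelihood value from the earlier worked example:

loglik <- -23.98941  # log-likelihood calculated by hand on p. 217
p <- 1               # one estimated parameter (the mean of y)
-2 * loglik + 2 * (p + 1)

[1] 51.97882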

Fortunately, we do not need to carry out these calculations, because there is a built-in function for calculating AIC. It takes a model object as its argument, so we need to fit a one-parameter model to the growth data like this:

model<-lm(growth~1)

Then we can get the AIC directly:

AIC(model)

[1] 51.97882
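The real value of AIC is in comparing models. As a sketch of how this works (model2 is our own illustrative name, not part of the worked example), we could fit a second model that uses tannin as an explanatory variable and pass both models to AIC, which accepts several model objects at once and returns a table of their degrees of freedom and AIC values; the model with the lower AIC is preferred:

model2 <- lm(growth ~ tannin)  # illustrative second model using the other variable
AIC(model, model2)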

AIC as a measure of the fit of a model

The more parameters there are in the model, the better the fit. You could obtain a perfect fit if you had a separate parameter for every data point, but such a model would have no explanatory power.
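As a minimal sketch of this point (the saturated model below is our own illustration, not part of the text's example), fitting a separate parameter for every one of the nine data points drives the residual sum of squares to zero:

saturated <- lm(growth ~ factor(1:9))  # one parameter per data point
sum(resid(saturated)^2)                # effectively zero: a "perfect" fit

Such a model fits the sample perfectly but tells you nothing general about growth, which is precisely the kind of overfitting that the 2(p + 1) penalty in AIC guards against.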
