Incidence functions

In this example, the response variable is called incidence; a value of 1 means that an island was occupied by a particular species of bird, and 0 means that the bird did not breed there. The explanatory variables are the area of the island (km²) and the isolation of the island (distance from the mainland, km).

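island<-read.table("c:\\temp\\isolation.txt",header=TRUE)   # first row holds the column names
attach(island)   # makes the columns visible by name in the model formulas below
names(island)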

[1]  "incidence"  "area"  "isolation"

Both explanatory variables are continuous, so the appropriate analysis is multiple regression. The response is binary, so we fit it as a logistic regression: a generalized linear model with binomial errors, which uses the logit link by default.
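As a minimal sketch of what the logit link does, the snippet below converts a linear predictor on the log-odds scale into a probability of occupancy. The coefficient values and the example island are hypothetical, chosen only to make the arithmetic easy to follow; they are not estimates from the island data.

```r
# Inverse logit: maps a linear predictor (log-odds) to a probability.
# All numbers below are hypothetical and for illustration only.
b0     <-  1.0   # intercept (assumed)
b_area <-  0.5   # change in log-odds per km^2 of area (assumed)
b_isol <- -1.0   # change in log-odds per km of isolation (assumed)

area_km2     <- 4   # example island: 4 km^2
isolation_km <- 3   # example island: 3 km from the mainland

eta       <- b0 + b_area * area_km2 + b_isol * isolation_km  # linear predictor
p_manual  <- 1 / (1 + exp(-eta))                             # inverse logit by hand
p_builtin <- plogis(eta)                                     # same calculation in base R

print(c(eta = eta, p = p_builtin))
```

Here the linear predictor is 1 + 2 - 3 = 0, which corresponds to a probability of 0.5. Positive values of the linear predictor give probabilities above 0.5 and negative values give probabilities below it.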

We begin by fitting a complex model involving an interaction between isolation and area:

model1<-glm(incidence~area*isolation,binomial)

Then we fit a simpler model with only main effects for isolation and area:

model2<-glm(incidence~area+isolation,binomial)
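The two fits above rely on attach(island) to find the variables. An equivalent style, sketched below, passes the data frame through the data argument and derives the simpler model from the complex one with update. Because isolation.txt is not reproduced here, the sketch uses simulated data; the data frame sim, the sample size, and the effect sizes used to generate it are all assumptions made for illustration, so its output will not match the results shown in this section.

```r
set.seed(1)   # reproducible simulated data; NOT the book's isolation.txt
n <- 50
sim <- data.frame(
  area      = runif(n, 0.1, 9),   # hypothetical island areas (km^2)
  isolation = runif(n, 2, 10)     # hypothetical distances from the mainland (km)
)
# Hypothetical true effects, used only to generate a 0/1 response
sim$incidence <- rbinom(n, size = 1,
                        prob = plogis(1 + 0.5 * sim$area - 0.6 * sim$isolation))

# Full model with the interaction, then the main-effects model obtained
# by removing the interaction term from it
fit_full <- glm(incidence ~ area * isolation, family = binomial, data = sim)
fit_main <- update(fit_full, . ~ . - area:isolation)

print(formula(fit_main))
```

Using update makes it explicit that the second model differs from the first only by the interaction term, which is the term the subsequent comparison tests.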

We now compare the two models using anova, which for generalized linear models produces an analysis of deviance table:

anova(model1,model2,test="Chisq")

Analysis of Deviance Table

Model 1: incidence ~ area * isolation
Model 2: incidence ~ area + isolation
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1        46    28.2517
2        47    28.4022 -1  -0.1504    0.6981
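The p value in the last column can be checked by hand: the change in residual deviance between the two models is referred to a chi-squared distribution with degrees of freedom equal to the change in residual degrees of freedom. The sketch below uses only the numbers printed in the table; the variable names are illustrative.

```r
# Values copied from the analysis of deviance table above
dev_interaction <- 28.2517   # residual deviance of model 1 (46 d.f.)
dev_main        <- 28.4022   # residual deviance of model 2 (47 d.f.)

delta_dev <- dev_main - dev_interaction   # increase in deviance from dropping the interaction
delta_df  <- 47 - 46                      # one parameter removed

# Upper-tail probability of a chi-squared variate with delta_df degrees of freedom
p_value <- pchisq(delta_dev, df = delta_df, lower.tail = FALSE)
print(round(p_value, 3))
```

The result is approximately 0.698, in agreement with the table. The small discrepancy in the fourth decimal place arises because the deviances in the table are themselves rounded.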

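Dropping the interaction saves one degree of freedom and increases the residual deviance by only 0.15 (p = 0.698), so the simpler model is not significantly worse. We accept it for the time being, and inspect the parameter estimates and standard errors: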

summary(model2)

Call:
glm(formula = incidence ~ area + isolation, family = binomial)

Deviance Residuals:
    Min        1Q   Median       3Q      Max
-1.8189   -0.3089   0.0490   0.3635   2.1192
Coefficients: Estimate Std.Error Z ...
