Piecewise Regression

This kind of regression fits different functions over different ranges of the explanatory variable. For example, it might fit different linear regressions to the left- and right-hand halves of a scatterplot. Two important questions arise in piecewise regression:

  • how many segments to divide the line into;
  • where to position the break points on the x axis.

Suppose we want to do the simplest piecewise regression, using just two linear segments. Where do we break up the x values? A simple, pragmatic view is to divide the x values at the point where the piecewise regression best fits the response variable (for example, the break point that leaves the smallest residual variation). Let's take an example using a linear model where the response is the log of a count (the number of species recorded) and the explanatory variable is the log of the size of the area searched for the species:

data<-read.table("c:\\temp\\sasilwood.txt",header=TRUE)  # first row holds the column names
attach(data)  # makes Species and Area available by name
names(data)

[1] "Species" "Area"

A quick scatterplot suggests that the relationship between log(Species) and log(Area) is not linear:

plot(log(Species)~log(Area),pch=16)

The slope appears to be shallower at small scales than at large scales. The overall regression highlights this at the model-checking stage:

model1<-lm(log(Species)~log(Area))
plot(log(Area),resid(model1))

The residuals are very strongly U-shaped, whereas for a well-specified model this plot should show no pattern at all (it should look like the sky at night).
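To see why a change of slope produces this pattern, here is a minimal sketch using simulated data rather than the sasilwood.txt file. The variable names (log_area, log_species, sim_model) and all the numbers are assumptions made up for illustration: the simulated relationship has a shallow slope below log_area = 2 and a steeper slope above it, and a single straight line is then fitted through it.

```r
# Simulated stand-in for the species-area data (hypothetical values,
# not the contents of sasilwood.txt).
set.seed(42)
log_area <- seq(-4, 8, length.out = 100)

# Shallow slope below log_area = 2, steeper slope above it, plus noise
log_species <- 1 + 0.1 * log_area + 0.4 * pmax(log_area - 2, 0) +
  rnorm(100, sd = 0.1)

# One straight line through data that really has two slopes
sim_model <- lm(log_species ~ log_area)
res <- resid(sim_model)

# Residuals against the explanatory variable, with a reference line at
# zero and a lowess smoother to make any systematic shape visible
plot(log_area, res, pch = 16)
abline(h = 0, lty = 2)
lines(lowess(log_area, res), col = "red")
```

In this simulation the residuals are positive at both ends of the x axis and negative in the middle, which is the U shape described above.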

If we are to use piecewise regression, then we need to work out how many straight-line segments to use and where to put the breaks. Visual ...
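One pragmatic way to choose a single break point is to try each candidate value in turn, fit a separate straight line on either side of it, and keep the break that gives the smallest residual sum of squares. The following is a minimal sketch under stated assumptions, not necessarily the procedure used for the data above: the data are simulated, and the names log_area, log_species, candidates, rss and best_break are hypothetical.

```r
# Simulated data with a true change of slope at log_area = 2
# (hypothetical values for illustration only).
set.seed(1)
log_area <- sort(runif(120, -4, 8))
log_species <- ifelse(log_area < 2,
                      1.0 + 0.1 * log_area,
                      0.2 + 0.5 * log_area) + rnorm(120, sd = 0.1)

# Candidate break points: the observed x values, trimmed at both ends
# so that each segment always contains a few points
candidates <- sort(unique(log_area))
candidates <- candidates[5:(length(candidates) - 5)]

# For each candidate, fit separate intercepts and slopes on the two
# sides of the break and record the residual sum of squares
rss <- sapply(candidates, function(b) {
  right <- log_area >= b
  fit <- lm(log_species ~ log_area * right)
  sum(resid(fit)^2)
})

# The break with the smallest residual sum of squares
best_break <- candidates[which.min(rss)]
best_fit <- lm(log_species ~ log_area * (log_area >= best_break))

# Profile of fit quality against break position
plot(candidates, rss, type = "l")
abline(v = best_break, lty = 2)
```

The interaction with the logical indicator gives each segment its own intercept and slope, so the two fitted lines are not forced to meet at the break. Plotting rss against the candidate values shows how sharply the minimum is defined, which is worth checking before trusting a single estimated break.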
