Plots with multiple variables

Initial data inspection using plots is even more important when there are many variables, any one of which might contain mistakes or omissions. The principal plot functions when there are multiple variables are:

  • pairs for a matrix of scatterplots of every variable against every other;
  • coplot for conditioning plots where y is plotted against x for different values of z;
  • xyplot (from the lattice package) for a set of panel plots.

We illustrate these functions with the ozone data.
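As a minimal sketch of the two functions that are not covered in the next section, the code below calls coplot and xyplot on a stand-in dataframe. It assumes that the built-in airquality dataset can substitute for the ozone file; the object name oz, the renamed columns and the three-band split of temp are illustrative choices, not part of the original example.

```r
library(lattice)  # xyplot is provided by the lattice package

# Assumption: the built-in airquality data stands in for the ozone file;
# columns are renamed to rad, temp, wind and ozone for this illustration.
oz <- na.omit(airquality[, c("Solar.R", "Temp", "Wind", "Ozone")])
names(oz) <- c("rad", "temp", "wind", "ozone")

pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

# coplot: ozone (y) against wind (x), one panel per range of temp (z)
coplot(ozone ~ wind | temp, data = oz)

# xyplot: one panel per level of a factor; temp is cut into three bands
# (the number of bands is an arbitrary, illustrative choice)
oz$temp_band <- cut(oz$temp, breaks = 3)
p <- xyplot(ozone ~ wind | temp_band, data = oz)
print(p)  # lattice plots are objects and are drawn when printed

invisible(dev.off())
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so that the plots appear in the usual graphics window.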

The pairs function

With two or more continuous explanatory variables (i.e. in a multiple regression; see p. 433) it is valuable to be able to check for subtle dependencies between the explanatory variables. The pairs function plots every variable in the dataframe on the y axis against every other variable on the x axis; the following example shows what this means:

ozonedata<-read.table("c:\\temp\\ozone.data.txt",header=TRUE)
attach(ozonedata)
names(ozonedata)

[1] "rad"   "temp"  "wind"  "ozone"

The pairs function needs only the name of the whole dataframe as its first argument. We exercise the option to add a non-parametric smoother to the scatterplots:

pairs(ozonedata,panel=panel.smooth)
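The call above depends on a local data file. The following sketch repeats it on a stand-in dataframe and then uses a small recording panel function to show which variable pairs passes as x and which as y in each off-diagonal panel. The use of the built-in airquality data, and the names oz, which_var, record_panel and seen, are assumptions made for this illustration only.

```r
# Assumption: the built-in airquality data stands in for the ozone file;
# columns are renamed to rad, temp, wind and ozone for this illustration.
oz <- na.omit(airquality[, c("Solar.R", "Temp", "Wind", "Ozone")])
names(oz) <- c("rad", "temp", "wind", "ozone")

pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

# Same form of call as in the text: a scatterplot matrix with a smoother
pairs(oz, panel = panel.smooth)

# Hypothetical helper: find the column of oz whose values equal v
which_var <- function(v) {
  hit <- vapply(oz, function(col) {
    isTRUE(all.equal(as.numeric(col), as.numeric(v)))
  }, logical(1))
  names(oz)[hit]
}

# Hypothetical panel function: draw the points and record "y ~ x"
seen <- character(0)
record_panel <- function(x, y, ...) {
  points(x, y, ...)
  seen <<- c(seen, paste(which_var(y), "~", which_var(x)))
}

pairs(oz, panel = record_panel)
invisible(dev.off())

# Each entry reads "row variable ~ column variable"
print(seen)
```

Every recorded entry has the row variable on the left of the tilde, which confirms that the variable named in a row is the one drawn on the y axis of the panels in that row.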

The response variables are named in the rows and the explanatory variables are named in the columns. In the upper row, labelled rad, the response variable (on the y axis) is solar radiation. ...
