Chapter 5. Box Plots

The Box Plot

Sometimes, it can be helpful to look at summary information about a group of numbers instead of the numbers themselves. One type of graph that does this by breaking the data into well-defined ranges of numbers is the box plot. We will try this graph on a relatively large dataset, one with which our previous types of graphs do not work very well.
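As a rough illustration of what "well-defined ranges" means, the sketch below uses a small made-up vector (not data from this book) and base R's fivenum(), which returns the minimum, lower hinge, median, upper hinge, and maximum. These are the values a basic box plot draws when no points are flagged as outliers.

```r
# Hypothetical data, invented for illustration only
x <- c(2, 4, 4, 5, 7, 8, 9, 11, 12, 20)

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
print(five)  # 2.0 4.0 7.5 11.0 20.0

# The same summary drawn as a horizontal box plot
boxplot(x, horizontal = TRUE, main = "Box plot of a small sample",
  xlab = "Value")
```

The box spans the two hinges, the line inside it marks the median, and the whiskers reach the minimum and maximum here because no value lies far enough from the box to be drawn separately.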

There are some interesting datasets in the nlme package. Get this package and load it by using these commands:

> install.packages("nlme")
> library(nlme)

Next, take a look at the MathAchieve dataset. With more than 7,000 rows, this is much larger than the datasets we have dealt with previously. What problems will this create for us if we want to examine the distribution of MathAch scores? Let’s see what happens with a strip chart of this data.
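Before plotting, it may help to check the size and structure of the data. The following is a minimal sketch using only base R functions and assumes that nlme is installed:

```r
library(nlme)  # provides the MathAchieve data frame

n_rows <- nrow(MathAchieve)    # number of observations
print(n_rows)

str(MathAchieve)               # variable names and types
summary(MathAchieve$MathAch)   # quartiles, mean, and extremes of the scores
```

The summary() output gives a first numeric impression of the MathAch distribution, which the graphs that follow try to show visually.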

In the code that produces Figure 5-1, as in many following examples, the mfrow argument is used in par() to make multiple graphs appear on one page. The format is mfrow = c(i,j), where i is the number of rows of graphs and j is the number of columns:

# Figure 5-1
library(nlme)
par(mfrow=c(2,1)) # set up one graph above another: 2 rows/1 col

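# pch = 19 (a number) selects a solid circle;
# the string "19" would plot the character "1" instead
stripchart(MathAchieve$MathAch, method = "jitter",
  main = "a. Math Ach Scores, pch = 19",
  xlab = "Scores", pch = 19)

# pch = "." plots each point as a very small dot
stripchart(MathAchieve$MathAch, method = "jitter",
  main = "b. Math Ach Scores, pch = '.'",
  xlab = "Scores", pch = ".")

Figure 5-1. Strip charts of math achievement scores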

These strip charts show the results ...
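The mfrow setting used for Figure 5-1 stays in effect for later graphs until it is changed. One common way to handle this, sketched here with simulated data rather than the book's dataset, is to save the value that par() returns and restore it afterward. The variable names old_par and x are chosen only for this example.

```r
set.seed(1)       # make the simulated data reproducible
x <- rnorm(200)   # simulated values, for illustration only

# par() returns the previous settings, so they can be restored later
old_par <- par(mfrow = c(1, 2))  # 1 row, 2 columns: graphs side by side

stripchart(x, method = "jitter", main = "Strip chart", xlab = "Value")
hist(x, main = "Histogram", xlab = "Value")

par(old_par)  # restore the earlier layout
```

Swapping the two numbers, as in mfrow = c(2, 1), stacks the graphs one above the other, as in Figure 5-1.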
