A Contingency Table of Intermediate Complexity

We start with a three-dimensional table of count data from college records. It is a contingency table with two levels of year (freshman and sophomore), two levels of discipline (arts and science), and two levels of gender (male and female):

numbers <- c(24, 30, 29, 41, 14, 31, 36, 35)

The statistical question is whether the relationship between gender and discipline varies between freshmen and sophomores (i.e. we want to know the significance of the three-way interaction between year, discipline and gender).

The first task is to define the dimensions of numbers using the dim function.

dim(numbers) <- c(2, 2, 2)
numbers

, , 1

      [,1]   [,2]
[1,]    24     29
[2,]    30     41

, , 2

      [,1]   [,2]
[1,]    14     36
[2,]    31     35
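
The layout of this printout follows from the way R fills an array: the first subscript varies fastest, then the second, then the third. A minimal sketch that checks this on the same data (the object name same is introduced here only for illustration):

```r
# Rebuild the array from the text
numbers <- c(24, 30, 29, 41, 14, 31, 36, 35)
dim(numbers) <- c(2, 2, 2)

# The first subscript (rows) advances first, the third subscript last
numbers[1, 1, 1]  # 24, the 1st value supplied
numbers[2, 1, 1]  # 30, the 2nd value
numbers[1, 2, 1]  # 29, the 3rd value
numbers[1, 1, 2]  # 14, the 5th value

# array() builds the same object in a single step
same <- array(c(24, 30, 29, 41, 14, 31, 36, 35), dim = c(2, 2, 2))
identical(numbers, same)
```

If the counts had been recorded in a different order, the vector would need to be rearranged before setting dim, because R does not know which count belongs to which cell.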

The top table refers to the males [,,1] and the bottom table to the females [,,2]. Within each table, the rows are the year groups and the columns are the disciplines. It would make the table much easier to understand if we provided these dimensions with names using the dimnames function:

dimnames(numbers)[[3]] <- list("male", "female")
dimnames(numbers)[[2]] <- list("arts", "science")
dimnames(numbers)[[1]] <- list("freshman", "sophomore")
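
Once the levels are named, cells can be extracted by name rather than by position. The following is a sketch only: it rebuilds the array with array() and assigns all three sets of names in one call using character vectors, which is an alternative to the three separate assignments above rather than the book's own code.

```r
# Rebuild the 2 x 2 x 2 table of counts
numbers <- array(c(24, 30, 29, 41, 14, 31, 36, 35), dim = c(2, 2, 2))

# Assign the level names for all three dimensions at once
dimnames(numbers) <- list(
  c("freshman", "sophomore"),  # dimension 1: year
  c("arts", "science"),        # dimension 2: discipline
  c("male", "female")          # dimension 3: gender
)

# Index by level name instead of by position
numbers["freshman", "science", "female"]  # 36
numbers["sophomore", "arts", "male"]      # 30

# A whole slice: the year-by-discipline table for males
numbers[, , "male"]
```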

To see this as a flat table, use the ftable function like this:

ftable(numbers)

                   male   female
freshman   arts      24       14
           science   29       36
sophomore  arts      30       31
           science   41       35

Note that the dimnames are the factor levels (e.g. male or female), not the names of the factors (e.g. gender).
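
To make the distinction concrete: the levels are the elements of each dimnames component, while the factor names, if supplied at all, are the names of the dimnames list itself. A minimal sketch, assuming we want to attach the factor names year, discipline and gender (this step is an illustration, not necessarily how the text proceeds):

```r
# Rebuild the 2 x 2 x 2 table of counts
numbers <- array(c(24, 30, 29, 41, 14, 31, 36, 35), dim = c(2, 2, 2))

# A named list supplies both the factor names (list names)
# and the factor levels (list elements)
dimnames(numbers) <- list(
  year       = c("freshman", "sophomore"),
  discipline = c("arts", "science"),
  gender     = c("male", "female")
)

names(dimnames(numbers))   # the factor names
dimnames(numbers)$gender   # the levels of one factor

# ftable now labels the rows and columns with the factor names as well
ftable(numbers)
```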

We convert this table ...
