A Four-Class Table of Counts

Mendel's famous peas produced 315 yellow round, 101 yellow wrinkled, 108 green round and 32 green wrinkled offspring (a total of 556):

observed<-c(315,101,108,32)

The question is whether these data depart significantly from the 9:3:3:1 expectation that would arise if there were two independent 3:1 segregations (with round seeds dominant over wrinkled, and yellow seeds dominant over green).

Because the null hypothesis is not a 25:25:25:25 distribution across the four categories, we need to calculate the expected frequencies explicitly:

(expected<-556*c(9,3,3,1)/16)

[1] 312.75 104.25 104.25  34.75
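To see the two sets of counts side by side, the following is a minimal sketch that tabulates observed and expected frequencies together with their differences. The phenotype labels and the names `phenotype` and `comparison` are assumptions added for illustration; they do not appear in the original code.

```r
observed <- c(315, 101, 108, 32)
expected <- 556 * c(9, 3, 3, 1) / 16

# Phenotype labels are added here for illustration only,
# in the same order as the counts given in the text
phenotype <- c("yellow round", "yellow wrinkled",
               "green round", "green wrinkled")

# One row per phenotype: observed count, expected count, and their difference
comparison <- data.frame(phenotype, observed, expected,
                         difference = observed - expected)
comparison

# The expected counts should add up to the total number of offspring
sum(expected)
```

The differences are only a few seeds in each class, and they sum to zero because observed and expected counts share the same total.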

The expected frequencies are very close to the observed frequencies in Mendel's experiment, but we need to quantify the difference between them and ask how likely such a difference is to arise by chance alone:

chisq.test(observed,p=c(9,3,3,1),rescale.p=TRUE)

          Chi-squared test for given probabilities

data:     observed
X-squared = 0.47, df = 3, p-value = 0.9254
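The `p` argument of `chisq.test` expects probabilities, so the ratio 9:3:3:1 is accepted only because `rescale.p=TRUE` divides it by its sum. As a sketch of an equivalent call, the ratio can be converted to probabilities by hand; the names `p_null`, `explicit` and `rescaled` are assumptions introduced here for illustration.

```r
observed <- c(315, 101, 108, 32)

# Convert the 9:3:3:1 ratio to probabilities that sum to 1
p_null <- c(9, 3, 3, 1) / 16

# Probabilities supplied explicitly, so no rescaling is needed
explicit <- chisq.test(observed, p = p_null)

# Ratio supplied as-is, with chisq.test asked to rescale it
rescaled <- chisq.test(observed, p = c(9, 3, 3, 1), rescale.p = TRUE)

# Both calls should report the same statistic and p-value
all.equal(unname(explicit$statistic), unname(rescaled$statistic))
all.equal(explicit$p.value, rescaled$p.value)
```

Either form should work; the rescaled version saves a division and keeps the genetic ratio visible in the call.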

Note that the four phenotypes are given different probabilities through the argument p=c(9,3,3,1). Because these values are ratios that do not sum to 1.0, we require the extra argument rescale.p=TRUE. If the null hypothesis is true, a difference as big as or bigger than the one observed will arise by chance alone in more than 92% of cases, so the departure is clearly not statistically significant. The chi-squared value is

sum((observed-expected)^2/expected)

[1] 0.470024

and the p-value comes from the right-hand tail of the cumulative probability function of the chi-squared distribution ...
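One way to carry out this right-tail calculation is sketched below, using `pchisq` with `lower.tail = FALSE` and three degrees of freedom (four classes minus one). The names `chi_sq`, `deg_free` and `p_value` are assumptions chosen for illustration, and this is not necessarily the form the original text goes on to use.

```r
observed <- c(315, 101, 108, 32)
expected <- 556 * c(9, 3, 3, 1) / 16

# Pearson chi-squared statistic, as computed in the text
chi_sq <- sum((observed - expected)^2 / expected)

# Four categories with a fully specified null give 4 - 1 = 3 degrees of freedom
deg_free <- length(observed) - 1

# Area under the chi-squared density to the right of the statistic
p_value <- pchisq(chi_sq, df = deg_free, lower.tail = FALSE)
p_value
```

The result should agree with the p-value of 0.9254 reported by chisq.test.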
