Chapter 8

Formula Notation and Complex Statistics

What you will learn in this chapter:

  • How to use formula notation for simple hypothesis tests
  • How to use formula notation in graphics
  • How to carry out analysis of variance (ANOVA)
  • How to conduct post-hoc tests
  • How the formula syntax can be used to define complex analytical models
  • How to carry out complex ANOVA
  • How to draw summary graphs of ANOVA
  • How to create interaction plots

The R program has great analytical power, but most of the situations you have seen so far are fairly simple. In the real world things are usually more complicated, and you need a way to describe these more complex situations so that R can carry out the appropriate analytical routines. R provides a special formula syntax for this purpose. You have already met the ~ symbol; you used it as an alternative way to describe your data in simple hypothesis tests (see Chapter 6, “Simple Hypothesis Testing”) and when visualizing the results graphically (see Chapter 7, “Introduction to Graphical Analysis”). The same formula syntax lets you define more complex models, which is useful because much of the data you need to analyze involves more than a comparison of two samples. In essence, you put the response variable on the left of the ~ and the predictor variable(s) on the right, like so:

response ~ predictor.1 + predictor.2

In this syntax you simply link the predictor variables using a + sign, but you ...
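As a minimal sketch of the response ~ predictor pattern, the following uses the ToothGrowth data set that ships with R. The data set and the object names tt and fit are assumptions chosen for illustration and do not come from this chapter. The first call uses a single predictor in a t-test; the second links two predictors with a + sign in an analysis of variance.

```r
# ToothGrowth is a built-in data set (an illustrative choice, not the
# chapter's own data): len is the response, supp and dose are predictors.
data(ToothGrowth)

# One predictor: the formula form of a two-sample t-test.
tt <- t.test(len ~ supp, data = ToothGrowth)

# dose is stored as a number, so convert it to a factor before
# treating it as a grouping variable.
ToothGrowth$dose <- factor(ToothGrowth$dose)

# Two predictors linked with +, fitted as an analysis of variance.
fit <- aov(len ~ supp + dose, data = ToothGrowth)

print(tt)
print(summary(fit))
```

In both calls the data argument tells R where to look up the variable names used in the formula, so the formula itself stays short and readable.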
