Applying functions to subsets of a vector

The tapply function applies a function to each group of values in a vector, where the groups are defined by the levels of a factor. Hence, when we need to evaluate a function over subsets of a vector, tapply comes in handy.
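As a quick illustration of the idea, here is a minimal sketch on made-up data. The names scores and groups are hypothetical and are not part of the recipe's dataset; they only show how tapply splits a vector by a factor and applies a function to each piece.

```r
# Hypothetical toy data: six values and the group each value belongs to
scores <- c(10, 20, 30, 40, 50, 60)
groups <- factor(c("a", "b", "a", "b", "c", "c"))

# tapply splits scores by the levels of groups and applies mean to each subset
group_means <- tapply(scores, groups, mean)

# The result is a named array with one entry per level: a = 20, b = 30, c = 55
print(group_means)
```

The result carries the factor levels as names, so an individual group can be looked up with, for example, group_means["b"].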

Getting ready

Download the files for this chapter and store the auto-mpg.csv file in your R working directory. Read the data and convert the cylinders variable to a factor:

> auto <- read.csv("auto-mpg.csv", stringsAsFactors=FALSE)
> auto$cylinders <- factor(auto$cylinders, levels = c(3,4,5,6,8), labels = c("3cyl", "4cyl", "5cyl", "6cyl", "8cyl"))
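To see what the levels and labels arguments do without needing the CSV file, the following sketch applies the same factor() call to a short made-up vector. The vector cyl is a hypothetical stand-in for auto$cylinders, not data from the file.

```r
# Hypothetical cylinder counts standing in for auto$cylinders
cyl <- c(4, 8, 6, 4, 3)

# levels lists the raw values to recognize; labels gives each one a readable name
cyl_factor <- factor(cyl,
                     levels = c(3, 4, 5, 6, 8),
                     labels = c("3cyl", "4cyl", "5cyl", "6cyl", "8cyl"))

# Each raw value is replaced by its label: "4cyl" "8cyl" "6cyl" "4cyl" "3cyl"
print(as.character(cyl_factor))

# All five levels are kept, including "5cyl", which has a count of 0 here
cyl_counts <- table(cyl_factor)
print(cyl_counts)
```

Note that a level with no observations stays in the factor. When such a factor is passed to tapply, the empty group appears in the result as NA by default.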

How to do it...

To apply functions to subsets of a vector, follow these steps:

  1. Calculate mean mpg for each cylinder type:
    > tapply(auto$mpg, auto$cylinders, mean)
    3cyl 4cyl 5cyl 6cyl 8cyl ...
