Aggregation

The most straightforward way to summarize data is the aggregate function from the stats package, which does exactly what we are looking for: it splits the data into subsets by a grouping variable, then computes summary statistics for each subset separately. The most basic way to call aggregate is to pass the numeric vector to be aggregated, a list of one or more grouping variables in the by argument to define the splits, and the function to apply to each subset in the FUN argument. Now, let's see the proportion of diverted flights on each weekday, which is simply the mean of the 0/1 Diverted indicator:

> aggregate(hflights$Diverted, by = list(hflights$DayOfWeek),
+   FUN = mean)
  Group.1           x
1       1 0.002997672
2       2 0.002559323
3       3 0.003226211
4       4 0.003065727
5       5 0.002687865
6       6 0.002823121
7       7 0.002589057

Well, it took ...
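
The split-and-summarize step behind the aggregate call shown earlier can also be tried without loading hflights. What follows is only a minimal sketch on a small made-up data frame: the flights object and its values are hypothetical and are not rows from the real dataset. It repeats the same by = call shape and places it next to the formula interface of aggregate, which returns the same group means but names the output columns after the variables rather than Group.1 and x.

```r
# Hypothetical toy data (NOT real hflights rows): weekday code and 0/1 diversion flag
flights <- data.frame(
  DayOfWeek = c(1, 1, 2, 2, 2, 3),
  Diverted  = c(0, 1, 0, 0, 1, 0)
)

# Same call shape as in the text: the vector to aggregate, a list of
# grouping variables in `by`, and the summary function in `FUN`
by_list <- aggregate(flights$Diverted, by = list(flights$DayOfWeek), FUN = mean)

# Formula interface: "Diverted grouped by DayOfWeek"; the result columns
# are named DayOfWeek and Diverted instead of Group.1 and x
by_formula <- aggregate(Diverted ~ DayOfWeek, data = flights, FUN = mean)

print(by_list)
print(by_formula)
```

Both calls compute one mean per weekday. The formula version is often easier to read when the result is passed on to further steps, because the column names survive.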
