R Cookbook by Paul Teetor

Chapter 6. Data Transformations

Introduction

This chapter is all about the apply functions: apply, lapply, sapply, tapply, mapply; and their cousins, by and split. These functions let you take data in great gulps and process the whole gulp at once. Where traditional programming languages use loops, R uses vectorized operations and the apply functions to crunch data in batches, greatly streamlining the calculations.
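As a rough illustration of the contrast between loops and batch processing, the sketch below squares each element of a small vector three ways. The vector `x` and the result names are made up for this example and are not taken from the recipes.

```r
# Illustrative data (hypothetical, chosen only for this sketch)
x <- c(40, 2, 83, 28, 58)

# Loop style: preallocate the result, then fill it one element at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized style: the operator works on the whole vector at once
squares_vec <- x^2

# Apply style: sapply calls the function on each element and
# simplifies the results back into a vector
squares_sapply <- sapply(x, function(e) e^2)
```

All three approaches should produce the same values; the vectorized and apply forms simply state the intent in a single expression.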

Defining Groups Via a Factor

An important idiom of R is using a factor to define a group. Suppose we have a vector and a factor, both of the same length, that were created as follows:

> v <- c(40,2,83,28,58)
> f <- factor(c("A","C","A","B","C"))

We can visualize the vector elements and factor levels side by side, like this:

Vector  Factor
40      A
2       C
83      A
28      B
58      C

The factor level identifies the group of each vector element: 40 and 83 are in group A; 28 is in group B; and 2 and 58 are in group C.
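One way to see this grouping directly is with split, one of the functions named in the introduction. The following is a minimal sketch that assumes the factor levels shown in the side-by-side listing (A, C, A, B, C); the name `groups` is chosen only for this example.

```r
# The vector and its grouping factor, matching the side-by-side listing
v <- c(40, 2, 83, 28, 58)
f <- factor(c("A", "C", "A", "B", "C"))

# split returns a list with one element per factor level;
# each element holds the vector values belonging to that level
groups <- split(v, f)

# groups$A is c(40, 83), groups$B is 28, groups$C is c(2, 58)
print(groups)
```

The list elements are named after the factor levels, so each group can be pulled out by name.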

In this book, I refer to such factors as grouping factors. They effectively slice and dice our data by putting them into groups. This is powerful because processing data in groups occurs often in statistics when comparing group means, comparing group proportions, performing ANOVA, and so forth.
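Comparing group means, for instance, could be sketched with tapply, which applies a function to each group defined by a factor. The data repeat the small example above; the name `group_means` is an assumption made for this illustration.

```r
# Same illustrative vector and grouping factor as in the example above
v <- c(40, 2, 83, 28, 58)
f <- factor(c("A", "C", "A", "B", "C"))

# tapply groups v by the levels of f and calls mean on each group
group_means <- tapply(v, f, mean)

# Means by level: A = 61.5, B = 28, C = 30
print(group_means)
```

The result is indexed by factor level, so `group_means["A"]` retrieves the mean for group A.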

This chapter has recipes that use grouping factors to split vector elements into their respective groups (Recipe 6.1), apply a function to groups within a vector (Recipe 6.5), and apply a function to groups of rows within a data frame (Recipe 6.6). In other chapters, the same idiom is used to test ...
