Splitting a dataset

When a dataset contains categorical variables, we often want to create a group for each level and analyze the groups separately to reveal similarities and differences between them.

The split() function divides data into groups defined by a factor or vector. The unsplit() function reverses the effect of split().
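As a minimal sketch of this round trip, the following uses a small made-up data frame. The name df and its values are assumptions for illustration only and are not the auto-mpg.csv data used in this recipe:

```r
# Hypothetical stand-in data; not the auto-mpg.csv file used in the recipe
df <- data.frame(
  mpg = c(28, 19, 36, 18, 14, 15),
  cylinders = c(4, 6, 4, 6, 8, 8)
)

# split() returns a list with one data frame per distinct value of cylinders
groups <- split(df, df$cylinders)
names(groups)   # group names are the grouping values, stored as character strings

# unsplit() puts the pieces back together in the original row order
restored <- unsplit(groups, df$cylinders)
all(restored$mpg == df$mpg)
```

The same grouping vector is passed to both calls, which is how unsplit() knows where each row belongs.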

Getting ready

Download the files for this chapter and store the auto-mpg.csv file in your R working directory. Read the file using the read.csv() function and save it in the auto variable:

> auto <- read.csv("auto-mpg.csv", stringsAsFactors=FALSE)

How to do it...

Split the data by the number of cylinders using the following command:

> carslist <- split(auto, auto$cylinders)
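One way to inspect and use a list produced this way is sketched below. So that the example runs on its own, it builds a small made-up data frame rather than reading auto-mpg.csv; the names cars_demo and cars_by_cyl and all values are hypothetical:

```r
# Hypothetical stand-in for the auto data frame (values are made up)
cars_demo <- data.frame(
  mpg = c(28, 19, 36, 18, 14, 16),
  cylinders = c(4, 6, 4, 6, 8, 8)
)
cars_by_cyl <- split(cars_demo, cars_demo$cylinders)

names(cars_by_cyl)          # one list element per cylinder count
cars_by_cyl[["4"]]          # rows with 4 cylinders; the index is a character string
sapply(cars_by_cyl, nrow)   # number of rows in each group

# Apply the same analysis to every group, here the mean mpg
mean_mpg <- sapply(cars_by_cyl, function(g) mean(g$mpg))
mean_mpg
```

Because each list element is an ordinary data frame, any function that accepts a data frame can be applied to the groups in this way.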

How it works...

The split(auto, auto$cylinders) function returns a ...
