Chapter 4. Data Manipulation

This chapter covers the following topics:

  • Enhancing a data.frame with a data.table
  • Managing data with a data.table
  • Performing fast aggregation with a data.table
  • Merging large datasets with a data.table
  • Subsetting and slicing data with dplyr
  • Sampling data with dplyr
  • Selecting columns with dplyr
  • Chaining operations in dplyr
  • Arranging rows with dplyr
  • Eliminating duplicated rows with dplyr
  • Adding new columns with dplyr
  • Summarizing data with dplyr
  • Merging data with dplyr

Introduction

Most R users will agree that data frames provide a flexible and expressive structure for tabular data. While data frames work well for small datasets, they are not ideal for processing data larger than a gigabyte. Additionally, ...
