Merging datasets

Besides the elementary operations on a single dataset described earlier, joining multiple data sources is one of the most common tasks in everyday work. The usual solution is to call the merge S3 method, which can perform the traditional SQL inner and left/right/full outer joins, summarized by C.L. Moffatt (2008) as follows:

[Figure: Merging datasets]
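
As a minimal sketch of how merge maps onto these SQL join types, the following uses two small, made-up data frames (customers and orders are hypothetical names, not taken from the book) that share an id column. The all, all.x, and all.y arguments of merge select the join type:

```r
# Two made-up data frames sharing the key column "id" (illustrative data only)
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cid"))
orders    <- data.frame(id = c(2, 3, 4), amount = c(10, 20, 30))

# Inner join: only ids present in both data frames
inner <- merge(customers, orders, by = "id")

# Left outer join: every row of customers, NA where no order matches
left <- merge(customers, orders, by = "id", all.x = TRUE)

# Right outer join: every row of orders, NA where no customer matches
right <- merge(customers, orders, by = "id", all.y = TRUE)

# Full outer join: every row from both data frames
full <- merge(customers, orders, by = "id", all = TRUE)
```

By default, merge sorts the result on the key columns, so the row order may differ from that of the inputs.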

The dplyr package provides simple functions for performing these join operations directly in R:

  • inner_join: This joins the variables of all the rows that are found in both datasets
  • left_join: This includes all the rows ...
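
The two functions named above can be tried on a small example. This is only a sketch: it assumes the dplyr package is installed, and the customers and orders data frames are made up for illustration.

```r
library(dplyr)

# Two made-up data frames sharing the key column "id" (illustrative data only)
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cid"))
orders    <- data.frame(id = c(2, 3, 4), amount = c(10, 20, 30))

# Keep only the rows whose id appears in both data frames
both <- inner_join(customers, orders, by = "id")

# Keep every row of customers; amount is NA where no order matches
all_customers <- left_join(customers, orders, by = "id")
```

Unlike merge, these functions keep the rows in the order of the first data frame rather than sorting on the key.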
