Merging data

Merging data enables us to understand how different data sources relate to each other. The merge function in R is similar to the join operation in a database: it combines the columns of two datasets by matching rows on values that the datasets have in common.
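To make the join analogy concrete, here is a minimal sketch using two toy data frames. The names left and right, the id key, and all values are made up for illustration and are not part of the recipe's data. By default merge keeps only rows whose key appears in both inputs, much like an inner join; setting all.x = TRUE keeps every row of the first input, much like a left outer join.

```r
# Hypothetical toy data frames; the names and values are invented for illustration.
left  <- data.frame(id = c(1, 2, 3), name = c("a", "b", "c"))
right <- data.frame(id = c(2, 3, 4), score = c(10, 20, 30))

# Default behaviour: keep only rows whose id appears in both inputs
inner <- merge(left, right, by = "id")

# all.x = TRUE: keep every row of `left`, filling unmatched columns with NA
left_outer <- merge(left, right, by = "id", all.x = TRUE)

print(inner)
print(left_outer)
```

Here inner contains only the ids present in both data frames, while left_outer also keeps the id that has no match in right, with score set to NA.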

Getting ready

Refer to the Converting data types recipe and convert each attribute of the imported data into the proper data type. Also, rename the columns of the employees and salaries datasets by following the steps in the Renaming the data variable recipe.

How to do it…

Perform the following steps to merge salaries and employees:

  1. As employees and salaries share the emp_no column, we can merge these two datasets using emp_no as the join key:
    > employees_salary <- merge(employees, salaries, by = "emp_no")
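The step above can be tried without the imported files. The sketch below uses small stand-in data frames; apart from emp_no, the column names (first_name, salary) and every value are assumptions made for illustration, not the recipe's actual data. It shows that when an employee has several salary records, the merged result contains one row per matching record.

```r
# Hypothetical stand-ins for the recipe's datasets; all values are invented.
employees <- data.frame(
  emp_no     = c(10001, 10002),
  first_name = c("Ann", "Bob")
)
salaries <- data.frame(
  emp_no = c(10001, 10001, 10002),
  salary = c(60000, 62000, 65000)
)

# Same call as in the recipe: match rows on the shared emp_no column
employees_salary <- merge(employees, salaries, by = "emp_no")

# Employee 10001 has two salary records, so it appears twice in the result
print(employees_salary)
```

The key column comes first in the result, followed by the remaining columns of employees and then those of salaries.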
    
