Performing Reduce-Side Joins Using MapReduce

In this recipe, we are going to learn how to write a MapReduce program that joins records from two tables.

Getting ready

To perform this recipe, you should have a running Hadoop cluster as well as an IDE such as Eclipse.

How to do it

SQL offers several types of joins: inner join, left outer join, right outer join, full outer join, and so on. Performing joins in SQL is quite easy, but in MapReduce it is a little tricky. In this recipe, we will try to perform various join operations with a MapReduce program on the following dataset.

Consider two datasets: the Users table, which has information about userId, username, and deptId. We also ...
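
The general idea behind a reduce-side join can be sketched in plain Java, without the Hadoop API, as follows. This is only an illustrative simulation, not the recipe's actual code: the map step tags each record with the table it came from and emits it under the join key, the shuffle groups records by that key, and the reduce step pairs up records from the two sides. The Users fields (userId, username, deptId) come from the text above; the second table, with deptId and deptName, is an assumption made for this example, as are the class and method names and the sample rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {

    // "Map" step for Users: emit (deptId, tagged record).
    // The "U|" tag tells the reducer which table the value came from.
    static void mapUser(String line, Map<String, List<String>> shuffle) {
        String[] f = line.split(",");            // userId,username,deptId
        shuffle.computeIfAbsent(f[2], k -> new ArrayList<>())
               .add("U|" + f[0] + "," + f[1]);
    }

    // "Map" step for the hypothetical departments table: emit (deptId, tagged record).
    static void mapDept(String line, Map<String, List<String>> shuffle) {
        String[] f = line.split(",");            // deptId,deptName (assumed layout)
        shuffle.computeIfAbsent(f[0], k -> new ArrayList<>())
               .add("D|" + f[1]);
    }

    // "Reduce" step: all values for one deptId arrive together.
    // Split them by tag, then emit every user/department pair (inner join).
    static void reduce(List<String> values, List<String> out) {
        List<String> users = new ArrayList<>();
        List<String> depts = new ArrayList<>();
        for (String v : values) {
            if (v.startsWith("U|")) {
                users.add(v.substring(2));
            } else {
                depts.add(v.substring(2));
            }
        }
        for (String u : users) {
            for (String d : depts) {
                out.add(u + "," + d);            // userId,username,deptName
            }
        }
    }

    // Runs map, shuffle (grouping by key in a sorted map), and reduce in memory.
    public static List<String> join(List<String> userLines, List<String> deptLines) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String line : userLines) {
            mapUser(line, shuffle);
        }
        for (String line : deptLines) {
            mapDept(line, shuffle);
        }
        List<String> out = new ArrayList<>();
        for (List<String> values : shuffle.values()) {
            reduce(values, out);
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample rows are made up for illustration.
        List<String> users = List.of("1,alice,10", "2,bob,20", "3,carol,10", "4,dave,30");
        List<String> depts = List.of("10,Engineering", "20,Sales");
        join(users, depts).forEach(System.out::println);
    }
}
```

Because the reducer only emits a row when both sides are present, this sketch behaves like an inner join: the user with deptId 30 has no matching department and is dropped. Emitting users with a placeholder when the department list is empty would turn it into a left outer join.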
