Using a left semi join

In this recipe, you will learn how to use a left semi join in Hive.

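In a traditional RDBMS, IN and EXISTS subqueries are widely used to keep only those rows of one table that have a match in another. In Hive, the left semi join serves the same purpose and is used in place of an IN/EXISTS subquery: it returns each row of the left-hand table that has at least one match in the right-hand table.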
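As a minimal sketch of this equivalence, the following assumes two hypothetical tables, customers and orders, which are not part of the recipe. It shows an IN subquery as it might be written in a traditional RDBMS (commented out), followed by a left semi join intended to return the same customers.

```sql
-- Hypothetical tables, introduced only for illustration.
CREATE TABLE IF NOT EXISTS customers (id INT, name STRING);
CREATE TABLE IF NOT EXISTS orders (order_id INT, customer_id INT, amount DOUBLE);

-- Traditional RDBMS style: customers that have placed at least one order.
-- SELECT c.id, c.name
-- FROM customers c
-- WHERE c.id IN (SELECT o.customer_id FROM orders o);

-- The same question expressed as a left semi join.
SELECT c.id, c.name
FROM customers c
LEFT SEMI JOIN orders o ON (c.id = o.customer_id);
```

Unlike an inner join, the semi join returns a customer only once, even if that customer has several matching orders.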

In a left semi join, the right-hand table can be referenced only in the join condition (the ON clause), not in the WHERE or SELECT clause.
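The restriction can be illustrated with a short sketch. It assumes hypothetical tables customers(id, name) and orders(order_id, customer_id, amount), which are not part of the recipe, and it assumes that your Hive version accepts an additional filter on the right-hand table inside the ON clause.

```sql
-- Assumes the hypothetical tables customers(id, name) and
-- orders(order_id, customer_id, amount).

-- Not allowed: orders (o) is the right-hand table, so it cannot appear
-- in the SELECT or WHERE clause.
-- SELECT c.name, o.amount
-- FROM customers c
-- LEFT SEMI JOIN orders o ON (c.id = o.customer_id)
-- WHERE o.amount > 100;

-- Allowed: the right-hand table is referenced only in the ON clause,
-- so the filter on o.amount is moved into the join condition.
SELECT c.id, c.name
FROM customers c
LEFT SEMI JOIN orders o ON (c.id = o.customer_id AND o.amount > 100);
```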

The general syntax of the left semi join is as follows:

  table_reference LEFT SEMI JOIN table_reference join_condition

Where:

  • table_reference: The name of a table used in the join. It can also be a query alias.
  • join_condition: Is the join clause that ...
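To show how a query alias can stand in for a table_reference, here is a hedged sketch. The tables customers(id, name) and orders(order_id, customer_id, amount) and the alias big_orders are hypothetical names chosen for illustration only.

```sql
-- Assumes the hypothetical tables customers(id, name) and
-- orders(order_id, customer_id, amount).
SELECT c.id, c.name
FROM customers c
LEFT SEMI JOIN (
  -- The subquery alias big_orders acts as the right-hand table_reference.
  SELECT customer_id
  FROM orders
  WHERE amount > 100
) big_orders ON (c.id = big_orders.customer_id);
```

Filtering the right-hand side inside the subquery keeps the outer query within the rule that the right-hand table appears only in the join condition.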
