Time for action – performing a join
Joins are a very frequently used tool in SQL, though sometimes appear a little intimidating to those new to the language. Essentially a join allows rows in multiple tables to be logically combined together based on a conditional statement. Hive has rich support for joins which we will now examine.
- Create the following as
join.hql
:SELECT t1.sighted, t2.full_name FROM ufodata t1 JOIN states t2 ON (LOWER(t2.abbreviation) = LOWER(SUBSTR( t1.sighting_location, (LENGTH(t1.sighting_location)-1)))) LIMIT 5 ;
- Execute the query:
$ hive -f join.hql
You will receive the following response:
OK 20060930 Alaska 20051018 Alaska 20050707 Alaska 20100112 Alaska 20100625 Alaska Time taken: 33.255 seconds
What just happened?
The ...
Get Hadoop Beginner's Guide now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.