Using Hive

With Hive installed, we will now import and analyze the UFO data set introduced in Chapter 4, Developing MapReduce Programs.

When importing any new data into Hive, there is generally a three-stage process:

  1. Create the specification of the table into which the data is to be imported.
  2. Import the data into the created table.
  3. Execute HiveQL queries against the table.

This process should look very familiar to anyone with relational database experience. Hive provides a structured query view of our data; to enable that, we must first define the specification of the table's columns and then import the data into the table before we can execute any queries.
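To make the three stages concrete, the following is a minimal sketch that only assembles one HiveQL statement per stage, in the order they would be run. The table name `ufo_sightings`, its columns, and the path `/tmp/ufo.tsv` are hypothetical and chosen purely for illustration; the real UFO table definition may differ.

```python
# Sketch of the three-stage process: build one HiveQL statement per stage.
# ASSUMPTION: the table name, column list, and HDFS path below are
# hypothetical placeholders, not the actual layout of the UFO data set.

TABLE = "ufo_sightings"
COLUMNS = [
    ("sighted", "STRING"),
    ("reported", "STRING"),
    ("sighting_location", "STRING"),
    ("shape", "STRING"),
]
DATA_PATH = "/tmp/ufo.tsv"


def create_table_stmt(table, columns):
    """Stage 1: specify the table into which the data will be imported."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    # Assumes a tab-separated input file, hence the '\t' field delimiter.
    return (
        f"CREATE TABLE {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'"
    )


def load_data_stmt(table, path):
    """Stage 2: import a file that already sits on HDFS into the table."""
    return f"LOAD DATA INPATH '{path}' INTO TABLE {table}"


def count_rows_stmt(table):
    """Stage 3: run a HiveQL query against the populated table."""
    return f"SELECT COUNT(*) FROM {table}"


def three_stage_statements():
    """Return the statements in the order they must be executed."""
    return [
        create_table_stmt(TABLE, COLUMNS),
        load_data_stmt(TABLE, DATA_PATH),
        count_rows_stmt(TABLE),
    ]


for statement in three_stage_statements():
    print(statement + ";")
```

The printed statements could be pasted into the Hive shell one at a time. The order matters: the query in stage 3 has nothing to read until stages 1 and 2 have completed.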

Note

We assume a general level of familiarity with SQL and will be focusing more on how ...
