Time for action – importing data from a raw query

Let's look at an import that uses a raw SQL statement to select the data to be imported.

  1. Delete any existing output directory:
    $ hadoop fs -rmr employees
    
  2. Drop any existing Hive employee table:
    $ hive -e 'drop table employees'
    
  3. Import data using an explicit query:
    $ sqoop import --connect jdbc:mysql://10.0.0.100/hadooptest \
    --username hadoopuser -P \
    --target-dir employees \
    --query 'select first_name, dept, salary,
    timestamp(start_date) as start_date from employees where $CONDITIONS' \
    --hive-import --hive-table employees \
    --map-column-hive start_date=timestamp -m 1
    
  4. Examine the created table:
    $ hive -e "describe employees"
    

    You will receive the following response:

    OK
    first_name  string  
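
The query in step 3 is wrapped in single quotes for a reason: Sqoop expects to find the literal token $CONDITIONS in the statement and replaces it with its own predicate before the query runs. The following sketch shows only the shell-quoting side of this and does not call Sqoop. The variable names are hypothetical, and the always-true predicate is an assumption used for illustration, not output captured from Sqoop:

```shell
# Single quotes stop the shell from expanding $CONDITIONS,
# so the literal token is what gets passed on the command line.
query='select first_name, dept, salary, timestamp(start_date) as start_date from employees where $CONDITIONS'

# With double quotes, the dollar sign has to be escaped to get the same string.
query_dq="select first_name, dept, salary, timestamp(start_date) as start_date from employees where \$CONDITIONS"

# Illustration only: substitute the token by hand with an assumed
# always-true predicate to show the shape of the final statement.
expanded=$(printf '%s\n' "$query" | sed 's/\$CONDITIONS/(1 = 1)/')

echo "$query"
echo "$expanded"
```

If the first echo prints the query with $CONDITIONS still in place, the quoting is right. An unescaped $CONDITIONS inside double quotes would be expanded by the shell, usually to an empty string, leaving the where clause incomplete.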
