Chapter 7. Specialized Connectors

Due to its versatility, Sqoop transfers data from a variety of relational database systems, such as Oracle, MySQL, PostgreSQL, and Microsoft SQL Server, as well as from enterprise data warehouses, such as Netezza and Teradata. While working with these database systems, you may encounter issues specific to a system vendor. This chapter guides you through common installation, connection, and syntax issues.

Overriding Imported boolean Values in PostgreSQL Direct Import

Problem

PostgreSQL direct imports boolean values as TRUE or FALSE strings. If your subsequent processing expects different values, you need to override those defaults.

Solution

Specify the extra parameters --boolean-true-string and --boolean-false-string to override the default value to a different string. For example, to use 0 for false and 1 for true, you could use the following Sqoop command:

sqoop import \
  --connect jdbc:postgresql://postgresql.example.com/database \
  --username sqoop \
  --password sqoop \
  --direct \
  --table table_with_booleans \
  -- \
  --boolean-true-string 1 \
  --boolean-false-string 0

Discussion

The PostgreSQL direct connector uses the COPY (SELECT QUERY) TO STDOUT clause for retrieving data from your database that will by default use the string constants TRUE and FALSE when importing data from Boolean and Bit columns. The PostgreSQL direct connector only supports import and delegates the export to the nondirect JDBC connector. Therefore, both parameters, --boolean-true-string ...

Get Apache Sqoop Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.