Chapter 9. Working with Relational Databases

As we saw in the previous chapter, Hive provides a relational database-like view of the data stored in Hadoop. However, it is not truly a relational database: it does not fully implement the SQL standard, and its performance and scale characteristics are vastly different (not better or worse, just different) from those of a traditional relational database.

In many cases, you will find a Hadoop cluster sitting alongside, and used with (not instead of), relational databases. Business workflows often require data to be moved from one store to the other; we will now explore such integration.

In this chapter, we will:

  • Identify some common Hadoop/RDBMS use cases
  • Explore how ...

