Summary

In its early days, Hadoop was sometimes mistakenly cast as the latest relational database killer. Over time, it has become clear that the more sensible approach is to view it as a complement to RDBMS technologies and that, in fact, the RDBMS community has developed tools, such as SQL, that are also valuable in the Hadoop world.

HiveQL is an implementation of SQL on Hadoop and was the primary focus of this chapter. For HiveQL and its implementations, we covered the following topics:

  • How HiveQL provides a logical model atop data stored in HDFS, in contrast to relational databases, where the table structure is enforced in advance
  • How HiveQL supports many standard SQL data types and commands including joins and views
  • The ...
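
The first point above, a schema projected onto stored data at read time rather than enforced when the data is written, can be illustrated outside Hive with a minimal Python sketch. Everything in it is hypothetical: the RAW_LINES sample data, the SCHEMA column list, and the read_with_schema helper are illustrative names and are not part of Hive or Hadoop. The sketch assumes the common behavior in which a field that is missing or does not match its declared type shows up as a null at query time instead of being rejected at load time.

```python
# Hypothetical raw file contents, standing in for delimited text stored in HDFS.
RAW_LINES = [
    "1,alice,34",
    "2,bob,not_a_number",   # malformed age field
    "3,carol",              # missing age field
]

# Hypothetical schema: (column name, converter), applied only when reading.
SCHEMA = [("id", int), ("name", str), ("age", int)]


def read_with_schema(lines, schema, delimiter=","):
    """Yield one dict per line, projecting the schema onto the raw text.

    Fields that are missing or fail conversion become None instead of
    raising, so nonconforming data is tolerated at read time.
    """
    for line in lines:
        fields = line.split(delimiter)
        row = {}
        for index, (name, convert) in enumerate(schema):
            try:
                row[name] = convert(fields[index])
            except (IndexError, ValueError):
                row[name] = None
        yield row


rows = list(read_with_schema(RAW_LINES, SCHEMA))
for row in rows:
    print(row)
```

The stored lines are never modified or validated up front; swapping in a different SCHEMA changes how the same bytes are interpreted, which is the contrast with a relational database that checks structure when rows are inserted.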

