Chapter 7. Hadoop and SQL

MapReduce is a powerful paradigm that enables complex data processing and can reveal valuable insights. As discussed in earlier chapters, however, it requires a different mindset, along with training and experience in breaking analytics down into a series of map and reduce steps. Several products built atop Hadoop provide higher-level or more familiar views of the data held within HDFS, and Pig is a very popular one. This chapter explores the other most common abstraction implemented atop Hadoop: SQL.

In this chapter, we will cover the following topics:

  • The use cases for SQL on Hadoop and why it is so popular
  • HiveQL, the SQL dialect introduced by Apache Hive
  • Using HiveQL ...
