Chapter 4. Advanced Hive

SQL is a popular data-processing language that has been around for four decades, and many people are already familiar with relational data stores and SQL. A natural step in onboarding more users onto Hadoop is to flatten the learning curve by bringing in concepts they are already well versed in. Hive introduces relational and SQL concepts into Hadoop MapReduce. In the chapter on Pig, you saw the advanced usage of Pig scripts to author MapReduce workflows. In this chapter, we will delve into the advanced usage of Hive.

Apache Hive is often described as a data warehouse infrastructure. Traditionally, business intelligence is gathered from a data warehouse, a database that stores data from many sources within an enterprise. ...
