O'Reilly logo

Real-Time Big Data Analytics by Shilpi Saxena, Sumit Gupta

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Working with Hive tables

In this section, we will discuss the integration of Spark SQL with Hive tables. We will see the process of executing the Hive queries in Spark SQL, which will help us in creating and analyzing Hive tables in HDFS.

Spark SQL provides the flexibility of directly executing Hive queries with our Spark SQL codebase. The best part is that the Hive queries are executed on the Spark cluster and we just require the setup of HDFS for reading and storing the Hive tables. In other words, there is no need to set up a complete Hadoop cluster with services like ResourceManager or NodeManager. We just need services of HDFS, which are available as soon as we start NameNode and DataNode.

Perform the following steps for creating Hive tables ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required