Hive query optimizers

After type checking and semantic analysis of the query, a number of rule-based transformations are applied to optimize it. We will discuss some of these optimizations here. Custom optimizations can be written by implementing the org.apache.hadoop.hive.ql.optimizer.Transform interface. The interface declares a single method, transform, which takes a ParseContext object and returns the ParseContext that results from the transformation. Among other information, the ParseContext object holds the current operator tree.
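To make the shape of such a rule concrete, here is a minimal, self-contained sketch. It does not use Hive's classes: ToyParseContext, ToyTransform, RemoveRedundantSelect, and TransformDemo are hypothetical stand-ins invented for illustration, and the "operator tree" is reduced to a flat list of operator names. A real implementation would implement org.apache.hadoop.hive.ql.optimizer.Transform and work on Hive's own ParseContext.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Hive's ParseContext. It carries only a flat
// list of operator names, not a real operator tree.
final class ToyParseContext {
    private final List<String> operators;

    ToyParseContext(List<String> operators) {
        this.operators = new ArrayList<>(operators);
    }

    List<String> getOperators() {
        return new ArrayList<>(operators);
    }
}

// Hypothetical stand-in with the shape described above:
// one method, context in, transformed context out.
interface ToyTransform {
    ToyParseContext transform(ToyParseContext pctx);
}

// Illustrative rule (an assumption, not a Hive optimization):
// drop a "SEL" operator that directly follows another "SEL".
final class RemoveRedundantSelect implements ToyTransform {
    @Override
    public ToyParseContext transform(ToyParseContext pctx) {
        List<String> rewritten = new ArrayList<>();
        for (String op : pctx.getOperators()) {
            boolean duplicateSelect = "SEL".equals(op)
                    && !rewritten.isEmpty()
                    && "SEL".equals(rewritten.get(rewritten.size() - 1));
            if (!duplicateSelect) {
                rewritten.add(op);
            }
        }
        return new ToyParseContext(rewritten);
    }
}

final class TransformDemo {
    // Apply each rule in order, feeding the output of one into the next.
    static ToyParseContext optimize(ToyParseContext pctx, List<ToyTransform> rules) {
        ToyParseContext current = pctx;
        for (ToyTransform rule : rules) {
            current = rule.transform(current);
        }
        return current;
    }

    public static void main(String[] args) {
        ToyParseContext input =
                new ToyParseContext(Arrays.asList("TS", "SEL", "SEL", "FS"));
        List<ToyTransform> rules = new ArrayList<>();
        rules.add(new RemoveRedundantSelect());
        System.out.println(optimize(input, rules).getOperators());
    }
}
```

Each rule returns a new context rather than mutating its input, which keeps the rules easy to chain and to test in isolation. That is a choice made for this sketch, not a statement about how Hive's own transformations behave.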

The following are a few of the optimizations that ship with Hive 0.13.0:

  • ColumnPruner: The operator tree is walked to determine the minimal set of columns in the base table that are required to fulfill the query. Any additional columns ...
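The column-pruning walk can be illustrated with a toy model. This is a sketch under stated assumptions, not Hive's ColumnPruner: ToyOperator and ToyColumnPruner are hypothetical names, and each operator simply declares which base-table columns it reads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical operator node: a name, the base-table columns it reads,
// and its child operators.
final class ToyOperator {
    final String name;
    final Set<String> columnsUsed;
    final List<ToyOperator> children;

    ToyOperator(String name, List<String> columnsUsed, ToyOperator... children) {
        this.name = name;
        this.columnsUsed = new HashSet<>(columnsUsed);
        this.children = Arrays.asList(children);
    }
}

final class ToyColumnPruner {
    // Walk the whole tree and collect every column that any operator reads.
    private static void collect(ToyOperator op, Set<String> referenced) {
        referenced.addAll(op.columnsUsed);
        for (ToyOperator child : op.children) {
            collect(child, referenced);
        }
    }

    // Keep only the table columns the query needs, in table order.
    static List<String> requiredColumns(List<String> tableColumns, ToyOperator root) {
        Set<String> referenced = new HashSet<>();
        collect(root, referenced);
        List<String> kept = new ArrayList<>();
        for (String column : tableColumns) {
            if (referenced.contains(column)) {
                kept.add(column);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Toy plan for: SELECT name FROM t WHERE age > 30
        ToyOperator select = new ToyOperator("SEL", Arrays.asList("name"));
        ToyOperator filter = new ToyOperator("FIL", Arrays.asList("age"), select);
        ToyOperator scan = new ToyOperator("TS", new ArrayList<String>(), filter);

        List<String> table = Arrays.asList("id", "name", "age", "city");
        System.out.println(requiredColumns(table, scan));
    }
}
```

In this toy plan the scan only needs name and age, so id and city would never have to be read from the table.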
