Advanced aggregation – ROLLUP and CUBE

The ROLLUP statement enables a SELECT statement to calculate multiple levels of aggregations across a specified group of dimensions. The ROLLUP statement is a simple extension to the GROUP BY clause with high efficiency and minimal overhead to a query. Compared to GROUPING SETS that creates specified levels of aggregations, ROLLUP creates n+1 levels of aggregations, where n is the number of grouping columns. First, it calculates the standard aggregate values specified in the GROUP BY clause. Then, it creates higher-level subtotals, moving from right to left through the list of combinations of grouping columns, as shown in the following example:

GROUP BY a,b,c WITH ROLLUP

This is equivalent to the following: ...

Get Apache Hive Essentials now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.