Extending HiveQL
The HiveQL language can be extended by means of plugins and third-party functions. In Hive, there are three types of functions characterized by the number of rows they take as input and produce as output:
- User Defined Functions (UDFs): simple functions that act on one row at a time, producing one output value per input row.
- User Defined Aggregate Functions (UDAFs): take multiple rows as input and generate a single output row for each group. These are aggregate functions to be used in conjunction with a GROUP BY statement (similar to COUNT(), AVG(), MIN(), MAX(), and so on).
- User Defined Table Functions (UDTFs): take a single row as input and generate a logical table comprised of multiple rows that can be used in join expressions.
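The difference between the three kinds of function is easiest to see as a difference in row cardinality. The following is a minimal sketch in plain Java that only mimics those input/output shapes. It does not use the Hive API, and the class and method names (RowShapeDemo, upper, sum, explode) are hypothetical, chosen for illustration. A real Hive function would instead build on Hive's own base classes, for example org.apache.hadoop.hive.ql.exec.UDF or GenericUDTF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: plain Java methods, not Hive API classes.
final class RowShapeDemo {

    // UDF shape: one input row in, one value out.
    static String upper(String value) {
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }

    // UDAF shape: many input rows in, one aggregated value out.
    static long sum(List<Integer> rows) {
        long total = 0;
        for (Integer row : rows) {
            if (row != null) { // skip NULLs, as SQL aggregates such as SUM() do
                total += row;
            }
        }
        return total;
    }

    // UDTF shape: one input row in, zero or more output rows out.
    static List<String> explode(String csv) {
        List<String> out = new ArrayList<>();
        if (csv == null || csv.isEmpty()) {
            return out; // an empty input yields no output rows
        }
        for (String part : csv.split(",")) {
            out.add(part.trim());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upper("hive"));               // prints HIVE
        System.out.println(sum(Arrays.asList(1, 2, 3))); // prints 6
        System.out.println(explode("a, b, c"));          // prints [a, b, c]
    }
}
```

In Hive itself, the engine decides how many rows reach each function: a UDF is called once per row, a UDAF accumulates the rows of each group, and a UDTF emits its output rows back to the query.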
Tip
These APIs are provided only in Java. For other ...