Extensibility Considerations

Although Hive provides many built-in functions, some special use cases need more than what comes out of the box. In these cases, we can extend Hive's functionality in four main areas:

  • User-defined function (UDF): This extends Hive with an external function (mainly written in Java) that can be evaluated in HQL
  • HPL/SQL: This adds procedural language programming support to HQL
  • Streaming: This plugs a user's own custom programs into the data stream
  • SerDe: This stands for serializer and deserializer and provides a way to serialize or deserialize data in a custom file format

In this chapter, we'll talk about each of them in more detail.
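
As a quick preview of the streaming option, here is a minimal sketch of the kind of program that can be plugged into the data stream. It assumes the default streaming behavior, where Hive sends each row to the program's standard input as a tab-separated line and reads tab-separated lines back from its standard output. The script name upper.py, the function transform_line, and the choice to uppercase the first column are hypothetical and used only for illustration.

```python
import sys


def transform_line(line):
    """Uppercase the first column of one tab-separated row."""
    fields = line.rstrip("\n").split("\t")
    # Hive streaming writes NULL as the literal string \N; leave it untouched.
    if fields[0] != "\\N":
        fields[0] = fields[0].upper()
    return "\t".join(fields)


def main():
    # Read rows from standard input and write transformed rows to standard output.
    for line in sys.stdin:
        sys.stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Under the same assumptions, a script like this would be registered with ADD FILE upper.py; and then called with a query such as SELECT TRANSFORM(name, id) USING 'python upper.py' AS (name, id) FROM employee; where the table and column names are placeholders.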
