Avro serialization

Avro is a popular data serialization framework and a project of the Apache Software Foundation. Its key features are as follows:

  • It supports rich data structures for serialization: primitive types as well as complex types such as records, enums, arrays, maps, and unions.
  • It is language-neutral and provides fast, compact binary serialization.
  • Code generation is optional in Avro. Data can be read, written, or used in RPCs without generating classes or code.

Avro uses schemas when reading and writing data. Schemas are what make a compact representation of the serialized object possible. Because the schema describes the data, object-type metadata does not have to accompany the serialized byte stream, as it does in Java serialization. The schemas ...
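
To make the point about compactness concrete, here is a minimal sketch in Python of schema-driven encoding. It is a hand-rolled illustration and not the Avro library API: the User schema, the sample record, and the helper names (encode_long, decode_long, encode_record, decode_record) are all assumptions made for this example, and only the string, int, and long types are handled. It follows the general shape of Avro's binary encoding, where integers are written as zigzag variable-length values, strings as a length followed by UTF-8 bytes, and record fields are written in schema order with no field names or type tags.

```python
import json

# Hypothetical schema for illustration, written in Avro's JSON schema syntax.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age",  "type": "int"}
  ]
}
""")


def encode_long(n):
    """Zigzag, then variable-length encode a signed integer (64-bit range)."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Inverse of encode_long. Returns (value, next_position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not (b & 0x80):
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(schema, record):
    """Write field values in schema order; no names or type tags are emitted."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] == "string":
            data = value.encode("utf-8")
            out += encode_long(len(data)) + data
        elif field["type"] in ("int", "long"):
            out += encode_long(value)
        else:
            raise ValueError("type not handled in this sketch: %r" % field["type"])
    return bytes(out)


def decode_record(schema, buf):
    """Read values back; the schema alone says what each byte run means."""
    pos = 0
    record = {}
    for field in schema["fields"]:
        if field["type"] == "string":
            length, pos = decode_long(buf, pos)
            record[field["name"]] = buf[pos:pos + length].decode("utf-8")
            pos += length
        elif field["type"] in ("int", "long"):
            record[field["name"]], pos = decode_long(buf, pos)
        else:
            raise ValueError("type not handled in this sketch: %r" % field["type"])
    return record


encoded = encode_record(SCHEMA, {"name": "Ann", "age": 30})
decoded = decode_record(SCHEMA, encoded)
```

In this sketch the record is a plain dictionary, so no generated class is involved, and the encoded bytes contain only the field values. A reader without the schema could not interpret them, which is why the schema must be available on both the writing and the reading side.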
