O'Reilly logo

Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools by Deepak Vohra

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

© Deepak Vohra 2016

Deepak Vohra, Practical Hadoop Ecosystem, 10.1007/978-1-4842-2199-0_7

7. Apache Avro

Deepak Vohra

(1)Apt 105, White Rock, British Columbia, Canada

Apache Avro is a compact binary data serialization format providing varied data structures. Avro uses JSON notation schemas to serialize/deserialize data. Avro data is stored in a container file (an .avro file) and its schema (the .avsc file) is stored with the data file. Unlike some other similar systems such as Protocol buffers, Avro does not require code generation and uses dynamic typing. Data is untagged because the schema is accompanied with the data, resulting in a compact data file. Avro supports versioning; different versions (having different columns) of Avro data files may ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required