  • Sagar Mainkar thinks this is interesting:

A key strength of Parquet is its ability to store data that has a deeply nested structure in true columnar fashion. This is important since schemas with several levels of nesting are common in real-world systems. Parquet uses a novel technique for storing nested structures in a flat columnar format with little overhead, which was introduced by Google engineers in the Dremel paper.[86] The result is that even nested fields can be read independently of other fields, resulting in significant performance improvements.
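To make the idea of flattening nested data concrete, here is a minimal sketch in the spirit of the Dremel technique mentioned above. It is not Parquet's actual implementation: the record layout, the field names (`name`, `contacts`, `phone`), and the helper functions `shred_phone` and `assemble_phone` are all hypothetical, chosen only for illustration. The sketch assumes `contacts` is a repeated group and `phone` is an optional field inside it, so the column `contacts.phone` has a maximum repetition level of 1 and a maximum definition level of 2.

```python
# Hypothetical records: 'contacts' is a repeated group, 'phone' is optional.
records = [
    {"name": "alice", "contacts": [{"phone": "555-1"}, {"phone": None}]},
    {"name": "bob", "contacts": []},
    {"name": "carol", "contacts": [{"phone": "555-3"}]},
]


def shred_phone(records):
    """Flatten contacts.phone into (value, repetition, definition) triples."""
    column = []
    for rec in records:
        contacts = rec["contacts"]
        if not contacts:
            # Nothing on the path is defined: definition level 0.
            column.append((None, 0, 0))
            continue
        for i, contact in enumerate(contacts):
            # Repetition level 0 starts a new record; 1 repeats 'contacts'.
            r = 0 if i == 0 else 1
            phone = contact["phone"]
            # Definition level 1: contact exists but phone is missing.
            # Definition level 2: phone itself is present.
            d = 2 if phone is not None else 1
            column.append((phone, r, d))
    return column


def assemble_phone(column):
    """Rebuild the per-record phone lists from the flat column alone."""
    out = []
    for value, r, d in column:
        if r == 0:
            out.append([])  # a new record begins
        if d >= 1:
            out[-1].append(value if d == 2 else None)
    return out


phone_column = shred_phone(records)
phones = assemble_phone(phone_column)
```

Note that `assemble_phone` never looks at the `name` field: the flat column plus its two small integer levels is enough to recover the nested shape, which is what allows a nested field to be read independently of the others. A real Parquet writer additionally encodes the levels compactly and handles arbitrary schemas, which this sketch does not attempt.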


From Hadoop: The Definitive Guide, 4th Edition


Strength of Parquet