Variety

This dimension describes the form and shape of the data. It can be further classified into the following categories:

  • Streaming data:
    • On-wire data format (for example, JSON, MPEG, and Avro)
  • Data at rest:
    • Immutable data (for example, media files and customer invoices)
    • Mutable data (for example, customer details, product inventory, and employee data)
  • Application data:
    • Configuration files, secrets, passwords, and so on
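As a rough illustration, the categories above could be used to tag a data inventory and see how many assets fall into each one. This is only a minimal sketch: the `Variety` enum, the `count_by_variety` helper, and the asset names are hypothetical and are not taken from any Hadoop API.

```python
from collections import Counter
from enum import Enum


class Variety(Enum):
    """Categories from the list above (names are illustrative)."""
    STREAMING = "Streaming data"
    AT_REST_IMMUTABLE = "Data at rest (immutable)"
    AT_REST_MUTABLE = "Data at rest (mutable)"
    APPLICATION = "Application data"


# Hypothetical inventory of (asset name, category) pairs. The asset names
# are made up; the categories mirror the examples in the list.
INVENTORY = [
    ("clickstream-json", Variety.STREAMING),
    ("sensor-avro", Variety.STREAMING),
    ("media-files", Variety.AT_REST_IMMUTABLE),
    ("customer-invoices", Variety.AT_REST_IMMUTABLE),
    ("customer-details", Variety.AT_REST_MUTABLE),
    ("product-inventory", Variety.AT_REST_MUTABLE),
    ("service-config", Variety.APPLICATION),
]


def count_by_variety(inventory):
    """Return how many assets fall into each variety category."""
    return Counter(category for _, category in inventory)


counts = count_by_variety(INVENTORY)
for category in Variety:
    print(f"{category.value}: {counts[category]}")
```

A tally like this gives a quick view of where the variety in a data estate is concentrated.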

An organization should standardize on a small number of technologies to limit the variety of its data. Having many different types of data makes it much harder for an enterprise to manage and consume all of it.
