What is big data?

Big data is a relatively new term which has been gathering steam over the past few years. Big data is a term used for datasets that are relatively large to be stored in a traditional database system or processed by traditional data-processing pipelines. This data could be structured, semi-structured, or unstructured data. The datasets that belong to this category usually scale to terabytes or petabytes of data. Big data usually involves one or more of the following:

  • Velocity: Data moves at an unprecedented speed and must be dealt with it in a timely manner.

For example, online systems, sensors, social media, web clickstream, and so on.

  • Volume: Organizations collect data from a variety of sources, including business transactions, ...

Get Learning Apache Cassandra - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.