Dealing with big data

Big data existed long before the phrase was invented. Banks and stock exchanges, for instance, have been processing billions of transactions daily for years, and airlines run worldwide real-time infrastructures for the operational management of passenger bookings. So, what is big data really? Doug Laney (2001) suggested that big data is defined by three Vs: volume, velocity, and variety. To decide whether your data is big, we can therefore translate these three Vs into the following sub-questions:

  • Volume: Can you store your data in memory?
  • Velocity: Can you process new incoming data with a single machine?
  • Variety: Is your data from a single source?
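The volume question is the easiest of the three to check mechanically. The following is a minimal sketch, not a definitive test: the class name VolumeCheck, the method names, and the 0.5 headroom factor are all illustrative assumptions, and the on-disk size of a file is only a rough proxy, since parsed data often occupies more memory than the raw bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for the volume question; all names here are illustrative. */
final class VolumeCheck {

    /**
     * Returns true if dataBytes fits within the given fraction of heapBytes.
     * usableFraction leaves headroom for the rest of the application.
     */
    static boolean fitsInMemory(long dataBytes, long heapBytes, double usableFraction) {
        if (dataBytes < 0 || heapBytes < 0 || usableFraction <= 0.0 || usableFraction > 1.0) {
            throw new IllegalArgumentException("invalid size or fraction");
        }
        return dataBytes <= (long) (heapBytes * usableFraction);
    }

    /** Compares a file's on-disk size with half of the JVM's maximum heap (assumed headroom). */
    static boolean fileFitsInHeap(Path file) throws IOException {
        long maxHeap = Runtime.getRuntime().maxMemory();
        return fitsInMemory(Files.size(file), maxHeap, 0.5);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
        if (args.length > 0) {
            // The path is supplied by the caller; no particular dataset is assumed.
            Path file = Paths.get(args[0]);
            System.out.println("Fits in memory: " + fileFitsInHeap(file));
        }
    }
}
```

Running the class with a file path as its only argument prints the JVM's maximum heap and whether the file would fit under the assumed headroom. The velocity and variety questions depend on the workload and the data sources, so they are not reduced to a single check here.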

If you answered all of these questions with ...
