More examples

Effective big data statistical projects should be based on focused problem definitions. In other words, it is almost always an advantage to reduce the size of your data source (or the size of the population) so that you can manage and manipulate the data more effectively while still producing meaningful (and correct) results.

The process of sampling, or defining your population, lets you cut down the volume of data you need to physically process or touch. This saves CPU cycles and, more importantly, saves you time. It is also referred to as cutting through the clutter (or noise) that is so often prevalent in big data sources.
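As a rough illustration of the idea, the sketch below first narrows a data source to a population of interest and then draws a simple random sample from it. It is not taken from the book: the synthetic records, the field names `region` and `value`, the helper names `define_population` and `draw_sample`, and the 5 percent sampling fraction are all assumptions made for the example.

```python
import random


def define_population(records, predicate):
    """Keep only the records that belong to the population of interest."""
    return [r for r in records if predicate(r)]


def draw_sample(population, fraction, seed=0):
    """Draw a simple random sample (without replacement) of the given fraction."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = max(1, round(len(population) * fraction)) if population else 0
    return rng.sample(population, k)


# Hypothetical data source: 100,000 synthetic records (an assumption, not real data).
data_rng = random.Random(42)
records = [
    {"region": data_rng.choice(["north", "south", "east", "west"]),
     "value": data_rng.gauss(50, 10)}
    for _ in range(100_000)
]

# Step 1: define the population (here, one region only).
population = define_population(records, lambda r: r["region"] == "north")

# Step 2: sample 5 percent of that population.
sample = draw_sample(population, fraction=0.05, seed=1)

# Compare a summary statistic on the full population and on the sample.
full_mean = sum(r["value"] for r in population) / len(population)
sample_mean = sum(r["value"] for r in sample) / len(sample)

print("records:", len(records))
print("population:", len(population))
print("sample:", len(sample))
print("population mean:", round(full_mean, 2))
print("sample mean:", round(sample_mean, 2))
```

In this sketch the sample is a small fraction of the original records, yet its mean should land close to the mean of the whole population, which is the trade-off the text describes: far less data to touch, with results that remain meaningful.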

Understanding that defining a population to work with on ...
