Chapter 11 The Value of Parallelism

If we assessed the size of the data sets common in any major industry, we would not be surprised at how much data is being created, collected, and stored, even before that data is aggregated into any sort of data warehouse. Large customer databases are routinely accompanied by transaction data sets (e.g., orders, call detail records [CDRs], insurance policies) that are one or two orders of magnitude larger. For example, a telecommunications company can reasonably be expected to log millions, if not billions, of CDRs each day; a data set containing a year’s worth of CDRs can easily exceed a terabyte. Add in party reference data, customer service ...
