Chapter 8 Data Profiling

Finally, we are going to talk about data. Most of the book so far has centered on planning and infrastructure: what goes into the project before you actually start. Now it is time to get our hands dirty, and I really mean it! No matter what any external consultant tells you about the state of your data, there is no excuse not to settle down for a good hard look at your data sets to see whether they really display the characteristics you think they do. This process, a large part of which can be automated, is referred to as data profiling.

The goal of profiling data is to discover metadata when it is not available and to validate metadata when it is available. Data profiling is a process ...
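To make the two goals concrete, here is a minimal sketch in Python of what automated profiling of a single column might look like. It is an illustration only, not a method taken from this book: the function names `profile_column` and `validate_column`, the shape of the `expected` dictionary, and the assumption that raw values arrive as strings (with `None` or an empty string meaning "missing") are all hypothetical choices made for the example.

```python
from collections import Counter


def infer_type(value):
    """Guess the narrowest type a raw string value fits: int, float, or str."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "str"


def profile_column(values):
    """Discover basic metadata for one column of raw string values.

    Assumption: None and the empty string both count as missing.
    """
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "types": dict(Counter(infer_type(v) for v in non_null)),
    }


def validate_column(profile, expected):
    """Compare an observed profile with declared metadata.

    `expected` is a hypothetical dict such as
    {"type": "int", "nullable": False, "unique": True}.
    Returns a list of human-readable discrepancies (empty if none).
    """
    problems = []
    non_null_count = profile["count"] - profile["nulls"]
    if not expected.get("nullable", True) and profile["nulls"] > 0:
        problems.append(f"{profile['nulls']} missing value(s) in a non-nullable column")
    if expected.get("unique", False) and profile["distinct"] < non_null_count:
        problems.append("duplicate values in a column declared unique")
    if "type" in expected:
        for observed, n in profile["types"].items():
            if observed != expected["type"]:
                problems.append(f"{n} value(s) look like {observed}, not {expected['type']}")
    return problems


if __name__ == "__main__":
    raw = ["1", "2", "2", "", "x", None]
    profile = profile_column(raw)
    print(profile)
    for problem in validate_column(profile, {"type": "int", "nullable": False, "unique": True}):
        print("-", problem)
```

When no metadata exists, the dictionary returned by `profile_column` is the discovered metadata. When metadata does exist, `validate_column` reports where the data disagrees with it. A real profiling tool would look at far more than this (value patterns, ranges, cross-column dependencies), so treat the sketch as a starting point.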
