14

DATA CLEANSING

It may be surprising that a book on data quality doesn't talk about “cleansing” until nearly the end of the text. Yet while data cleansing may be the first thing that comes to mind when thinking about data quality, cleansing actually makes up only a small part of the data quality universe. The reason is that if data quality has been treated properly and planned for from the beginning, we should never really have to worry about cleaning data, since the data should already be fit for use by the time the user sees it.

In actuality, though, since we have been largely operating for 40 years without considering data quality a priori, we live in a world where data have been subject to significant entropy and there is a significant ...

