Chapter 10. Scrubbing Data with Non-1NF Tables

We do not always get perfect, clean data, so “data scrubbing” is an important function for a database. If you did not care about data quality, then the answer was always 42, to paraphrase Douglas Noël Adams (1952 to 2001) in the classic Hitchhiker’s Guide to the Galaxy series. Software to extract, transform, and load (ETL) data has become a niche of its own within the software industry, but you can do a lot in SQL itself without special tools.

Data from non-SQL sources tends to arrive with a few common problems. Old file system layouts will have to be reformatted and often split into many tables. Old encodings may have to be updated to current systems; for example, the United States ...

