10.2. Designing Scrubbing Tables

Let’s assume that you are moving data from a file into a working table for scrubbing. What should the target table look like? The usual answer is to make every column NVARCHAR (n), where n is the maximum length allowed by your particular SQL product. This is the most general data type, and it can hold all kinds of garbage. It is as close as you can get in SQL to mimicking a general sequential file.

The real shame about this schema design is that people use it in their actual databases, and not just as a staging area for scrubbing bad data.
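Used strictly as a staging area, the all-NVARCHAR design described above might be sketched as follows. This is only an illustration under stated assumptions: the table name `RawOrders_Staging`, its three columns, the length 255, and the helper functions are all hypothetical, and SQLite (driven here through Python's standard `sqlite3` module) stands in for whatever SQL product you actually use. SQLite accepts the `NVARCHAR(255)` declaration but does not enforce the length, so a real product would need its own maximum in place of 255.

```python
import sqlite3

# Hypothetical staging table: every column is NVARCHAR, so no incoming
# value is rejected for having the wrong type. The length 255 is a
# placeholder for "the maximum your SQL product allows".
STAGING_DDL = """
CREATE TABLE RawOrders_Staging (
    order_nbr  NVARCHAR(255),
    order_date NVARCHAR(255),
    order_amt  NVARCHAR(255)
);
"""


def load_raw_rows(rows):
    """Create an in-memory staging table and load the raw rows as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(STAGING_DDL)
    conn.executemany(
        "INSERT INTO RawOrders_Staging VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def find_bad_amounts(conn):
    """Return rows whose order_amt is missing, empty, or contains any
    character other than a digit or a decimal point."""
    cur = conn.execute(
        """
        SELECT order_nbr, order_amt
          FROM RawOrders_Staging
         WHERE order_amt IS NULL
            OR TRIM(order_amt) = ''
            OR TRIM(order_amt) GLOB '*[^0-9.]*'
         ORDER BY order_nbr
        """
    )
    return cur.fetchall()


# Made-up sample data: one clean row and three that need scrubbing.
SAMPLE_ROWS = [
    ("1001", "2008-01-15", "12.50"),
    ("1002", "15/01/08", "N/A"),
    ("1003", "yesterday", "1,200"),
    ("1004", "2008-02-30", ""),
]

if __name__ == "__main__":
    conn = load_raw_rows(SAMPLE_ROWS)
    for row in find_bad_amounts(conn):
        print(row)
```

Every row loads, garbage included, and the scrubbing happens afterward with queries against the text columns. The check shown is deliberately crude; it only flags stray characters and would not catch a value such as two decimal points in one amount.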

The first question to ask is whether you should be using NVARCHAR (n) or simply VARCHAR (n). If you allow a Unicode character set, you can catch some errors that might not ...

Excerpted from Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL.