O'Reilly logo

Knight's 24-Hour Trainer: Microsoft® SQL Server® 2008 Integration Services by Mike Davis, Devin Knight, Brian Knight

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 26. Handling Bad Data with the Fuzzy Lookup Transform

More often than not when you are working in the real world, data is not going to be perfect like it is in the AdventureWorks database. Real-world situations call for cleansing dirty data or data that has abnormalities like misspellings and truncation.

Imagine you are attempting to retrieve a foreign key from a dimension table, but strangely you find rows without a match. Upon investigation you find bad data is being supplied to you. One technique might be to divert these rows without matches to a table to be dealt with later, or another might be just to add the bad data regardless of misspellings and other mishaps that occur during data entry.

The Fuzzy Lookup Transform, discussed in this lesson, and Fuzzy Grouping Transform, discussed in the next lesson, give another alternative to dealing with dirty data while reducing your number of unmatched rows. The Fuzzy Lookup Transform matches input records with data that has already been cleansed in a reference table. It returns the match and can also indicate the quality of the match. This way you know the likelihood of the match being correct.

A best practice tip is to use the Fuzzy Lookup Transform only after trying a regular lookup on the field. The Fuzzy Lookup Transform is a very expensive operation that builds specialized indexes of the input stream and the reference data for comparison intentions. Therefore, it is recommended to first use a regular Lookup Transform and ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required