4.5. Creating a Custom Slowly Changing Package

After working with a couple of clients, it became obvious that the built-in SCD transformation would handle most, but not all, dimension processing situations, and its limitations have been pointed out earlier.

The question becomes: "How can we build a dimension ETL package that mimics the functionality of the built-in SCD support, but scales to handle high volumes?"

If you are now comfortable with the SSIS data flow features and the SCD concepts covered thus far, then the answer will be surprisingly easily. Before laying out a couple of variations, consider the different aspects of the dimension process:

  • Data correlation between the source records and the dimension table

  • Comparisons between the column values from the source and the dimension tables

  • Updates to handle Type 1 changing records and inferred members

  • Inserts to handle Type 2 historical changes and new dimension members

When considering the built-in SCD support, the SCD transformation (not including the downstream transformations) handles the first two aspects listed here. So, our objective first is to re-create this process by using out-of-the-box features.

4.5.1. Joining Source Data to Dimension Data

The first thing you want to do is correlate the data, comparing the source to the dimension. Within SSIS, there are two primary transformations that can help you correlate data: the Lookup transformation and the Merge Join transformation.

In both cases, you must use the business ...

Get Expert SQL Server™ 2005 Integration Services now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.