8

Fact Table ETL

When it comes to loading data into data warehouses, dimension table ETL is only half the story. The next major aspect of ETL involves fact tables. But the good news is that dimension ETL usually represents the majority of the complexity in the overall warehouse ETL. To be sure, a data warehouse involves more than just dimension and fact ETL (such as data lineage, auditing, and execution precedence). But in terms of business data, dimensions and facts contain the core information.

This chapter focuses on using SSIS to solve the ETL challenges involved with fact table ETL, and addresses the following:

  • The “Problem” section outlines what fact tables are, and the challenge with loading data into them.
  • The “Design” section considers the theory behind fact table ETL, including data mapping, workflow, and precedence. Included in the “Design” section is a discussion of how to identify dimension surrogate keys, measure calculations, fact table updates, and inserts.
  • The “Solution” section applies the SSIS features and demonstrates two complete packages to handle fact table ETL.

Problem

The core of data warehousing is measuring data. In Chapter 7, you learned about the descriptive data (mostly text columns) and entities called dimension tables that allowed easy browsing for attributes and hierarchies. This chapter focuses on the measuring of data (mostly numeric data) by working with entities called fact tables that contain numeric measures used to track the core business ...

Get Microsoft® SQL Server® 2008 Integration Services: Problem-Design-Solution now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.