5.6. Summary

The goal of this chapter is to encourage you to plan your ETL system. To develop a good plan, you need a basic understanding of how an ETL system is built in SQL Server 2005. The main ETL tool is Integration Services, which contains a rich set of functionality for developing and executing an ETL system. This chapter introduces you to Integration Services' features, without going into any details about how exactly to use them.

The first step in developing your ETL system specification is to start from the draft source to target mappings that are created during the modeling process. While you're developing your data model, you must be looking at the source systems in order to validate that your data model is feasible. At this point you haven't done detailed examination of the source systems, but you've identified source tables for your dimensional model, and you've identified some data quality issues. Document these relationships and issues in a high-level map. You'll edit this map throughout the specification process in order to correct your early assumptions.

The next big step is to profile the source system data. We recommend that you purchase a tool to perform data profiling, as it's very tedious to do by hand. The data profiling results are the main set of information that you need to complete the source-to-target mapping on a detailed level, table by table and column by column.

There are several system-wide decisions to make and document: How often will you load ...

Get The Microsoft® Data Warehouse Toolkit: With SQL Server™ 2005 and the Microsoft® Business Intelligence Toolset now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.