Chapter 5. Designing the ETL System

Measure twice; cut once

Some people like to plan, specify, and document systems; most don't. We've observed that Extract, Transformation, and Load (ETL) system development draws folks in the latter category. We've found so few people who write adequate design specifications for their ETL systems that we've practically stopped asking to see our clients' planning documents. Either they're unaware of the impending complexity, or they don't have the planning tools. We intend not to let this happen to you.

In this chapter we begin by providing an overview of SQL Server Integration Services, including a description of the processing and transformation tasks that are available. The purpose of this introductory material is to provide you with enough background about how Integration Services works so you can make informed decisions about the design of your ETL system. The next chapter, "Developing the ETL System," provides more detailed guidance on how to use Integration Services.

The second half of this chapter describes how to develop the specification for your ETL system. Begin with high-level planning; make some system-wide decisions about how to approach various issues; and then perform a thorough analysis of the source data. We'll talk about how to set up the ETL system and staging areas. Finally, you will have gathered enough information and made enough decisions to develop the detailed ETL system specification. Throughout, we will be referring ...

Get The Microsoft® Data Warehouse Toolkit: With SQL Server™ 2005 and the Microsoft® Business Intelligence Toolset now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.