Chapter 5

The Data Flow

WHAT’S IN THIS CHAPTER?

  • Learn about the SSIS Data Flow architecture
  • Reading data out of sources
  • Loading data into destinations
  • Transforming data with common transformations

In the last two chapters you were introduced to the Control Flow tab through tasks and containers. In this chapter, you’ll continue along those lines with an exploration of the Data Flow tab, which is where you will spend most of your time as an SSIS developer. The Data Flow Task is where the bulk of your data heavy lifting occurs in SSIS. This chapter walks you through the transformations in the Data Flow Task, demonstrating how they can help you move and clean your data. You’ll notice a few components (the CDC ones) aren’t covered in this chapter. Those needed more coverage than this chapter had room for and are covered in Chapter 12.

UNDERSTANDING THE DATA FLOW

The SSIS Data Flow is implemented as a logical pipeline, where data flows from one or more sources, through whatever transformations are needed to cleanse and reshape it for its new purpose, and into one or more destinations. The Data Flow does its work primarily in memory, which gives SSIS its strength, allowing the Data Flow to perform faster than any ELT type environment (in most cases) where the data is first loaded into a staging environment and then cleansed with SQL statement.

One of the toughest concepts to understand for a new SSIS developer is the difference between the Control Flow and the Data Flow tabs. Chapter ...

Get Professional Microsoft SQL Server 2012 Integration Services now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.