3.2. Incremental Data Extraction

Incremental extraction is the process of targeting modified records and new records, and then tracking the last completed extraction. For example, as you process rows, you might track the last modified datetime value or last identifier for the rows that you've pulled. Then, the next time you extract from the same source system, you start where you left off, rather than extracting the entire process batch.

How you do incremental extraction depends on the source system design. In fact, certain source systems do not initially provide a mechanism to target changes and new rows. In these cases, sometimes you have the flexibility to make changes on the source system, sometimes you don't. SSIS can handle both scenarios.

Here are a few of the many ways that you may be able to identify changed source records:

  • Use a modified date or created date column from a database source—These are called change identifier columns. In fact, many transactional systems already have change identifier columns that can be used for an incremental extraction. This is probably the most common approach to incremental extraction.

  • Use an auto-incrementing identifier—If the source system doesn't have a modified date or created date, there may be an auto-incrementing column acting as a change identifier, which increases every time a row change happens. You can also use this approach to identify changed records.

  • Use an audit table—Some source systems already have or allow a trigger (or ...

Get Expert SQL Server™ 2005 Integration Services now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.