Chapter 60. Change Data Capture

IN THIS CHAPTER

  • High-end BI ETL

  • Leveraging the T-Log

I know almost nothing about the CDC in Atlanta. The little I do know about the Centers for Disease Control comes from watching Dustin Hoffman in the movie Outbreak. Fortunately for me and you, this chapter is about the other CDC—Change Data Capture.

There's power hidden in the transaction log (T-Log), and Change Data Capture (CDC) harnesses the transaction log to capture data changes with the least possible impact on performance.

Any data written to the transaction log can be asynchronously captured using CDC from the transaction log after the transaction is complete, so it doesn't affect the original transaction's performance. CDC can track any data from the T-Log, including any DML INSERT, UPDATE, DELETE, and MERGE command, and DDL CREATE, ALTER, and DROP.

Changes are stored in change tables—tables created by CDC with the same columns as the tracked tables plus a few extra CDC-specific columns. All the changes are captured, so CDC can return all the intermediate values or just the net changes.

Because CDC gathers its data by reading the log, the data in the change tables is organized the same way the transaction log is organized—by T-log log sequence numbers, known as LSNs. (Kalen Delaney told a joke about Oracle's founder Larry Ellison being inside SQL Server—just look at the transaction log and there's LSN! Ha!)

There are only a few drawbacks to CDC:

  • Cost: It requires Enterprise Edition.

  • Code: Personally, ...

Get Microsoft® SQL Server® 2008 Bible now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.