5.4. Real-Time Loads

A growing number of data warehouses are loaded in real time. Instead of updating the fact and dimension tables through periodic batch processes, a real-time warehouse is updated as new transactions occur on the source system. Because of the additional complexity of this style of load process, real-time loads are reserved for situations where there is a clear business benefit to having up-to-the-minute data in warehouse tables. When base tables are updated in real time, the load process for aggregate tables is affected. Effectively, aggregates must be loaded simultaneously with base tables.

5.4.1. Real-Time Load of the Base Schema

A real-time load of the base schema applies changes and inserts records into the schema as a series of transactions, rather than a batch process. In theory, a single transaction could update operational and warehouse tables simultaneously. However, this could affect the performance or availability of the operational system. Instead, transactions that insert or update the source system are logged, or the base tables are fit with triggers that fire as records are added or changed. These mechanisms in turn feed into a processthat has been designed to load the transaction into the warehouse tables.

The result is that the warehouse load process receives transactions toprocess shortly after they have been applied to the source system. Each transaction must be processed according to the same set of requirements identified earlier in this ...

Get Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.