Chapter 3. The Data Warehouse and Design

There are two major components to building a data warehouse: the design of the interface from operational systems and the design of the data warehouse itself. Yet, the term "design" is not entirely accurate because it suggests that elements can be planned out in advance. The requirements for the data warehouse cannot be known until it is partially populated and in use, and design approaches that have worked in the past will not necessarily suffice in subsequent data warehouses. Data warehouses are constructed in a heuristic manner, where one phase of development depends entirely on the results attained in the previous phase. First, one portion of data is populated. It is then used and scrutinized by the DSS analyst. Next, based on feedback from the end user, the data is modified and/or other data is added. Then another portion of the data warehouse is built, and so forth. This feedback loop continues throughout the entire life of the data warehouse.

Therefore, data warehouses cannot be designed the same way as the classical requirements-driven system. On the other hand, anticipating requirements is still important. Reality lies somewhere in between.

Beginning with Operational Data

Design begins with the question of how data from the operational environment is placed in the data warehouse, and there are many considerations to weigh in doing so.

At the outset, operational transaction-oriented data is locked ...
