Chapter 11

Data Extraction

Abstract

This chapter starts with a brief review of the purpose of the staging area. It then explains in detail how hash functions are used in data warehousing and how they are applied to data, including a discussion of their risks. The purpose and use of load dates and record sources are also explained. The authors demonstrate how to build the staging area (the stage layer) of the data warehouse system and discuss the use of data types and common attributes. Data for the data warehouse is sourced from operational systems, either by loading it directly from operational databases or from flat files. The chapter shows both options and provides best practices for each. It also demonstrates how to source ...

From: Building a Scalable Data Warehouse with Data Vault 2.0