Chapter 6. Master Data Management

"The truth will set you free, but first it will make you miserable."

— Attributed to James A. Garfield

If you are reading this book from start to finish, you should now have the target dimensional model built in the relational database and be ready to start working on the ETL system. One of the big issues you may have struggled with in designing the dimensional model was figuring out where all the attributes in a given conformed dimension were going to come from. In many organizations, there are multiple sources for the same attribute, and multiple versions of the same entity. In addition, you probably identified attributes the business users need in the dimension that only exist in an Excel spreadsheet.

If this describes your source environment, you would do well to deal with these data integration issues before you pull the dimensions into the warehouse. In the broader IT world, this is called master data management. Master data is reference data that is managed centrally for an organization. Master data describes the business entities that participate in the transaction systems. These business entities include people such as customers and employees; places such as warehouses, sales offices, and manufacturing plants; physical entities such as products and assets; and logical entities such as organizational structures and charts of accounts.

Master data sounds a lot like dimensions. Dimension data and master data are closely related, but they are ...

Get The Microsoft® Data Warehouse Toolkit: With SQL Server 2008 R2 and the Microsoft® Business Intelligence Toolset, Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.