You are previewing Dimensional Data Warehousing with MySQL: A Tutorial.
O'Reilly logo
Dimensional Data Warehousing with MySQL: A Tutorial

Book Description

Data warehousing with MySQL, a free and popular database, has never been made easier with this step-by-step tutorial on building dimensional data warehouses. Topics include star-schema modeling, populating (Extract, Transform, and Load: ETL), testing, and dimensional querying. It comes complete with a hands-on case—scaled-down from a real project—as well as an electronic copy of all MySQL scripts and sample data available for download. Computer programmers who need to build a data warehouse will find relevant examples and information written in a thorough, easy-to-follow style.

Table of Contents

  1. Copyright
  2. Introduction
    1. What This Book Is About and What It Is Not
    2. Who This Book Is For
    3. Prerequisite Skills
    4. What You Will Get from This Book
    5. Chapters Overview
    6. Code Download
  3. Fundamentals
    1. Basic Components
      1. The Star Schema
      2. Surrogate Keys
      3. Tasks
      4. Summary
    2. Dimension History
      1. Slowly Changing Dimension Techniques
      2. Slowly Changing Dimension Type 1 (SCD1)
      3. Slowly Changing Dimension Type 2 (SCD2)
      4. Summary
    3. Measure Additivity
      1. Fully-Additive Measures
      2. Testing Fully-Additivity
      3. Summary
    4. Dimensional Queries
      1. Applying Dimensional Queries
      2. Aggregate Queries
      3. Specific Queries
      4. Inside-Out Queries
      5. Summary
  4. Extract, Transform, and Load
    1. Source Extraction
      1. Whole Source or Change Data Capture (CDC)
      2. Pull by Data or Push by Source
      3. Push-by-source CDC on Sales Orders Extraction
      4. Testing
      5. Summary
    2. Populating the Date Dimension
      1. Pre-population
      2. Testing Pre-Population
      3. One-Date-Every-Day
      4. Loading Dates from the Source
      5. Summary
    3. Initial Population
      1. Source Data
      2. Initial Population Script
      3. Running the Initial Population Script
      4. Summary
    4. Regular Population
      1. Identifying Data Sources and Loading Types
      2. Regular Population Script
      3. Testing Data
      4. Summary
    5. Regular Population Scheduling
      1. Preparing the Batch File
      2. Scheduling the Batch Job
      3. Summary
  5. Growth
    1. Adding Columns
      1. Enhancing the Schema
      2. Revising Regular Population Script
      3. Testing
      4. Summary
    2. On-Demand Population
      1. Enhancing the Schema
      2. Performing On-Demand Population
      3. Testing
      4. Summary
    3. Subset Dimensions
      1. Month Roll-up Dimension
      2. Pennsylvania Customer Dimension
      3. Revising the Regular Population
      4. Testing the Revised Regular Population
      5. Summary
    4. Dimension Role Playing
      1. Adding Request Delivery Dates
      2. Table Alias Implementation
      3. Database View
      4. Summary
    5. Snapshots
      1. Periodic Snapshots
      2. Accumulating Snapshots
      3. Preparing Data for Regular Population
      4. Summary
    6. Dimension Hierarchies
      1. Hierarchies in A Data Warehouse
      2. Grouping and Drilling Queries
      3. Summary
    7. Multi-Path and Ragged Hierarchies
      1. Adding A Hierarchy
      2. Adding 2006 Data
      3. Hierarchical Queries
      4. Ragged Hierarchies
      5. Summary
    8. Degenerate Dimensions
      1. Degenerating the Order Dimension
      2. Revising the Regular Population Script
      3. Testing the Revised Regular Population Script
      4. Summary
    9. Junk Dimensions
      1. Adding the Sales Order Attribute Junk Dimension
      2. Revising the Regular Population Script
      3. Summary
    10. Multi-Star Schemas
      1. The New Star Schema
      2. Populating the New Star’s Tables
      3. Testing
      4. Summary
  6. Advanced Techniques
    1. Non-straight Sources
      1. Overview
      2. Revising the Campaign Population Script
      3. Summary
    2. Factless Facts
      1. Product Launch Factless Facts
      2. Revising the DW Regular Population Script
      3. Summary
    3. Late Arrival Facts
      1. Handling Late Arrival Facts
      2. Testing
      3. Summary
    4. Dimension Consolidation
      1. Revising the Data Warehouse Schema
      2. Testing the Zip Code Dimension Implementation
      3. Revising the production_fact Table
      4. Testing the Factory Zip Codes
      5. Revising the Regular Population Script
      6. Testing the Revised Regular Population Script
      7. Revising the Production Regular Population Script
      8. Testing the Revised Production Regular Population Script
      9. Summary
    5. Accumulated Measures
      1. The Revised Schema
      2. Initial Population
      3. Testing the Initial Population Script
      4. Regular Population
      5. Example Queries
      6. Summary
    6. Band Dimensions
      1. Annual Sales Order Star Schema
      2. Initial Population
      3. Revising the Regular Population Script
      4. Testing
      5. Summary
  7. Flat File Data Sources