Data Virtualization for Business Intelligence Systems

Book Description

Data virtualization can help you accomplish your goals with more flexibility and agility. Learn what it is and how and why it should be used with Data Virtualization for Business Intelligence Systems. In this book, expert author Rick van der Lans explains how data virtualization servers work, what techniques to use to optimize access to various data sources, and how these products can be applied in different projects. You’ll learn the difference between this new form of data integration and older forms, such as ETL and replication, and gain a clear understanding of how data virtualization really works. Data Virtualization for Business Intelligence Systems outlines the advantages and disadvantages of data virtualization and illustrates how data virtualization should be applied in data warehouse environments. You’ll come away with a comprehensive understanding of how data virtualization will make data warehouse environments more flexible and how it makes developing operational BI applications easier. Van der Lans also describes the relationship between data virtualization and related topics, such as master data management, governance, and information management, so you come away with a big-picture understanding as well as all the practical know-how you need to virtualize your data.



  • First independent book on data virtualization that explains in a product-independent way how data virtualization technology works.
  • Illustrates concepts using examples developed with commercially available products.
  • Shows you how to solve common data integration challenges such as data quality, system interference, and overall performance by following practical guidelines on using data virtualization.
  • Apply data virtualization right away with three chapters full of practical implementation guidance.
  • Understand the big picture of data virtualization and its relationship with data governance and information management.

Table of Contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. Foreword
  7. Preface
  8. About the Author
  9. Chapter 1. Introduction to Data Virtualization
    1. 1.1 Introduction
    2. 1.2 The World of Business Intelligence Is Changing
    3. 1.3 Introduction to Virtualization
    4. 1.4 What Is Data Virtualization?
    5. 1.5 Data Virtualization and Related Concepts
    6. 1.6 Definition of Data Virtualization
    7. 1.7 Technical Advantages of Data Virtualization
    8. 1.8 Different Implementations of Data Virtualization
    9. 1.9 Overview of Data Virtualization Servers
    10. 1.10 Open versus Closed Data Virtualization Servers
    11. 1.11 Other Forms of Data Integration
    12. 1.12 The Modules of a Data Virtualization Server
    13. 1.13 The History of Data Virtualization
    14. 1.14 The Sample Database: World Class Movies
    15. 1.15 Structure of This Book
  10. Chapter 2. Business Intelligence and Data Warehousing
    1. 2.1 Introduction
    2. 2.2 What Is Business Intelligence?
    3. 2.3 Management Levels and Decision Making
    4. 2.4 Business Intelligence Systems
    5. 2.5 The Data Stores of a Business Intelligence System
    6. 2.6 Normalized Schemas, Star Schemas, and Snowflake Schemas
    7. 2.7 Data Transformation with Extract Transform Load, Extract Load Transform, and Replication
    8. 2.8 Overview of Business Intelligence Architectures
    9. 2.9 New Forms of Reporting and Analytics
    10. 2.10 Disadvantages of Classic Business Intelligence Systems
    11. 2.11 Summary
  11. Chapter 3. Data Virtualization Server: The Building Blocks
    1. 3.1 Introduction
    2. 3.2 The High-Level Architecture of a Data Virtualization Server
    3. 3.3 Importing Source Tables and Defining Wrappers
    4. 3.4 Defining Virtual Tables and Mappings
    5. 3.5 Examples of Virtual Tables and Mappings
    6. 3.6 Virtual Tables and Data Modeling
    7. 3.7 Nesting Virtual Tables and Shared Specifications
    8. 3.8 Importing Nonrelational Data
    9. 3.9 Publishing Virtual Tables
    10. 3.10 The Internal Data Model
    11. 3.11 Updatable Virtual Tables and Transaction Management
  12. Chapter 4. Data Virtualization Server: Management and Security
    1. 4.1 Introduction
    2. 4.2 Impact and Lineage Analysis
    3. 4.3 Synchronization of Source Tables, Wrapper Tables, and Virtual Tables
    4. 4.4 Security of Data: Authentication and Authorization
    5. 4.5 Monitoring, Management, and Administration
  13. Chapter 5. Data Virtualization Server: Caching of Virtual Tables
    1. 5.1 Introduction
    2. 5.2 The Cache of a Virtual Table
    3. 5.3 When to Use Caching
    4. 5.4 Caches versus Data Marts
    5. 5.5 Where Is the Cache Kept?
    6. 5.6 Refreshing Caches
    7. 5.7 Full Refreshing, Incremental Refreshing, and Live Refreshing
    8. 5.8 Online Refreshing and Offline Refreshing
    9. 5.9 Cache Replication
  14. Chapter 6. Data Virtualization Server: Query Optimization Techniques
    1. 6.1 Introduction
    2. 6.2 A Refresher Course on Query Optimization
    3. 6.3 The Ten Stages of Query Processing by a Data Virtualization Server
    4. 6.4 The Intelligence Level of the Data Stores
    5. 6.5 Optimization through Query Substitution
    6. 6.6 Optimization through Pushdown
    7. 6.7 Optimization through Query Expansion (Query Injection)
    8. 6.8 Optimization through Ship Joins
    9. 6.9 Optimization through Sort-Merge Joins
    10. 6.10 Optimization by Caching
    11. 6.11 Optimization and Statistical Data
    12. 6.12 Optimization through Hints
    13. 6.13 Optimization through SQL Override
    14. 6.14 Explaining the Processing Strategy
  15. Chapter 7. Deploying Data Virtualization in Business Intelligence Systems
    1. 7.1 Introduction
    2. 7.2 A Business Intelligence System Based on Data Virtualization
    3. 7.3 Advantages of Deploying Data Virtualization
    4. 7.4 Disadvantages of Deploying Data Virtualization
    5. 7.5 Strategies for Adopting Data Virtualization
    6. 7.6 Application Areas of Data Virtualization
    7. 7.7 Myths on Data Virtualization
  16. Chapter 8. Design Guidelines for Data Virtualization
    1. 8.1 Introduction
    2. 8.2 Incorrect Data and Data Quality
    3. 8.3 Complex and Irregular Data Structures
    4. 8.4 Implementing Transformations in Wrappers or Mappings
    5. 8.5 Analyzing Incorrect Data
    6. 8.6 Different Users and Different Definitions
    7. 8.7 Time Inconsistency of Data
    8. 8.8 Data Stores and Data Transmission
    9. 8.9 Retrieving Data from Production Systems
    10. 8.10 Joining Historical and Operational Data
    11. 8.11 Dealing with Organizational Changes
    12. 8.12 Archiving Data
  17. Chapter 9. Data Virtualization and Service-Oriented Architecture
    1. 9.1 Introduction
    2. 9.2 Service-Oriented Architectures in a Nutshell
    3. 9.3 Basic Services, Composite Services, Business Process Services, and Data Services
    4. 9.4 Developing Data Services with a Data Virtualization Server
    5. 9.5 Developing Composite Services with a Data Virtualization Server
    6. 9.6 Services and the Internal Data Model
  18. Chapter 10. Data Virtualization and Master Data Management
    1. 10.1 Introduction
    2. 10.2 Data Is a Critical Asset for Every Organization
    3. 10.3 The Need for a 360-Degree View of Business Objects
    4. 10.4 What Is Master Data?
    5. 10.5 What Is Master Data Management?
    6. 10.6 A Master Data Management System
    7. 10.7 Master Data Management for Integrating Data
    8. 10.8 Integrating Master Data Management and Data Virtualization
  19. Chapter 11. Data Virtualization, Information Management, and Data Governance
    1. 11.1 Introduction
    2. 11.2 Impact of Data Virtualization on Information Modeling and Database Design
    3. 11.3 Impact of Data Virtualization on Data Profiling
    4. 11.4 Impact of Data Virtualization on Data Cleansing
    5. 11.5 Impact of Data Virtualization on Data Governance
  20. Chapter 12. The Data Delivery Platform—A New Architecture for Business Intelligence Systems
    1. 12.1 Introduction
    2. 12.2 The Data Delivery Platform in a Nutshell
    3. 12.3 The Definition of the Data Delivery Platform
    4. 12.4 The Data Delivery Platform and Other Business Intelligence Architectures
    5. 12.5 The Requirements of the Data Delivery Platform
    6. 12.6 The Data Delivery Platform versus Data Virtualization
    7. 12.7 Explanation of the Name
    8. 12.8 A Personal Note
  21. Chapter 13. The Future of Data Virtualization
    1. 13.1 Introduction
    2. 13.2 The Future of Data Virtualization According to Rick F. van der Lans
    3. 13.3 The Future of Data Virtualization According to David Besemer, CTO of Composite Software
    4. 13.4 The Future of Data Virtualization According to Alberto Pan, CTO of Denodo Technologies
    5. 13.5 The Future of Data Virtualization According to James Markarian, CTO of Informatica Corporation
  22. Bibliography
  23. Index