Leveraging DB2 10 for High Performance of Your Data Warehouse

Book description

Building on the business intelligence (BI) framework and capabilities that are outlined in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813, this IBM® Redbooks® publication focuses on the new business insight challenges that have arisen in the last few years and the new technologies in IBM DB2® 10 for Linux, UNIX, and Windows that provide powerful analytic capabilities to meet those challenges.

This book is organized in to two parts. The first part provides an overview of data warehouse infrastructure and DB2 Warehouse, and outlines the planning and design process for building your data warehouse. The second part covers the major technologies that are available in DB2 10 for Linux, UNIX, and Windows. We focus on functions that help you get the most value and performance from your data warehouse. These technologies include database partitioning, intrapartition parallelism, compression, multidimensional clustering, range (table) partitioning, data movement utilities, database monitoring interfaces, infrastructures for high availability, DB2 workload management, data mining, and relational OLAP capabilities. A chapter on BLU Acceleration gives you all of the details about this exciting DB2 10.5 innovation that simplifies and speeds up reporting and analytics. Easy to set up and self-optimizing, BLU Acceleration eliminates the need for indexes, aggregates, or time-consuming database tuning to achieve top performance and storage efficiency. No SQL or schema changes are required to take advantage of this breakthrough technology.

This book is primarily intended for use by IBM employees, IBM clients, and IBM Business Partners.

Table of contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. Preface
    1. Authors
    2. Now you can become a published author, too!
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  4. Part 1 Overview
  5. Chapter 1. Gaining business insight with IBM DB2
    1. 1.1 Current business challenges
    2. 1.2 Big data and the data warehouse
      1. 1.2.1 Data warehouse infrastructure
    3. 1.3 High performance warehouse with DB2 for Linux, UNIX, and Windows
  6. Chapter 2. Technical overview of IBM DB2 Warehouse
    1. 2.1 DB2 Warehouse solutions
      1. 2.1.1 The component groups
      2. 2.1.2 Editions
    2. 2.2 Expert systems
      1. 2.2.1 PureData for operational analytics
      2. 2.2.2 DB2 Warehouse components
      3. 2.2.3 Database management system
    3. 2.3 DB2 Warehouse topology
  7. Chapter 3. Warehouse development lifecycle
    1. 3.1 Defining business requirements
    2. 3.2 Building the data model
      1. 3.2.1 Defining the physical data model
      2. 3.2.2 Creating the physical data model
      3. 3.2.3 Working with diagrams
      4. 3.2.4 Using the Diagram Editor
      5. 3.2.5 Optimal attributes of the physical data model
      6. 3.2.6 Model analysis
      7. 3.2.7 Deploying the data model
      8. 3.2.8 Maintaining model accuracy
    3. 3.3 Matching the technology to the requirements
  8. Part 2 Technologies
  9. Chapter 4. Column-organized data store with BLU Acceleration
    1. 4.1 Simple and fast with BLU Acceleration
      1. 4.1.1 Dynamic in-memory columnar technology
      2. 4.1.2 Actionable compression
      3. 4.1.3 Parallel vector processing
      4. 4.1.4 Data skipping
    2. 4.2 Getting started with BLU Acceleration
      1. 4.2.1 Capacity planning
      2. 4.2.2 Storage requirements
    3. 4.3 Creating a column-organized data store
      1. 4.3.1 Single configuration setting for analytic workloads
      2. 4.3.2 Fine-tuning a configuration
      3. 4.3.3 Creating column-organized tables
      4. 4.3.4 Converting to column-organized tables
    4. 4.4 Loading a column-organized data store
      1. 4.4.1 Synopsis tables and data skipping
      2. 4.4.2 Compression of column-organized tables
      3. 4.4.3 Reducing load times
    5. 4.5 Managing a column-organized data mart
      1. 4.5.1 Query execution plans and the new CTQ operator
      2. 4.5.2 Database maintenance
  10. Chapter 5. Row-based data store
    1. 5.1 Scale-out solution: Database Partitioning Feature
      1. 5.1.1 Partition
      2. 5.1.2 Partition group
      3. 5.1.3 Table spaces in a DPF environment
      4. 5.1.4 Partitioning keys
      5. 5.1.5 Partition maps
      6. 5.1.6 Choosing partitioning keys
      7. 5.1.7 Concluding remarks
    2. 5.2 Scale-up solution: Intrapartition parallelism
    3. 5.3 Other features for row-based data warehouse
      1. 5.3.1 Compression
      2. 5.3.2 Multidimensional data clustering
      3. 5.3.3 Range partitioning
  11. Chapter 6. Data movement and transformation
    1. 6.1 Introduction
      1. 6.1.1 Load
      2. 6.1.2 Ingest
      3. 6.1.3 Continuous data ingest
    2. 6.2 DB2 Warehouse SQL Warehousing Tool
      1. 6.2.1 Architecture
      2. 6.2.2 Development environment
      3. 6.2.3 Moving from development to production
      4. 6.2.4 Runtime environment
  12. Chapter 7. Monitoring
    1. 7.1 Understanding monitor elements
      1. 7.1.1 Monitor element collection levels
      2. 7.1.2 New monitoring elements for column-organized tables
    2. 7.2 Using DB2 table functions to monitor your database in real time
      1. 7.2.1 Monitoring requests
      2. 7.2.2 Monitoring activities
      3. 7.2.3 Monitoring data objects
      4. 7.2.4 Monitoring locks
      5. 7.2.5 Monitoring system memory
      6. 7.2.6 Monitoring routines
    3. 7.3 Using event monitors to capture information about database events
    4. 7.4 Monitoring your DB2 system performance with IBM InfoSphere Optim Performance Manager
      1. 7.4.1 Information dashboards
      2. 7.4.2 OPM support for DB2 10.5
  13. Chapter 8. High availability
    1. 8.1 IBM PureData System for Operational Analytics
    2. 8.2 Core warehouse availability
      1. 8.2.1 Roving HA group configuration for the administration hosts
      2. 8.2.2 Roving HA group configuration for the data hosts
      3. 8.2.3 Core warehouse HA events monitored by the system console
    3. 8.3 Management host availability
      1. 8.3.1 Management host failover events that are monitored by the system console
    4. 8.4 High availability management
      1. 8.4.1 High availability toolkit
      2. 8.4.2 Starting and stopping resources with the HA toolkit
      3. 8.4.3 Monitoring the status of the core warehouse HA configuration
      4. 8.4.4 Monitoring the status of the management host HA configuration
      5. 8.4.5 Moving database partition resources to the standby node as a planned failover
      6. 8.4.6 Moving resources to the standby management host as a planned failover
  14. Chapter 9. Workload management
    1. 9.1 Introducing DB2 workload management
      1. 9.1.1 Defining business goals
      2. 9.1.2 Identifying work that enters the data server
      3. 9.1.3 Managing work in progress
      4. 9.1.4 Monitoring to ensure that the data server is being used efficiently
    2. 9.2 Workload management administrator authority
    3. 9.3 Workload management concepts
      1. 9.3.1 Workloads
      2. 9.3.2 Work classes and work class sets
      3. 9.3.3 Service classes
      4. 9.3.4 Thresholds
      5. 9.3.5 Threshold actions
      6. 9.3.6 Work actions and work action sets
      7. 9.3.7 Histograms and histogram templates
    4. 9.4 Configuring DB2 workload management in stages
      1. 9.4.1 Default workload management (Stage 0)
      2. 9.4.2 Untuned workload management (Stage 1)
      3. 9.4.3 Tuned workload management (Stage 2)
      4. 9.4.4 Advanced workload management (Stage 3)
      5. 9.4.5 Moving from Stage 0 to Stage 2
    5. 9.5 Workload management dispatcher
    6. 9.6 Default query concurrency management
      1. 9.6.1 Default workload management objects for concurrency control
  15. Chapter 10. Mining and unstructured text analytics
    1. 10.1 Overview
    2. 10.2 Business goals
    3. 10.3 Business scenarios
      1. 10.3.1 In data mining
      2. 10.3.2 In text analytics and unstructured analysis
    4. 10.4 Data mining techniques and process
      1. 10.4.1 Data mining techniques
      2. 10.4.2 Data mining process: preparing, building, and deploying
    5. 10.5 Data mining and text analytics in DB2 Warehouse
      1. 10.5.1 Data mining
      2. 10.5.2 Unstructured analysis
  16. Chapter 11. Providing the analytics
    1. 11.1 Implementing OLAP
      1. 11.1.1 Dimensional model
      2. 11.1.2 Providing OLAP data
      3. 11.1.3 Using OLAP data
      4. 11.1.4 Pulling it all together
    2. 11.2 Cognos and the cube model
      1. 11.2.1 Cognos architecture
      2. 11.2.2 Cognos metadata and Cognos Cube Designer
      3. 11.2.3 Dimensional metadata and dynamic cubes
    3. 11.3 Dynamic Cubes architecture and its lifecycle
      1. 11.3.1 Design and model phase
      2. 11.3.2 Deploy phase
      3. 11.3.3 Run phase
      4. 11.3.4 Optimization
      5. 11.3.5 Cognos Dynamic Cubes caching
      6. 11.3.6 Cognos Dynamic Cubes and the data warehouse
      7. 11.3.7 Cognos Dynamic Cubes and DB2 with BLU Acceleration
    4. 11.4 Resources
  17. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. Help from IBM
  18. Back cover

Product information

  • Title: Leveraging DB2 10 for High Performance of Your Data Warehouse
  • Author(s):
  • Release date: January 2014
  • Publisher(s): IBM Redbooks
  • ISBN: 9780738438979