IBM Information Server: Integration and Governance for Emerging Data Warehouse Demands

Book description

This IBM® Redbooks® publication is intended for business leaders and IT architects who are responsible for building and extending their data warehouse and Business Intelligence infrastructure. It provides an overview of powerful new capabilities of Information Server in the areas of big data, statistical models, data governance, and data quality. The book also provides key technical details that IT professionals can use in solution planning, design, and implementation.

Table of contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. Preface
    1. Authors
    2. Now you can become a published author, too!
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  4. Part 1 Overview and concepts
  5. Chapter 1. Overview of IBM InfoSphere Information Server
    1. 1.1 Packaged Editions
    2. 1.2 Information Server Components
      1. 1.2.1 InfoSphere Blueprint Director
      2. 1.2.2 InfoSphere Discovery
      3. 1.2.3 InfoSphere Metadata Workbench
      4. 1.2.4 InfoSphere Data Architect and IBM Industry Data Models
      5. 1.2.5 InfoSphere Business Glossary
      6. 1.2.6 InfoSphere QualityStage
      7. 1.2.7 InfoSphere Information Analyzer
      8. 1.2.8 InfoSphere Data Quality Console
      9. 1.2.9 InfoSphere Information Services Director
      10. 1.2.10 InfoSphere FastTrack
      11. 1.2.11 InfoSphere DataStage
      12. 1.2.12 InfoSphere DataStage Balanced Optimization
      13. 1.2.13 InfoSphere Change Data Delivery
      14. 1.2.14 InfoSphere Data Click
  6. Chapter 2. Using Information Server to design and implement a Data Warehouse
    1. 2.1 How the capabilities fit together
    2. 2.2 Method and proven practices: Business-driven BI development
    3. 2.3 Phases
    4. 2.4 Information Server components by Phase
      1. 2.4.1 Plan
      2. 2.4.2 Discover
      3. 2.4.3 Analyze
      4. 2.4.4 Define
      5. 2.4.5 Develop
      6. 2.4.6 Deploy
  7. Part 2 Meeting the increasing demands of workloads, users, and the business
  8. Chapter 3. Data Click: Self-Service Data Integration
    1. 3.1 Motivation and overview
      1. 3.1.1 Benefits of Data Click over traditional approaches
      2. 3.1.2 Data Click details
    2. 3.2 The two-click experience for a self-service user
      1. 3.2.1 Running and feedback
      2. 3.2.2 Advanced user configuration
    3. 3.3 Summary and more resources
  9. Chapter 4. Incorporating new sources: Hadoop and big data
    1. 4.1 Big Data File Stage
    2. 4.2 Balanced Optimization
    3. 4.3 Balanced Optimization for Hadoop
      1. 4.3.1 Complete pushdown optimization
      2. 4.3.2 Hybrid pushdown optimization
    4. 4.4 IBM InfoSphere Streams Integration
    5. 4.5 Oozie Workflow Activity stage
    6. 4.6 Unlocking big data
  10. Chapter 5. SPSS: Incorporating Analytical Models into your warehouse environment
    1. 5.1 Analytics background
    2. 5.2 Motivating examples
      1. 5.2.1 A banking example
      2. 5.2.2 A telecom example
      3. 5.2.3 A customer care example
    3. 5.3 End-to-end flow
      1. 5.3.1 Model building by using IBM SPSS Modeler
      2. 5.3.2 Model scoring within IBM InfoSphere DataStage
    4. 5.4 Integrating IBM SPSS Models with external applications
      1. 5.4.1 Publishing a Stream
      2. 5.4.2 Running a Stream
    5. 5.5 Building SPSS Stage in IBM InfoSphere DataStage
      1. 5.5.1 Extending IBM InfoSphere DataStage
      2. 5.5.2 Other features of SPSS stage
    6. 5.6 Summary
  11. Chapter 6. Governance of data warehouse information
    1. 6.1 Information and expectations
      1. 6.1.1 Business drivers
      2. 6.1.2 Using the information
    2. 6.2 Information Governance: The Maturity Model
      1. 6.2.1 Elements of an Information Governance Maturity Model
      2. 6.2.2 Business terms: The language of the business
    3. 6.3 Business terms: Enablers of awareness and communication
      1. 6.3.1 Sources of business terms
      2. 6.3.2 Standard practices in glossary development and deployment
      3. 6.3.3 Examples of glossary categories and terms
    4. 6.4 Information Governance policies and rules
      1. 6.4.1 Definition and management of information policies
      2. 6.4.2 Definition and management of Information Governance rules
      3. 6.4.3 Standard Practices in Information Governance policy and rule development
    5. 6.5 Information stewardship
    6. 6.6 Information Governance for the data warehouse
    7. 6.7 Conclusion
  12. Chapter 7. Establishing trust by ensuring quality
    1. 7.1 Moving to trusted information
      1. 7.1.1 Challenges to trusted information
      2. 7.1.2 Impact of information issues
    2. 7.2 Mission of information quality
      1. 7.2.1 Key information quality steps
    3. 7.3 Understanding information quality
      1. 7.3.1 Data Quality Assessment
      2. 7.3.2 Expanding on the initial assessment
    4. 7.4 Validating data with rules for information quality
      1. 7.4.1 Incorporating business value and objectives
      2. 7.4.2 Defining the primary requirements
      3. 7.4.3 Designing the data rules
      4. 7.4.4 Example of data rule analysis
      5. 7.4.5 Setting priorities and refining conditions
      6. 7.4.6 Types of Data Rules
      7. 7.4.7 Examples of rules
      8. 7.4.8 Considerations in Data Rule design
      9. 7.4.9 Breaking requirements into building blocks for Data Rules
      10. 7.4.10 Evaluating Data Rule results
    5. 7.5 Measuring and monitoring information quality
      1. 7.5.1 Establishing priorities
      2. 7.5.2 Setting objectives and benchmarks
      3. 7.5.3 Developing the monitoring process
      4. 7.5.4 Implementing the monitoring process
    6. 7.6 Information Quality Management
      1. 7.6.1 Lifecycle and deploying Data Rules
      2. 7.6.2 Publishing data rules for reuse
      3. 7.6.3 Deploying data rules to production
      4. 7.6.4 Running Data Rules in production
      5. 7.6.5 Delivering and managing data quality results
      6. 7.6.6 Developing Information Quality reports
      7. 7.6.7 Retaining and archiving old results
      8. 7.6.8 Improving ongoing processes
    7. 7.7 Conclusion
  13. Chapter 8. Data standardization and matching
    1. 8.1 Use cases
      1. 8.1.1 Conditioning and standardization
      2. 8.1.2 Address verification
      3. 8.1.3 Matching and de-duplication
      4. 8.1.4 Consolidation and enrichment
      5. 8.1.5 Summary
  14. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. Help from IBM
  15. Back cover

Product information

  • Title: IBM Information Server: Integration and Governance for Emerging Data Warehouse Demands
  • Author(s): Chuck Ballard, Manish Bhide, Holger Kache, Bob Kitzberger, Beate Porst, Yeh-Heng Sheng, Harald C. Smith
  • Release date: July 2013
  • Publisher(s): IBM Redbooks
  • ISBN: None