Information Governance Principles and Practices for a Big Data Landscape

Book Description

This IBM® Redbooks® publication describes how the IBM Big Data Platform provides the integrated capabilities that are required for the adoption of Information Governance in the big data landscape.

As organizations embark on new use cases, such as Big Data Exploration, an enhanced 360° view of customers, or Data Warehouse modernization, they absorb ever-growing volumes and varieties of data at accelerating velocity. The principles and practices of Information Governance therefore become ever more critical to ensure trust in data and to help organizations overcome the inherent risks and achieve the value they seek.

The introduction of big data changes the information landscape. Data arrives faster than humans can react to it, and issues can quickly escalate into significant events. The variety of data now poses new privacy and security risks. With high volumes of information spread across many places, it becomes harder to locate these issues and risks, and harder to find the useful information that can drive new value and revenue.

Information Governance provides an organization with a framework that can align its desired outcomes with its strategic management principles, the people who can implement those principles, and the architecture and platform that are needed to support the big data use cases. The IBM Big Data Platform, coupled with a framework for Information Governance, provides an approach to build, manage, and gain significant value from the big data landscape.

Table of Contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. Preface
    1. Authors
    2. Now you can become a published author, too!
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  4. Chapter 1. Introducing big data
    1. 1.1 What big data is
      1. 1.1.1 Origins of big data
    2. 1.2 Dimensions of big data
      1. 1.2.1 How big is big data: Volume
      2. 1.2.2 Variety
      3. 1.2.3 Velocity
      4. 1.2.4 Veracity: Can data be trusted
      5. 1.2.5 Value: The key driver
    3. 1.3 What big data looks like
      1. 1.3.1 Social media
      2. 1.3.2 Web logs
      3. 1.3.3 Machine-generated data
      4. 1.3.4 GPS and spatial data
      5. 1.3.5 Streaming data
    4. 1.4 Information Governance and big data
      1. 1.4.1 Metadata management
      2. 1.4.2 Security and privacy
      3. 1.4.3 Data integration and data quality
      4. 1.4.4 Master data management
  5. Chapter 2. Information Governance foundations for big data
    1. 2.1 Evolving to Information Governance
    2. 2.2 IBM Information Governance Capability Maturity Model
      1. 2.2.1 Outcomes
      2. 2.2.2 Enablers
      3. 2.2.3 Core disciplines
      4. 2.2.4 Supporting disciplines
  6. Chapter 3. Big Data Information Governance principles
    1. 3.1 Root principle for Big Data Information Governance
    2. 3.2 Leading from principle
      1. 3.2.1 Speed versus quality
    3. 3.3 Core principles for Big Data Information Governance
    4. 3.4 Practical application examples
  7. Chapter 4. Big data use cases
    1. 4.1 Emerging big data use cases
    2. 4.2 Big data exploration
      1. 4.2.1 Identification of value
      2. 4.2.2 Enablement of Data Science teams
    3. 4.3 Enhanced 360° view of the customer
      1. 4.3.1 Expanding the range of customer-related data
      2. 4.3.2 Personalized Customer Engagements
      3. 4.3.3 Micro-market Campaign Management
      4. 4.3.4 Customer retention
      5. 4.3.5 Real-time demand forecasts
    4. 4.4 Security and Intelligence extensions
      1. 4.4.1 Enhancing traditional security through analytics
      2. 4.4.2 Network threat prediction and prevention
      3. 4.4.3 Enhanced surveillance insight
      4. 4.4.4 Crime prediction and protection
    5. 4.5 Operations analysis
      1. 4.5.1 Traffic management
      2. 4.5.2 Environmental monitoring and assessment
      3. 4.5.3 Predictive Maintenance
    6. 4.6 Data Warehouse modernization
      1. 4.6.1 Pre-processing hub
      2. 4.6.2 Queryable archive
      3. 4.6.3 Exploratory analysis
  8. Chapter 5. Big data reference architecture
    1. 5.1 Traditional information landscape
    2. 5.2 The big data information landscape
      1. 5.2.1 Capabilities for the new information landscape
  9. Chapter 6. Introduction to the IBM Big Data Platform
    1. 6.1 Components of the IBM Big Data Platform
    2. 6.2 The Data Warehouse
      1. 6.2.1 DB2
      2. 6.2.2 IBM PureData for Operational Analytics
      3. 6.2.3 IBM PureData System for Analytics
    3. 6.3 Stream computing
      1. 6.3.1 IBM InfoSphere Streams
    4. 6.4 Apache Hadoop and related big data architectures
      1. 6.4.1 IBM InfoSphere BigInsights
    5. 6.5 Information Integration and Governance
      1. 6.5.1 IBM InfoSphere Information Server
      2. 6.5.2 IBM InfoSphere Data Replication
      3. 6.5.3 IBM InfoSphere Federation Server
      4. 6.5.4 IBM InfoSphere Master Data Management
      5. 6.5.5 IBM InfoSphere Optim
      6. 6.5.6 IBM InfoSphere Guardium
    6. 6.6 Big Data Accelerators
      1. 6.6.1 IBM Accelerators for Big Data
      2. 6.6.2 IBM Industry Models
    7. 6.7 Data visualization
      1. 6.7.1 IBM InfoSphere Data Explorer
    8. 6.8 The IBM Big Data Platform and the reference architecture
  10. Chapter 7. Security and privacy
    1. 7.1 Why big data is different
    2. 7.2 Information security defined
      1. 7.2.1 Security framework
    3. 7.3 Data privacy defined
      1. 7.3.1 What sensitive data is
      2. 7.3.2 Privacy operational structure
    4. 7.4 How security and privacy intersect
      1. 7.4.1 Implications and suggestions for big data
      2. 7.4.2 Fit-for-purpose security and privacy
    5. 7.5 Big data usage and adoption phases
      1. 7.5.1 Exploration phase
      2. 7.5.2 Prepare and Govern Phase (Assess and Protect)
      3. 7.5.3 Inventorying and classifying sensitive data
      4. 7.5.4 Consumption Phase (Sustain)
    6. 7.6 Summary
  11. Chapter 8. Information Quality and big data
    1. 8.1 Information quality and information governance
    2. 8.2 Exploring big data content
      1. 8.2.1 Knowing your data
      2. 8.2.2 Call detail records
      3. 8.2.3 Sensor data
      4. 8.2.4 Machine data
      5. 8.2.5 Social media data
    3. 8.3 Understanding big data
      1. 8.3.1 Big Data Exploration
    4. 8.4 Standardizing, measuring, and monitoring quality in big data
      1. 8.4.1 Fit for purpose
      2. 8.4.2 Techniques for Information Quality Management
      3. 8.4.3 Governance and trust in big data
  12. Chapter 9. Enhanced 360° view of the customer
    1. 9.1 Master data management: An overview
      1. 9.1.1 Getting a handle on enterprise master data
      2. 9.1.2 InfoSphere Data Explorer and MDM
    2. 9.2 Governing master data in a big data environment
  13. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. Help from IBM
  14. Back cover