Data Strategy

Book Description

The definitive best-practices guide to enterprise data-management strategy.

You can no longer manage enterprise data "piecemeal." To maximize the business value of your data assets, you must define a coherent, enterprise-wide data strategy that reflects all the ways you capture, store, manage, and use information.

In this book, three renowned data management experts walk you through creating the optimal data strategy for your organization. Using their proven techniques, you can reduce hardware and maintenance costs and rein in out-of-control data spending. You can build new systems with less risk and higher quality while improving data access. Best of all, you can learn how to integrate new applications that support your key business objectives.

Drawing on real enterprise case studies and proven best practices, the author team covers everything from goal-setting through managing security and performance. You'll learn how to:

  • Identify the real risks and bottlenecks you face in delivering data—and the right solutions

  • Integrate enterprise data and improve its quality, so it can be used more widely and effectively

  • Systematically secure enterprise data and protect customer privacy

  • Model data more effectively and take full advantage of metadata

  • Choose the DBMS and data storage products that fit best into your overall plan

  • Smoothly accommodate new Business Intelligence (BI) and unstructured data applications

  • Improve the performance of your enterprise database applications

  • Revamp your organization to streamline day-to-day data management and reduce cost

Data Strategy is indispensable for everyone who needs to manage enterprise data more efficiently, from database architects to DBAs and from technical staff to senior IT decision-makers.

© Copyright Pearson Education. All rights reserved.

Table of Contents

  1. Copyright
    1. Dedication
  2. Acknowledgments
  3. About the Authors
  4. Foreword
  5. 1. Introduction
    1. Current Status in Contemporary Organizations
    2. Why a Data Strategy Is Needed
      1. Value of Data as an Organizational Asset
    3. Vision and Goals of the Enterprise
      1. Support of the IT Strategy
    4. Components of a Data Strategy
      1. Data Integration
      2. Data Quality
      3. Metadata
      4. Data Modeling
      5. Organizational Roles and Responsibilities
      6. Performance and Measurement
      7. Security and Privacy
      8. DBMS Selection
      9. Business Intelligence
      10. Unstructured Data
      11. Business Value of Data and ROI
    5. How Will You Develop and Implement a Data Strategy?
      1. Data Environment Assessment
    6. References
  6. 2. Data Integration
    1. Ineffective “Silver-Bullet” Technology Solutions
      1. Enterprise Resource Planning (ERP)
      2. Data Warehousing (DW)
      3. Customer Relationship Management (CRM)
      4. Enterprise Application Integration (EAI)
    2. Gaining Management Support
      1. Business Case for Data Integration
    3. Integrating Business Data
      1. Know Your Business Entities
      2. Mergers and Acquisitions
      3. Data Redundancy
      4. Data Lineage
      5. Multiple DBMSs and Their Impact
    4. Deciding What Data Should Be Integrated
      1. Data Integration Prioritization
      2. Risks of Data Integration
    5. Consolidation and Federation
      1. Data Consolidation
      2. Data Federation
      3. Data Integration Strategy Capability Maturity Model
    6. Getting Started
    7. Conclusion
    8. References
      1. Bibliography
  7. 3. Data Quality
    1. Current State of Data Quality
    2. Recognizing Dirty Data
    3. Data Quality Rules
      1. Business Entity Rules
      2. Business Attribute Rules
      3. Data Dependency Rules
      4. Data Validity Rules
    4. Data Quality Improvement Practices
      1. Data Profiling
      2. Data Cleansing
      3. Data Defect Prevention
    5. Enterprise-Wide Data Quality Disciplines
      1. Data Quality Maturity Levels
      2. Standards and Guidelines
      3. Development Methodology
      4. Data Naming and Abbreviations
      5. Metadata
      6. Data Modeling
      7. Data Quality
      8. Testing
      9. Reconciliation
      10. Security
      11. Data Quality Metrics
    6. Enterprise Architecture
      1. Data Quality Improvement Process
    7. Business Sponsorship
      1. Business Responsibility for Data Quality
    8. Conclusion
    9. References
      1. Bibliography
  8. 4. Metadata
    1. Why Metadata Is Critical to the Business
      1. Metadata as the Keystone
      2. Management Support for Metadata
      3. Starting a Metadata Management Initiative
    2. Metadata Categories
      1. Business Metadata
      2. Technical Metadata
      3. Process Metadata
      4. Usage Metadata
    3. Metadata Sources
    4. Metadata Repository
      1. Buying a Metadata Repository Product
      2. Building a Metadata Repository
      3. Centralized Metadata Repository
      4. Distributed Metadata Repository
      5. XML-Enabled Metadata Repository
    5. Developing a Metadata Repository
      1. Justification
      2. Planning
      3. Analysis
      4. Design
      5. Construction
      6. Deployment
    6. Managed Metadata Environment
      1. Metadata Sourcing
      2. Metadata Integration
      3. Metadata Management
      4. Metadata Marts
      5. Metadata Delivery
      6. Communicating and Selling Metadata
    7. Conclusion
    8. References
      1. Bibliography
  9. 5. Data Modeling
    1. Origins of Data Modeling
    2. Significance of Data Modeling
    3. Logical Data Modeling Concepts
      1. Process-Independence
      2. Business-Focused Data Analysis
      3. Data Integration (Single Version of Truth)
      4. Data Quality
    4. Enterprise Logical Data Model
      1. Big-Bang Versus Incremental
        1. Big-Bang Pros and Cons
        2. Incremental Pros and Cons
      2. Top-Down Versus Bottom-Up
        1. Top-Down Logical Data Modeling
        2. Bottom-Up Logical Data Modeling
    5. Physical Data Modeling Concepts
      1. Process-Dependence
      2. Database Design
    6. Physical Data Modeling Techniques
      1. Denormalization
      2. Surrogate Keys
      3. Indexing
      4. Partitioning
      5. Database Views
    7. Dimensionality
      1. Star Schema
      2. Snowflake
      3. Starflake
    8. Factors that Influence the Physical Data Model
      1. Guideline 1: High Degree of Normalization for Robustness
      2. Guideline 2: Denormalization for Short-Term Solutions
      3. Guideline 3: Usage of Views on Powerful Servers
      4. Guideline 4: Usage of Views on Powerful RDBMS Software
      5. Guideline 5: Cultural Influence on Database Design
      6. Guideline 6: Modeling Expertise Affects Database Design
      7. Guideline 7: User-Friendly Structures
      8. Guideline 8: Metric Facts Determine Database Design
      9. Guideline 9: When to Mimic Source Database Design
    9. Conclusion
    10. References
      1. Bibliography
  10. 6. Organizational Roles and Responsibilities
    1. Building the Teams Who Create and Maintain the Strategy
    2. Resistance to Change
      1. Existing Organization
      2. Resistance to Standards
      3. “Reasons” for Resistance
    3. Optimal Organizational Structures
      1. Distributed Organizations
      2. Outsourced Personnel
    4. Training
      1. Who Should Attend
      2. Mindset
      3. Choice of Class
      4. Timing
    5. Roles and Responsibilities
      1. Data Strategist
      2. Database Administrator
      3. Data Administrator
      4. Metadata Administrator
      5. Data Quality Steward
      6. Consultants and Contractors
      7. Security Officer
      8. Sharing Data
      9. Strategic Data Architect
      10. Technical Services
    6. Data Ownership
      1. Domains
      2. Security and Privacy
      3. Availability Requirements
      4. Timeliness and Periodicity Requirements
      5. Performance Requirements
      6. Data Quality Requirements
      7. Business Rules
    7. Information Stewardship
      1. Steward Deliverables
      2. Key Skills and Competencies
        1. Time Commitment
    8. Worst Practices
    9. Agenda for Weekly Data Strategy Team Meeting
    10. Conclusion
  11. 7. Performance
    1. Performance Requirements
    2. Service Level Agreements
      1. Response Time
    3. Capacity Planning: Performance Modeling
    4. Capacity Planning: Benchmarks
      1. Why Pursue a Benchmark?
      2. Benchmark Team
      3. Benefits of a Good Benchmark: Goals and Objectives
      4. Problems with “Standard” Benchmarks
      5. The Cost of Running a Benchmark
      6. Identifying and Securing Data
      7. Establishing Benchmark Criteria and Methodology
        1. Data Volume
        2. System Configuration
        3. Actual Test Data, Actual Queries
        4. Establishing Clear Success Criteria
        5. Availability
        6. Load Time
      8. Evaluating and Measuring Results
      9. Verifying and Reconciling Results
      10. Communicating Results Effectively
    5. Application Packages: Enterprise Resource Planning (ERPs)
    6. Designing, Coding, and Implementing
      1. Designing
      2. Coding
      3. Implementation
      4. Design Reviews
        1. Tables
        2. Queries and Reports
        3. Reporting
        4. Testing
        5. Operations
        6. Organization
        7. Communication
    7. Setting User Expectations
    8. Monitoring (Measurement)
      1. Conformance to Measures of Success
      2. Types of Metrics
      3. Responsibility for Measurement
      4. Means to Measure
      5. Use of Measurements
      6. Return on Investment (ROI)
      7. Reporting Results to Management
    9. Tuning
      1. Tuning Options
      2. Reporting Performance Results
      3. Selling Management on Performance
    10. Case Studies
    11. Performance Tasks
    12. Conclusion
    13. References
      1. Bibliography
  12. 8. Security and Privacy of Data
    1. Data Identification for Security and Privacy
      1. User Role
    2. Roles and Responsibilities
      1. Security Officer
      2. Data Owner
      3. System Administrator
    3. Regulatory Compliance
    4. Auditing Procedures
      1. Security Audits
      2. External Users of Your Data
    5. Design Solutions
      1. Database Controls
      2. Security Databases
      3. Test and Production Data
      4. Data Encryption
      5. Standards for Data Usage
    6. Impact of the Data Warehouse
    7. Vendor Issues
      1. Software
      2. External Data
    8. Communicating and Selling Security
      1. Security and Privacy Indoctrination
      2. Monitoring Employees
      3. Training
      4. Communication
    9. Best Practices and Worst Practices
    10. Identify Your Own Sensitive Data Exercise
    11. Conclusion
  13. 9. DBMS Selection
    1. Existing Environment
      1. Capabilities and Functions
    2. DBMS Choices
    3. Why Standardize the DBMS?
      1. Integration Problems
      2. Greater Staff Expense
      3. Software Expense
    4. Total Cost of Ownership
      1. Hardware
      2. Network Usage
      3. DBMS
      4. Consultants and Contractors
      5. Internal Staff
      6. Help Desk Support
      7. Operations and System Administration
      8. IT Training
    5. Application Packages and ERPs
    6. Criteria for Selection
    7. Selection Process
    8. Reference Checking
      1. Alternatives to Reference Checking
      2. Selecting and Gathering References
      3. Desired Types of References
      4. The Process of Reference Checking
      5. Questions to Ask
    9. RFPs for DBMSs
      1. RFP Best Practices
    10. Response Format
    11. Evaluating Vendors
    12. Dealing with the Vendor
      1. Performance
      2. Vendor's Level of Service
      3. Early Code
      4. Rules of Engagement
      5. Set the Agenda for Meetings and Presentations
      6. Professional Employee Information
      7. Financial Information
      8. Selection Matrix: Categorize Capabilities and Functions
        1. Nice-to-Have Capabilities and Functions
    13. Exercise—How Well Are You Using Your DBMS?
    14. Conclusion
    15. References
      1. Bibliography
  14. 10. Business Intelligence
    1. What Is Business Intelligence?
      1. A Brief History
      2. Importance of BI
    2. BI Components
      1. Data Warehouse
      2. Metadata Repository
      3. Data Transformation and Cleansing
      4. OLAP and Analytics
      5. Data Presentation and Visualization
    3. Important BI Tools and Processes
      1. Data Mining
      2. Rule-Based Analytics
      3. Balanced Scorecard
      4. Digital Dashboard
    4. Emerging Trends and Technologies
      1. Mining Structured and Unstructured Data
      2. Radio Frequency Identification
    5. BI Myths and Pitfalls
    6. Conclusion
    7. References
      1. Bibliography
  15. 11. Strategies for Managing Unstructured Data
    1. What Is Unstructured Data?
      1. A Brief History
      2. Why Now?
      3. Current State of Unstructured Data in Organizations
    2. A Unified Content Strategy for the Organization
      1. Definition of a Unified Content Strategy
      2. Storage and Administration
        1. Archiving
        2. Retention
      3. Content Reusability
      4. Search and Delivery
      5. Combining Structured and Unstructured Data
    3. Emerging Technologies
      1. Digital Asset Management Software
      2. Digital Rights Management Software
      3. Electronic Medical Records
    4. Conclusion
    5. References
      1. Bibliography
  16. 12. Business Value of Data and ROI
    1. The Business Value of Data
      1. Companies that Sell Customer Data
      2. Internal Information Gathered About Customers
      3. Call Center Data
      4. Click-Stream Data
      5. Demographics
      6. Channel Preferences
      7. Direct Retailers
      8. Loyalty Cards
      9. Travel Data
    2. Align Data with Strategic Goals
      1. ROI Process
    3. The Cost of Developing a Data Strategy
      1. Data Warehouse
      2. Hardware
      3. Software
      4. Personnel Costs
        1. Internal Staff
        2. Consultants and Contractors
      5. Training
      6. Operations and System Administration
      7. Total Cost of Ownership
    4. Benefits of a Data Strategy
      1. The Data Warehouse
      2. Estimating Tangible Benefits
        1. Revenue Enhancement
        2. Cash Flow Acceleration
        3. Analyst Productivity
        4. Cost Containment
        5. Demand Chain Management
        6. Fraud Reduction
        7. Customer Conversion Rates
        8. Customer Attrition and Retention Rates
        9. Marketing Campaign Selection and Response Rates
        10. Better Relationships with Suppliers and Customers
        11. Data Mart Consolidation
      3. Estimating Intangible Benefits
        1. Public Relations, Reputation, and Impact on Shareholders
        2. Competitive Effectiveness
        3. Better and Faster Decisions
        4. Better Customer Service
        5. Employee Empowerment
      4. Post-Implementation Benefits Measurement
    5. Conclusion
    6. Reference
      1. Bibliography
  17. A. ROI Calculation Process, Cost Template, and Intangible Benefits Template
    1. Cost of Capital
    2. Risk
    3. ROI Example
      1. Net Present Value
      2. Internal Rate of Return
      3. Payback Period
    4. Cost Calculation Template
    5. Intangible Benefits Calculation Template
    6. Reference
      1. Bibliography
  18. B. Resources
    1. Publications
      1. Bibliography
    2. Websites
      1. Bibliography