Data Warehousing in the Age of Big Data

Book Description

Data Warehousing in the Age of Big Data will help you and your organization make the most of unstructured data with your existing data warehouse.

As Big Data continues to revolutionize how we use data, it doesn't have to create more confusion. Expert author Krish Krishnan helps you make sense of how Big Data fits into the world of data warehousing in clear and concise detail. The book is presented in three distinct parts. Part 1 discusses Big Data, its technologies and use cases from early adopters. Part 2 addresses data warehousing, its shortcomings, and new architecture options, workloads, and integration techniques for Big Data and the data warehouse. Part 3 deals with data governance, data visualization, information life-cycle management, data scientists, and implementing a Big Data–ready data warehouse. Extensive appendixes include case studies from vendor implementations and a special segment on how we can build a healthcare information factory.

Ultimately, this book will help you navigate the complex layers of Big Data and data warehousing, and show you how to think effectively about using these technologies and architectures to design the next-generation data warehouse.

  • Learn how to leverage Big Data by effectively integrating it into your data warehouse
  • Explore real-world examples and use cases that clearly demonstrate Hadoop, NoSQL, HBase, Hive, and other Big Data technologies
  • Understand how to optimize and tune your current data warehouse infrastructure, and how to integrate newer infrastructure that matches your data processing workloads and requirements

Table of Contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. Acknowledgments
  7. About the Author
  8. Introduction
    1. Part 1: Big Data
    2. Part 2: The Data Warehousing
    3. Part 3: Building the Big Data – Data Warehouse
    4. Appendixes
    5. Companion website
  9. Part 1: Big Data
    1. Chapter 1. Introduction to Big Data
      1. Introduction
      2. Big Data
      3. Defining Big Data
      4. Why Big Data and why now?
      5. Big Data example
      6. Summary
      7. Further reading
    2. Chapter 2. Working with Big Data
      1. Introduction
      2. Data explosion
      3. Data volume
      4. Data velocity
      5. Data variety
      6. Summary
    3. Chapter 3. Big Data Processing Architectures
      1. Introduction
      2. Data processing revisited
      3. Data processing techniques
      4. Data processing infrastructure challenges
      5. Shared-everything and shared-nothing architectures
      6. Big Data processing
      7. Telco Big Data study
    4. Chapter 4. Introducing Big Data Technologies
      1. Introduction
      2. Distributed data processing
      3. Big Data processing requirements
      4. Technologies for Big Data processing
      5. Hadoop
      6. NoSQL
      7. Textual ETL processing
      8. Further reading
    5. Chapter 5. Big Data Driving Business Value
      1. Introduction
      2. Case study 1: Sensor data
      3. Case study 2: Streaming data
      4. Case study 3: The right prescription: improving patient outcomes with Big Data analytics
      5. Case study 4: University of Ontario Institute of Technology: leveraging key data to provide proactive patient care
      6. Case study 5: Microsoft SQL Server customer solution
      7. Case study 6: Customer-centric data integration
      8. Summary
  10. Part 2: The Data Warehousing
    1. Chapter 6. Data Warehousing Revisited
      1. Introduction
      2. Traditional data warehousing, or data warehousing 1.0
      3. Data warehouse 2.0
      4. Summary
      5. Further reading
    2. Chapter 7. Reengineering the Data Warehouse
      1. Introduction
      2. Enterprise data warehouse platform
      3. Choices for reengineering the data warehouse
      4. Modernizing the data warehouse
      5. Case study of data warehouse modernization
      6. Summary
    3. Chapter 8. Workload Management in the Data Warehouse
      1. Introduction
      2. Current state
      3. Defining workloads
      4. Understanding workloads
      5. Query classification
      6. ETL and CDC workloads
      7. Measurement
      8. Current system design limitations
      9. New workloads and Big Data
      10. Technology choices
      11. Summary
    4. Chapter 9. New Technologies Applied to Data Warehousing
      1. Introduction
      2. Data warehouse challenges revisited
      3. Data warehouse appliance
      4. Cloud computing
      5. Data virtualization
      6. Summary
      7. Further reading
  11. Part 3: Building the Big Data – Data Warehouse
    1. Chapter 10. Integration of Big Data and Data Warehousing
      1. Introduction
      2. Components of the new data warehouse
      3. Integration strategies
      4. Hadoop & RDBMS
      5. Big Data appliances
      6. Data virtualization
      7. Semantic framework
      8. Summary
    2. Chapter 11. Data-Driven Architecture for Big Data
      1. Introduction
      2. Metadata
      3. Master data management
      4. Processing data in the data warehouse
      5. Processing complexity of Big Data
      6. Machine learning
      7. Summary
    3. Chapter 12. Information Management and Life Cycle for Big Data
      1. Introduction
      2. Information life-cycle management
      3. Information life-cycle management for Big Data
      4. Summary
    4. Chapter 13. Big Data Analytics, Visualization, and Data Scientists
      1. Introduction
      2. Big Data analytics
      3. Data discovery
      4. Visualization
      5. The evolving role of data scientists
      6. Summary
    5. Chapter 14. Implementing the Big Data – Data Warehouse – Real-Life Situations
      1. Introduction: Building the Big Data – Data Warehouse
      2. Customer-centric business transformation
      3. Hadoop and MySQL drive innovation
      4. Integrating Big Data into the data warehouse
      5. Summary
  12. Appendix A. Customer Case Studies
    1. Introduction
    2. Case study 1: Transforming marketing landscape
    3. Case study 2: Streamlining healthcare connectivity with Big Data
    4. Case study 3: Improving healthcare quality and costs using Big Data
    5. Case study 4: Improving customer support
    6. Case study 5: Driving customer-centric transformations
    7. Case study 6: Quantifying risk and compliance
    8. Case study 7: Delivering a 360° view of customers
  13. Appendix B. Building the Healthcare Information Factory: Implementing Textual Analytics
    1. Introduction
    2. Executive summary
    3. The healthcare information factory
    4. A visionary architecture
    5. Separate systems
    6. A common patient identifier
    7. Integrating data
    8. The larger issue of integration across many data types
    9. ETL and the collective common data warehouse
    10. Common elements of a data warehouse
    11. Analytical processing
    12. DSS/business intelligence processing
    13. Different types of data that go into the data warehouse
    14. Textual data
    15. The system of record
    16. Metadata
    17. Local individual data warehouses
    18. Data models and the healthcare information factory
    19. Creating the medical data warehouse data model
    20. The collective common data model
    21. Developing the healthcare information factory
    22. Healthcare information factory users
    23. Other healthcare entities
    24. Financing the infrastructure
    25. The age of data in the healthcare information factory
    26. Implementing the healthcare information factory
    27. Summary
    28. Further reading
  14. Summary
  15. Index