Mastering Azure Analytics

Book Description

Microsoft Azure has over 20 platform-as-a-service (PaaS) offerings that can support a big data analytics solution. So which one is right for your project? This practical book helps you understand the breadth of Azure services by organizing them into a reference framework you can use when crafting your own big data analytics solution. You’ll not only be able to determine which service best fits the job, but also learn how to implement a complete solution that scales, provides human fault tolerance, and supports future needs.

Table of Contents

  1. Preface
    1. Conventions Used in This Book
    2. Using Code Examples
    3. O’Reilly Safari
    4. How to Contact Us
    5. Acknowledgments
  2. 1. Enterprise Analytics Fundamentals
    1. The Analytics Data Pipeline
    2. Data Lakes
    3. Lambda Architecture
    4. Kappa Architecture
    5. Choosing between Lambda and Kappa
    6. The Azure Analytics Pipeline
    7. Introducing the Analytics Scenarios
    8. Sample code and sample datasets
    9. What you will need
      1. Broadband Internet Connectivity
      2. Azure Subscription
      3. Visual Studio 2015 with Update 1
      4. Azure SDK 2.8 or later
    10. Chapter Summary
  3. 2. Getting Data into Azure
    1. Ingest Loading Layer
    2. Bulk Data Loading
      1. Disk Shipping
      2. End User Tools
      3. Network Oriented Approaches
    3. Stream Loading
      1. Stream Loading with Event Hubs
    4. Chapter Summary
  4. 3. Storing Ingested Data in Azure
    1. File Oriented Storage
      1. Blob Storage
      2. Azure Data Lake Store
      3. HDFS
    2. Queue Oriented Storage
      1. Blue Yonder Scenario: Smart Buildings
      2. Event Hubs
      3. IoT Hub
    3. Chapter Summary
  5. 4. Real-time Processing in Azure
    1. Stream Processing
      1. Consuming Messages from Event Hubs
    2. Tuple-at-a-Time Processing in Azure
      1. Introducing HDInsight
      2. Storm on HDInsight
      3. EventProcessorHost
      4. Azure Machine Learning
    3. Summary
  6. 5. Real-Time Micro-Batch Processing in Azure
    1. Micro-batch Processing in Azure
      1. Spark Streaming on HDInsight
      2. Storm on HDInsight
      3. Azure Stream Analytics
    2. Summary
  7. 6. Batch Processing in Azure
    1. Batch Processing with MapReduce on HDInsight
    2. Batch Processing with Hive on HDInsight
    3. Batch Processing with Pig on HDInsight
    4. Batch Processing with Spark on HDInsight
    5. Batch Processing with SQL Data Warehouse
    6. Batch Processing with Data Lake Analytics
    7. Batch Processing with Azure Batch
    8. Orchestrating Batch Processing Pipelines with Azure Data Factory
    9. Summary
  8. 7. Interactive Querying in Azure
    1. Interactive Querying with Azure SQL Data Warehouse
    2. Interactive Querying with Hive and Tez
    3. Interactive Querying with Spark SQL
    4. Interactive Querying with U-SQL
    5. Summary
  9. 8. Hot & Cold Path Serving Layer in Azure
    1. Azure Redis Cache
    2. DocumentDB
    3. SQL Database
    4. SQL Data Warehouse
    5. HBase on HDInsight
    6. Azure Search
    7. Summary
  10. 9. Intelligence & Machine Learning
    1. Azure Machine Learning
    2. R Server on HDInsight
    3. SQL R Services
    4. Microsoft Cognitive Services
    5. Summary
  11. 10. Managing Metadata in Azure
    1. Managing Metadata with Azure Data Catalog
    2. Summary
  12. 11. Protecting Your Data in Azure
    1. Identity and Access Management
    2. Data Protection
    3. Auditing
    4. Summary
  13. 12. Performing Analytics
    1. Analytics with Power BI
    2. A Look Ahead
  14. Index