Big Data Application Architecture Q&A: A Problem-Solution Approach

Book Description

Big Data Application Architecture Pattern Recipes provides insight into the heterogeneous infrastructures, databases, and visualization and analytics tools used for realizing the architectures of big data solutions. Its problem-solution approach helps in selecting the right architecture to solve the problem at hand. In the process of reading through these problems, you will learn to harness the power of new big data opportunities, which various enterprises use to attain real-time profits.

Big Data Application Architecture Pattern Recipes answers one of the most critical questions of our time: 'How do you select the best end-to-end architecture to solve your big data problem?'

The book deals with various mission-critical problems encountered by solution architects, consultants, and software architects while dealing with the myriad options available for implementing a typical solution and trying to extract insight from huge volumes of data in real time, across multiple relational and non-relational data types, for clients in industries such as retail, telecommunications, banking, and insurance. The patterns in this book provide the strong architectural foundation required to launch your next big data application.

The architectures for realizing these opportunities are based on heterogeneous infrastructures that are relatively inexpensive compared to the traditional monolithic and hugely expensive options that exist currently. This book describes and evaluates the benefits of heterogeneity, which brings with it multiple options for solving the same problem, the evaluation of trade-offs, and the validation of the solution's 'fitness for purpose'.

What you'll learn

  • Major considerations in building a big data solution

  • Big data application architecture problems for specific industries

  • What are the components one needs to build an end-to-end big data solution?

  • Does one really need a real-time big data solution or an off-line analytics batch solution?

  • What are the operations and support architectures for a big data solution?

  • What are the scalability considerations and options for a Hadoop installation?

Who this book is for

  • CIOs, CTOs, enterprise architects, and software architects

  • Consultants, solution architects, and information management (IM) analysts who want to architect a big data solution for their enterprise

Table of Contents

    1. Title
    2. Contents at a Glance
    3. Contents
    4. About the Authors
    5. About the Technical Reviewer
    6. Acknowledgments
    7. Introduction
    8. CHAPTER 1: Big Data Introduction
      1. Why Big Data
      2. Aspects of Big Data
      3. How Big Data Differs from Traditional BI
      4. How Big Is the Opportunity?
      5. Deriving Insight from Data
      6. Cloud Enabled Big Data
      7. Structured vs. Unstructured Data
      8. Analytics in the Big Data World
      9. Big Data Challenges
      10. Defining a Reference Architecture
      11. Need for Architecture Patterns
      12. Summary
    9. CHAPTER 2: Big Data Application Architecture
      1. Architecting the Right Big Data Solution
      2. Data Sources
      3. Ingestion Layer
      4. Distributed (Hadoop) Storage Layer
      5. Hadoop Infrastructure Layer
      6. Hadoop Platform Management Layer
      7. Security Layer
      8. Monitoring Layer
      9. Analytics Engine
      10. Search Engines
      11. Real-Time Engines
      12. Visualization Layer
      13. Big Data Applications
      14. Summary
    10. CHAPTER 3: Big Data Ingestion and Streaming Patterns
      1. Understanding Data Ingestion
      2. Multisource Extractor Pattern
      3. Protocol Converter Pattern
      4. Multidestination Pattern
      5. Just-in-Time Transformation Pattern
      6. Real-Time Streaming Pattern
      7. ETL Tools for Big Data
      8. Summary
    11. CHAPTER 4: Big Data Storage Patterns
      1. Understanding Big Data Storage
      2. Façade Pattern
      3. Data Appliances
      4. Storage Disks
      5. Data Archive/Purge
      6. Data Partitioning/Indexing and the Lean Pattern
      7. HDFS Alternatives
      8. NoSQL Pattern
      9. Polyglot Pattern
      10. Big Data Storage Infrastructure
      11. Typical Data-Node Configuration
      12. Summary
    12. CHAPTER 5: Big Data Access Patterns
      1. Understanding Big Data Access
      2. Stage Transform Pattern
      3. Connector Pattern
      4. Near Real-Time Access Pattern
      5. Lightweight Stateless Pattern
      6. Service Locator Pattern
      7. Rapid Data Analysis
      8. Secure Data Access
      9. Summary
    13. CHAPTER 6: Data Discovery and Analysis Patterns
      1. Data Queuing Pattern
      2. Index based Insight Pattern
      3. Constellation Search Pattern
      4. Machine Learning Recommendation Pattern
      5. Converger Pattern
      6. Challenges in Big Data Analysis
      7. Log File Analysis
      8. Sentiment Analysis
      9. Data Analysis as a Service (DaaS)
      10. Summary
    14. CHAPTER 7: Big Data Visualization Patterns
      1. Introduction to Big Visualization
      2. Big Data Analysis Patterns
      3. Mashup View Pattern
      4. Compression Pattern
      5. Zoning Pattern
      6. First Glimpse Pattern
      7. Exploder Pattern
      8. Portal Pattern
      9. Service Facilitator Pattern
      10. Summary
    15. CHAPTER 8: Big Data Deployment Patterns
      1. Big Data Infrastructure: Hybrid Architecture Patterns
      2. Traditional Tree Network Pattern
      3. Resource Negotiator Pattern for Security and Data Integrity
      4. Spine Fabric Pattern
      5. Federation Pattern
      6. Lean DevOps Pattern
      7. Big Data on the Cloud and Hybrid Architecture
      8. Big Data Operations
      9. Summary
    16. CHAPTER 9: Big Data NFRs
      1. “ilities”
      2. Security
      3. Parallel Exhaust Pattern
      4. Variety Abstraction Pattern
      5. Real-Time Streaming Using the Appliance Pattern
      6. Distributed Search Optimization Access Pattern
      7. Anything as an API Pattern
      8. Security Challenges
      9. Operability
      10. Big Data System Security Audit
      11. Big Data Security Products
      12. Summary
    17. CHAPTER 10: Big Data Case Studies
      1. Case Study: Mainframe to Hadoop-Based NoSQL Database
      2. Case Study: Geo-Redundancy and Near-Real-Time Data Ingestion
      3. Case Study: Recommendation Engine
      4. Case Study: Video-Streaming Analytics
      5. Case Study: Sentiment Analysis and Log Processing
      6. Case Study: Real-Time Traffic Monitoring
      7. Case Study: Data Exploration for Suspicious Behavior on a Stock Exchange
      8. Case Study: Environment Change Detection
      9. Summary
    18. CHAPTER 11: Resources, References, and Tools
      1. Big Data Product Catalog
      2. Hadoop Distributions
      3. In-memory Hadoop
      4. Hadoop Alternatives
      5. Hadoop SQL Interfaces
      6. Ingestion tools
      7. Map Reduce alternatives
      8. Cloud Options
      9. Table-Style Database Management Services
      10. NoSQL Databases
      11. In-Memory Big Data Management Systems
      12. DataSets
      13. Data Discovery
      14. Visualization
      15. Analytics Tools
      16. Data Integration Tools
      17. Summary
    19. APPENDIX A: References and Bibliography
    20. Index