Big Data Application Architecture Q&A: A Problem - Solution Approach

Book Description

Big Data Application Architecture Pattern Recipes provides insight into the heterogeneous infrastructures, databases, and visualization and analytics tools used to realize the architectures of big data solutions. Its problem-solution approach helps you select the right architecture for the problem at hand. As you read through these problems, you will learn to harness the new big data opportunities that enterprises use to attain real-time profits.

Big Data Application Architecture Pattern Recipes answers one of the most critical questions of our time: 'How do you select the best end-to-end architecture to solve your big data problem?'

The book deals with the mission-critical problems that solution architects, consultants, and software architects encounter when choosing among the myriad options for implementing a typical solution: extracting insight from huge volumes of data in real time, across multiple relational and non-relational data types, for clients in industries such as retail, telecommunications, banking, and insurance. The patterns in this book provide the strong architectural foundation required to launch your next big data application.

The architectures for realizing these opportunities are based on heterogeneous infrastructures that are relatively inexpensive compared with the traditional monolithic and hugely expensive options that exist today. This book describes and evaluates the benefits of heterogeneity, which brings with it multiple options for solving the same problem, the evaluation of trade-offs, and the validation of a solution's 'fitness for purpose'.

What you'll learn

  • Major considerations in building a big data solution

  • Big data application architecture problems for specific industries

  • What are the components one needs to build an end-to-end big data solution?

  • Does one really need a real-time big data solution or an off-line analytics batch solution?

  • What are the operations and support architectures for a big data solution?

  • What are the scalability considerations and options for a Hadoop installation?

Who this book is for

  • CIOs, CTOs, enterprise architects, and software architects

  • Consultants, solution architects, and information management (IM) analysts who want to architect a big data solution for their enterprise

Table of Contents

  1. Title
  2. Contents at a Glance
  3. Contents
  4. About the Authors
  5. About the Technical Reviewer
  6. Acknowledgments
  7. Introduction
  8. CHAPTER 1: Big Data Introduction
    1. Why Big Data
    2. Aspects of Big Data
    3. How Big Data Differs from Traditional BI
    4. How Big Is the Opportunity?
    5. Deriving Insight from Data
    6. Cloud Enabled Big Data
    7. Structured vs. Unstructured Data
    8. Analytics in the Big Data World
    9. Big Data Challenges
    10. Defining a Reference Architecture
    11. Need for Architecture Patterns
    12. Summary
  9. CHAPTER 2: Big Data Application Architecture
    1. Architecting the Right Big Data Solution
    2. Data Sources
    3. Ingestion Layer
    4. Distributed (Hadoop) Storage Layer
    5. Hadoop Infrastructure Layer
    6. Hadoop Platform Management Layer
    7. Security Layer
    8. Monitoring Layer
    9. Analytics Engine
    10. Search Engines
    11. Real-Time Engines
    12. Visualization Layer
    13. Big Data Applications
    14. Summary
  10. CHAPTER 3: Big Data Ingestion and Streaming Patterns
    1. Understanding Data Ingestion
    2. Multisource Extractor Pattern
    3. Protocol Converter Pattern
    4. Multidestination Pattern
    5. Just-in-Time Transformation Pattern
    6. Real-Time Streaming Pattern
    7. ETL Tools for Big Data
    8. Summary
  11. CHAPTER 4: Big Data Storage Patterns
    1. Understanding Big Data Storage
    2. Façade Pattern
    3. Data Appliances
    4. Storage Disks
    5. Data Archive/Purge
    6. Data Partitioning/Indexing and the Lean Pattern
    7. HDFS Alternatives
    8. NoSQL Pattern
    9. Polyglot Pattern
    10. Big Data Storage Infrastructure
    11. Typical Data-Node Configuration
    12. Summary
  12. CHAPTER 5: Big Data Access Patterns
    1. Understanding Big Data Access
    2. Stage Transform Pattern
    3. Connector Pattern
    4. Near Real-Time Access Pattern
    5. Lightweight Stateless Pattern
    6. Service Locator Pattern
    7. Rapid Data Analysis
    8. Secure Data Access
    9. Summary
  13. CHAPTER 6: Data Discovery and Analysis Patterns
    1. Data Queuing Pattern
    2. Index based Insight Pattern
    3. Constellation Search Pattern
    4. Machine Learning Recommendation Pattern
    5. Converger Pattern
    6. Challenges in Big Data Analysis
    7. Log File Analysis
    8. Sentiment Analysis
    9. Data Analysis as a Service (DaaS)
    10. Summary
  14. CHAPTER 7: Big Data Visualization Patterns
    1. Introduction to Big Data Visualization
    2. Big Data Analysis Patterns
    3. Mashup View Pattern
    4. Compression Pattern
    5. Zoning Pattern
    6. First Glimpse Pattern
    7. Exploder Pattern
    8. Portal Pattern
    9. Service Facilitator Pattern
    10. Summary
  15. CHAPTER 8: Big Data Deployment Patterns
    1. Big Data Infrastructure: Hybrid Architecture Patterns
    2. Traditional Tree Network Pattern
    3. Resource Negotiator Pattern for Security and Data Integrity
    4. Spine Fabric Pattern
    5. Federation Pattern
    6. Lean DevOps Pattern
    7. Big Data on the Cloud and Hybrid Architecture
    8. Big Data Operations
    9. Summary
  16. CHAPTER 9: Big Data NFRs
    1. “ilities”
    2. Security
    3. Parallel Exhaust Pattern
    4. Variety Abstraction Pattern
    5. Real-Time Streaming Using the Appliance Pattern
    6. Distributed Search Optimization Access Pattern
    7. Anything as an API Pattern
    8. Security Challenges
    9. Operability
    10. Big Data System Security Audit
    11. Big Data Security Products
    12. Summary
  17. CHAPTER 10: Big Data Case Studies
    1. Case Study: Mainframe to Hadoop-Based NoSQL Database
    2. Case Study: Geo-Redundancy and Near-Real-Time Data Ingestion
    3. Case Study: Recommendation Engine
    4. Case Study: Video-Streaming Analytics
    5. Case Study: Sentiment Analysis and Log Processing
    6. Case Study: Real-Time Traffic Monitoring
    7. Case Study: Data Exploration for Suspicious Behavior on a Stock Exchange
    8. Case Study: Environment Change Detection
    9. Summary
  18. CHAPTER 11: Resources, References, and Tools
    1. Big Data Product Catalog
    2. Hadoop Distributions
    3. In-memory Hadoop
    4. Hadoop Alternatives
    5. Hadoop SQL Interfaces
    6. Ingestion tools
    7. Map Reduce alternatives
    8. Cloud Options
    9. Table-Style Database Management Services
    10. NoSQL Databases
    11. In-Memory Big Data Management Systems
    12. DataSets
    13. Data Discovery
    14. Visualization
    15. Analytics Tools
    16. Data Integration Tools
    17. Summary
  19. APPENDIX A: References and Bibliography
  20. Index