Resilience and Reliability on AWS

Cover of Resilience and Reliability on AWS by Jasper Geurtsen... Published by O'Reilly Media, Inc.

Chapter 5. elasticsearch

Most of the technology stacks we use are not yet designed for the cloud. On Ubuntu (Linux), every so often you have to jump through hoops to get your root volume checked. Replication in most databases, synchronous or asynchronous, is notoriously hard to get right. And once a system is running, scaling up its resources takes an extreme amount of courage.

Most of these problems make your systems less resilient and reliable, because you just can’t be flexible with resources.

elasticsearch is the first infrastructural component that gets it right. Its developers really understand what it takes to operate datastores (elasticsearch is much more than search). That is why it is the first example we’ll talk about.

Introduction

“It is an Open Source (Apache 2), Distributed, RESTful, Search Engine built on top of Apache Lucene.”

The operational unit of elasticsearch is a cluster, not a server. Note that this is already different from many other datastore technologies. You can have a cluster of one, for development or test, or for transient data. But in production you will want to have at least two nodes most of the time.

elasticsearch holds JSON data in indexes. These indexes are broken up into shards. If you have a cluster with multiple nodes, shards and their replicas are distributed across the nodes in such a way that the cluster can survive losing a node. You can manipulate almost everything in elasticsearch, so changing the sharding is not too difficult.
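To make the idea of surviving a lost node concrete, here is a minimal toy model in Python. It is not elasticsearch's actual allocation algorithm: the round-robin rule, the function names, and the node names are all assumptions made for illustration. It only shows why keeping each copy of a shard on a different node means any single node can disappear.

```python
def place_shards(nodes, num_shards, num_replicas):
    """Toy round-robin placement (an assumption, not elasticsearch's allocator).

    Copy c of shard s goes to node (s + c) % len(nodes), so the copies of
    one shard always land on different nodes.
    """
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least one node per shard copy")
    placement = {node: [] for node in nodes}
    for shard in range(num_shards):
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return placement


def survives_loss_of(placement, lost_node, num_shards):
    """Return True if every shard still has a copy after lost_node is gone."""
    remaining = {
        shard
        for node, copies in placement.items()
        if node != lost_node
        for shard, _role in copies
    }
    return remaining == set(range(num_shards))


# Hypothetical three-node cluster, five shards, one replica per shard.
nodes = ["node-a", "node-b", "node-c"]
with_replicas = place_shards(nodes, num_shards=5, num_replicas=1)
without_replicas = place_shards(nodes, num_shards=5, num_replicas=0)
```

In this model, the placement with one replica per shard keeps a copy of every shard whichever node is lost, while the placement without replicas does not.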

To add a document to an index (if the index doesn’t exist, it is created):

$ curl -XPOST 'http://elasticsearch.heystaq.com:9200/heystaq/snapshot/?pretty=true' ...
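The same kind of request can be sketched in Python with only the standard library. The URL, index (heystaq), and type (snapshot) come from the curl command above; the document fields are hypothetical, since the original request body is not shown here. The helper name build_index_request is also an assumption. The sketch only builds the request, and the commented-out line shows how it could be sent.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, document):
    """Build (but do not send) a POST request that adds a JSON document."""
    url = f"{base_url}/{index}/{doc_type}/?pretty=true"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Recent elasticsearch versions expect an explicit JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document; the real fields are not given in the text.
document = {"name": "example-snapshot", "size_gb": 8}

request = build_index_request(
    "http://elasticsearch.heystaq.com:9200", "heystaq", "snapshot", document
)

# To actually send it (requires a reachable cluster):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything touches a live cluster.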
