Mastering Elastic Stack

Book Description

Get the most out of the Elastic Stack for various complex analytics using this comprehensive and practical guide

About This Book

  • Your one-stop solution to perform advanced analytics with Elasticsearch, Logstash, and Kibana

  • Learn how to make better sense of your data by searching, analyzing, and logging data in a systematic way

  • This highly practical guide takes you through an advanced implementation on the ELK stack in your enterprise environment

Who This Book Is For

This book caters to developers who use the Elastic Stack in their day-to-day work, are familiar with the basics of Elasticsearch, Logstash, and Kibana, and now want to become experts at using the Elastic Stack for data analytics.

What You Will Learn

  • Build a pipeline with the help of Logstash and Beats to visualize Elasticsearch data in Kibana

  • Use Beats to ship any type of data to the Elastic stack

  • Understand Elasticsearch APIs, modules, and other advanced concepts

  • Explore Logstash and its plugins

  • Discover how to utilize the new Kibana UI for advanced analytics

  • See how to work with the Elastic Stack using other advanced configurations

  • Customize the Elastic Stack and develop plugins for each of its components

  • Work with the Elastic Stack in a production environment

  • Explore the various components of X-Pack in detail

In Detail

Even structured data is useless if it can't help you make strategic decisions and improve existing systems. If you love to play with data, or your job requires you to process custom log formats, design a scalable analysis system, explore a variety of data, and manage logs to do real-time data analysis, this book is your one-stop solution. By combining the massively popular Elasticsearch, Logstash, Beats, and Kibana, elastic.co has advanced an end-to-end stack that delivers actionable insights in real time from almost any type of structured or unstructured data source. You will learn how to create real-time dashboards and how to manage the life cycle of logs in detail through real-life scenarios.

This book brushes up your basic knowledge of implementing the Elastic Stack and then dives deeper into complex and advanced implementations. We'll help you solve data analytics challenges using the Elastic Stack and provide practical steps for centralized logging and real-time analytics in production. You will get to grips with advanced techniques for log analysis and visualization. Newly announced features such as Beats and X-Pack are also covered in detail with examples.

Toward the end, you will see how to use the Elastic Stack for real-world case studies, and we'll show you some best practices and troubleshooting techniques for the Elastic Stack.

Style and approach

This practical guide shows you how to perform advanced analytics with the Elastic Stack through real-world use cases. It covers both common and not-so-common scenarios for using the Elastic Stack for data analysis.

Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

    Table of Contents

    1. Mastering Elastic Stack
      1. Mastering Elastic Stack
      2. Credits
      3. About the Authors
      4. About the Reviewer
      5. www.PacktPub.com
        1. Why subscribe?
      6. Customer Feedback
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      8. 1. Elastic Stack Overview
        1. Introduction to ELK Stack
          1. Logstash
          2. Elasticsearch
          3. Kibana
        2. The birth of Elastic Stack
          1. Beat
        3. Who uses Elastic Stack?
          1. Salesforce
          2. CERN
          3. Green Man Gaming
        4. Stack competitors
        5. Setting up Elastic Stack
          1. Installation of Java
            1. Installation of Java on Ubuntu 14.04
            2. Installation of Java on Windows
          2. Installation of Elasticsearch
            1. Installation of Elasticsearch on Ubuntu 14.04
            2. Installation of Elasticsearch on Windows
              1. Installation of Elasticsearch as a service
          3. Installation of Kibana
            1. Installation of Kibana on Ubuntu 14.04
            2. Installation of Kibana on Windows
          4. Installation of Logstash
            1. Installation of Logstash on Ubuntu 14.04
            2. Installation of Logstash on Windows
          5. Installation of Filebeat
            1. Installation of Filebeat on Ubuntu 14.04
            2. Installation of Filebeat on Windows
        6. X-Pack
        7. Summary
      9. 2. Stepping into Elasticsearch
        1. The beginning of Elasticsearch
          1. Key features
        2. Understanding the architecture
          1. Recommended cluster configurations
            1. Minimum master nodes
            2. Local cluster settings
          2. Understanding document processing
        3. Elasticsearch APIs
          1. Document APIs
            1. Single document APIs
              1. Index API
              2. Get API
              3. Delete API
              4. Update API
            2. Multi-document APIs
              1. Multi-get API
              2. Bulk API
          2. Search APIs
            1. Search API
              1. Query parameters
            2. Search shard API
            3. Multi-search APIs
            4. Count API
            5. Validate API
            6. Explain API
            7. Profile API
            8. Field stat API
          3. Indices APIs
            1. Managing indices
              1. Creating an index
              2. Checking if an index exists
              3. Getting index information
              4. Managing index settings
              5. Getting index stats
              6. Getting index segments
              7. Getting index recovery information
              8. Getting shard stores information
              9. Index aliases
              10. Mappings
              11. Closing, opening, and deleting an index
            2. Other operations
          4. Cat APIs
          5. Cluster APIs
        4. Query DSL
        5. Aggregations
          1. Bucket
          2. Metrics aggregations
            1. Avg aggregation
            2. Min aggregation
            3. Max aggregation
            4. Percentiles Aggregation
            5. Sum aggregation
            6. Value count aggregation
            7. Cardinality aggregation
            8. Stats aggregation
            9. Extended stats aggregation
        6. A note for painless scripting
        7. Summary
      10. 3. Exploring Logstash and Its Plugins
        1. Introduction to Logstash
        2. Why do we need Logstash?
        3. Features of Logstash
        4. Logstash Plugin Architecture
        5. Logstash Configuration File Structure
          1. Value types
            1. Array
            2. Boolean
            3. Bytes
            4. Codec
            5. Comments
            6. Hash
            7. Number
            8. String
          2. Use of Conditionals
        6. Types of Plugins
          1. Input plugins
          2. Filter plugins
          3. Output plugins
          4. Codec plugins
        7. Exploring Input Plugins
          1. stdin
          2. file
          3. path
          4. udp
        8. Exploring Filter Plugins
          1. grok
          2. mutate
          3. csv
        9. Exploring Output Plugins
          1. stdout
          2. file
          3. elasticsearch
        10. Exploring Codec Plugins
          1. rubydebug
          2. json
          3. avro
          4. multiline
        11. Plugins Command-Line Options
          1. Listing of Plugins
          2. Installing a plugin
          3. Removing a plugin
          4. Updating a plugin
          5. Packing a plugin
          6. Unpacking a plugin
        12. Logstash command-line options
        13. Logstash Tips and Tricks
          1. Referencing fields and their values
          2. Adding custom-created grok patterns
          3. Logstash does not show any output
            1. When an input file has already been completely read
            2. When an input file is not modified since 1 day
        14. Logstash Configuration for Parsing Logs
          1. Sample Catalina logs
          2. Sample Tomcat logs
          3. Grok pattern for Catalina logs
          4. Grok pattern for Tomcat logs
          5. Logstash configuration file
        15. Monitoring APIs
          1. Node info API
            1. OS Info
            2. JVM info
            3. Pipeline Info
          2. Plugins Info API
          3. Node stats API
            1. JVM stats
            2. Process stats
            3. Pipeline stats
          4. Hot threads API
            1. Threads
            2. Human
            3. Ignore idle threads
        16. Summary
      11. 4. Kibana Interface
        1. Kibana and its offerings
          1. Kibana interface
        2. Exploring the discover interface
        3. Time Filter
          1. Quick time filter
          2. Relative time filter
          3. Absolute time filter
          4. Auto-refresh
        4. Querying and Searching data
          1. Full-text searches
          2. Range searches
          3. Boolean searches
          4. Proximity search
          5. Wildcard searches
          6. Regular expressions search
          7. Grouping
        5. Fields and filters
          1. Filtering the field
          2. Functionalities of filters
        6. Discovery page options
        7. Exploring the visualize interface
          1. Understanding aggregations
            1. Bucket aggregations
            2. Metric aggregations
          2. Visualization Canvas
          3. Area chart
          4. Data table
          5. Line chart
          6. Bubble chart
          7. Markdown widget
          8. Metric
          9. Pie chart
          10. Tag clouds
          11. Tile map
          12. Time series
          13. Vertical bar chart
        8. Exploring the Dashboard interface
        9. Understanding Timelion
        10. Exploring Dev Tools
        11. Exploring the Management interface
          1. Index patterns
          2. Saved objects
          3. Advanced Settings
          4. Status
        12. Putting it all together
          1. Input data
          2. Creating a Logstash configuration file
          3. Using Kibana
            1. Top states based on 2003 RUCC
            2. Top states based on 2003 UIC
            3. Top five area names with less than high school diploma 1970
            4. Top five area names with high school diploma 1970
            5. Percentage of adults having less than high school diploma in 1970 by area and state
            6. Top states as per their count and their top 2013 RUCC
            7. Insights
          4. Creating a dashboard in Kibana
        13. Summary
      12. 5. Using Beats
        1. Introduction to Beats
        2. How Beats differ from Logstash
        3. How Beats fits into Elastic Stack
        4. An overview of the different types of Beats
          1. Beats by Elastic Team
            1. Packetbeat
            2. Metricbeat
            3. Filebeat
            4. Winlogbeat
            5. Libbeat
          2. Beats by community
            1. Dockbeat
            2. Lmsensorbeat
        5. Exploring Elastic Team Beats
          1. Understanding Filebeat
            1. Filebeat Prospectors Configuration
            2. Processors configuration
              1. Defining a processor
            3. Output Configuration
              1. Elasticsearch Output Configuration
              2. Logstash Output Configuration
              3. Logging Configuration
          2. Understanding Metricbeat
            1. System Module
              1. CPU metricset
              2. Disk I/O metricset
              3. Filesystem metricset
              4. FsStat metricset
              5. Load metricset
              6. Memory metricset
              7. Network metricset
              8. Process Metricset
            2. Installation of Metricbeat
              1. Installation of Metricbeat on Ubuntu 14.04
          3. Understanding Packetbeat
            1. Installation of Packetbeat
              1. Installation of Packetbeat on Ubuntu 14.04
        6. Exploring Community Beats
          1. Understanding Elasticbeat
            1. Installation of Elasticbeat
              1. Installation of Elasticbeat on Ubuntu 14.04
            2. Elasticbeat configuration
        7. Beats in action with Elastic Stack
          1. Exploring Metricbeat with Logstash and Kibana
            1. Step 1-Configuring Metricbeat to send data to Logstash
            2. Step 2-Creating a Logstash configuration file
            3. Step 3-Downloading and loading the sample Beats dashboard
            4. Step 4-Viewing the sample Beats dashboard
          2. Exploring Elasticbeat with Elasticsearch and Kibana
            1. Step 1-Configuring Elasticbeat to send data to Elasticsearch
            2. Step 2-Downloading and loading the Elasticbeat dashboard
            3. Step 3-Viewing the sample Beats dashboard
        8. Summary
      13. 6. Elastic Stack in Action
        1. Understanding problem scenario
          1. Understanding the architecture
        2. Preparing Elastic Stack pipeline
          1. What to capture?
          2. Updated architecture
        3. Configuring Elastic Stack components
          1. Setting up Elasticsearch
          2. Setting up agents/Beats
            1. Packetbeat
            2. Metricbeat
            3. Filebeat
          3. Setting up Logstash
            1. grok for nginxlogs
            2. grok for liferaylogs
            3. grok for openDJ logs
            4. Config File
          4. Setting up Kibana
        4. Setting up Kibana Dashboards
          1. PacketBeat
          2. MetricBeat
          3. Checking DB (MySQL) Performance
          4. Analyzing CPU usage
          5. Keeping an eye on memory
          6. Checking logs
          7. Finding most visited pages
          8. Visitors' map
          9. Number of visitors in a time frame
          10. Request Types
          11. Error type-log levels
          12. Top referrers
          13. Top agents
        5. Alerting using Logstash e-mail capability
        6. Using a message broker
        7. Summary
      14. 7. Customizing Elastic Stack
        1. Extending Elasticsearch
          1. Elasticsearch development environment
          2. Anatomy of an Elasticsearch Java plugin
          3. Building the plugin
        2. Extending Logstash
          1. Generating a plugin
            1. Anatomy of the plugin
            2. weather.rb file
            3. Plugin logic implementation
              1. Reading data from API end point
              2. Preparing an event
              3. Publish the event
            4. Building and installing a plugin
            5. Testing our plugin
        3. Extending Beats
          1. libbeat framework
          2. Creating a beat
            1. Anatomy of a Beat
            2. Beat configuration
            3. weatherbeat.go file
            4. Implementing beat logic
              1. Adding the Configuration
              2. Reading data from API
              3. Parsing the data
              4. Preparing an event
              5. Publishing the event
            5. Running the beat
        4. Extending Kibana
          1. Setting up Kibana development environment
          2. Generating the plugin
          3. Anatomy of a plugin
        5. Summary
      15. 8. Elasticsearch APIs
        1. The cluster APIs
          1. Cluster health
          2. Cluster State
          3. Cluster stats
          4. Pending tasks
          5. Cluster reroute
          6. Cluster update settings
          7. Node stats
          8. Nodes info API
          9. Task Management API
        2. The cat APIs
        3. Elasticsearch modules
          1. Cluster module
          2. Discovery module
          3. Gateway module
          4. HTTP module
          5. Indices module
          6. Network module
          7. Node client
          8. Plugins module
          9. Scripting
          10. Snapshot/restore module
          11. Thread pools
          12. Transport module
          13. Tribe nodes module
        4. Ingest nodes
        5. Elasticsearch clients
          1. Supported clients
          2. Community contributed clients
        6. Java API
          1. Connecting to a Cluster
          2. Admin tasks
            1. Managing indices
              1. Creating an index
              2. Getting index settings
              3. Updating index settings
              4. Refreshing an index
            2. Managing clusters
              1. Getting cluster tasks
              2. Getting cluster health
          3. Index-level tasks
            1. Managing documents
              1. Indexing a document
              2. Getting a document
              3. Deleting a document
              4. Updating a document
            2. Query DSL and search API
            3. Aggregations
        7. Elasticsearch plugins
          1. Discovery plugins
          2. Ingest plugins
          3. Elasticsearch SQL
        8. Summary
      16. 9. X-Pack: Security and Monitoring
        1. Introduction to X-Pack
        2. Installation of X-Pack
          1. Installing X-Pack in Elasticsearch
          2. Installing X-Pack in Kibana
          3. Installing X-Pack on offline systems
          4. Uninstalling X-Pack
        3. Security
          1. Listing of all users in security
          2. Listing of roles in security
          3. Understanding roles in security
            1. Understanding Cluster Privileges
            2. Understanding Run As privileges
            3. Understanding Indices privileges
          4. Decoding default user roles
            1. kibana_user
            2. superuser
            3. transport_client
          5. Adding a role in security
          6. Updating a role in security
          7. Understanding Field Level Security
          8. Adding a user in security
          9. Updating user details in security
          10. Changing the password of a user in security
          11. Deleting a role in security
          12. Deleting a user in security
        4. Viewing X-Pack information
          1. Enabling and disabling of X-Pack features
        5. Monitoring
          1. Exploring monitoring statistics for Elasticsearch
            1. Discovering the Overview tab
            2. Discovering the Indices tab
            3. Discovering the Nodes tab
          2. Exploring monitoring statistics for Kibana
        6. Understanding Profiler
        7. Summary
      17. 10. X-Pack: Alerting, Graph, and Reporting
        1. Alerting and notification
          1. Working of watcher
            1. Trigger
              1. Schedule trigger
            2. Input
              1. Simple input
              2. Search input
              3. HTTP input
              4. Chain input
            3. Conditions
              1. Always condition
              2. Never condition
              3. Compare condition
              4. Array compare condition
              5. Script condition
            4. Transforms
              1. Search transform
              2. Script transform
              3. Chain transform
            5. Actions
              1. Throttling
        2. Graph
          1. Working of Graph
            1. Graph UI
        3. Reporting
        4. Summary
      18. 11. Best Practices
        1. Why do we require best practices?
        2. Understanding your use case
        3. Managing configuration files
          1. Elasticsearch - elasticsearch.yml
          2. Kibana - kibana.yml
        4. Choosing the right set of hardware
          1. Memory
            1. Java heap size
            2. Swapping memory
          2. Disks
            1. Sizing disk space
          3. I/O
          4. CPU
          5. Network
        5. Searching and indexing performance
          1. Filter cache
          2. Fielddata size
          3. Indexing buffer
        6. Sizing the Elasticsearch cluster
          1. Choosing the right kind of node
            1. Master and data node
              1. Master node
              2. Data node
              3. Ingest node
              4. No master, no data, and no ingest node
          2. Determining the number of nodes
          3. Determining the number of shards
          4. Reducing disk space
        7. Logstash configuration file
          1. Categorizing multiple sources of data
          2. Using conditionals
          3. Using custom grok patterns
          4. Simplifying _grokparsefailure
          5. Mapping of fields
          6. Dynamic templating
          7. Testing configuration
        8. Re-indexing data
          1. Using aliases
        9. Summary
      19. 12. Case Study-Meetup
        1. Understanding meetup scenario
        2. Setting things up
          1. A bit of Meetup API understanding
          2. Setting up Elasticsearch
          3. Preparing Logstash
          4. Setting up Kibana
        3. Analyzing data using Kibana
          1. Filtering Content
          2. Number of Meetups by Country
          3. Top 10 meetup cities in world
          4. Meetups trends by duration
          5. Meetups by RSVP Counts
          6. Number of Groups by country
          7. Number of Groups by join mode
          8. Popular Categories
          9. Popular Topics
          10. Meetup Venue Map
          11. Meetups on Map
          12. Just the number of things
        4. Getting Notified
        5. Summary