Instant Parallel Processing with Gearman

Book Description

Learn how to use Gearman to build scalable distributed applications

  • Learn something new in an Instant! A short, fast, focused guide delivering immediate results

  • Build a cluster of managers, workers, and clients using Gearman to scale your application

  • Understand how to reduce single points of failure in your distributed applications

  • Build clients and workers to process data in the background and provide real-time updates to your frontend

In Detail

Many of today’s applications need to process large volumes of data, and vertical scaling is limited by both prohibitive cost and hardware constraints. Gearman is an open source job manager that is well suited to building horizontally scalable systems, from map-reduce algorithms to simple data processors capable of handling massive amounts of information.

Instant Gearman is a practical, hands-on guide to building an open source job management system that is designed to grow. Learn the basics of building a distributed application that spans multiple components, and see how Gearman fits into an application that scales from one to hundreds of interacting components that process data. With Gearman, you can build software that scales horizontally as your need for computation increases.

Instant Gearman uses in-depth examples and a step-by-step approach to help you build distributed systems that process data in a scalable, modular way.

Once you are comfortable with building simple workers and clients, learn how to build a cluster of managers and see how to reduce single points of failure in your architecture. Next, build a simple map-reduce application using Gearman and scale it up from a single instance to multiple parallel processing components.

Table of Contents

  1. Instant Parallel Processing with Gearman
    1. Instant Parallel Processing with Gearman
    2. Credits
    3. About the Author
    4. About the Reviewer
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
    7. 1. Instant Parallel Processing with Gearman
      1. So, what is Gearman?
        1. Distinguishing features of Gearman
        2. Overview of components
        3. The conversation between the actors
          1. Usecase – image processing
      2. Quick Start – building your first components
        1. Step 1 – running a server
        2. Step 2 – downloading a client library
          1. Your first client
          2. Writing the client
            1. Running your newly created client
            2. Verifying that it was submitted
        3. Step 3 – our first worker
          1. Running your newly created worker
        4. Step 4 – varying priorities
        5. Step 5 – background tasks
        6. And that's it
      3. Top 5 features you need to know about
        1. Job handling
          1. Completed versus successful
          2. Unique identifiers and coalescing
          3. The client
          4. The worker
          5. Making it work
        2. Scaling your system
          1. Running multiple managers
          2. Writing a client that supports multiple managers
          3. Submitting jobs to multiple managers
          4. Handling a manager being terminated
          5. Processing jobs from multiple managers
        3. MapReduce
          1. The shared library
          2. A simple program
          3. The client
          4. The worker
          5. Parallelizing the pipeline
          6. Scaling this solution
          7. Processing a large data file
        4. Providing job updates
          1. Data updates
            1. The worker
            2. The client
          2. Status updates
            1. The worker
            2. The client
          3. Building a processing pipeline
          4. Using multiple languages
        5. Persistence engines
          1. Why is this important?
          2. Persistent versus non-persistent
          3. How safe is safe?
      4. People and places you should get to know
        1. Client libraries
        2. Community