Pro Spring Batch

Book Description

Since its release, Spring Framework has transformed virtually every aspect of Java development including web applications, security, aspect-oriented programming, persistence, and messaging. Spring Batch, one of its newer additions, now brings the same familiar Spring idioms to batch processing. Spring Batch addresses the needs of any batch process, from the complex calculations performed in the biggest financial institutions to simple data migrations that occur with many software development projects.

Pro Spring Batch is intended to answer three questions:

  • What? What is batch processing? What does it entail? What makes it different from the other applications we are developing? What are the challenges inherent in the development of a batch process?

  • Why? Why do batch processing? Why can't we just process things as we get them? Why do we do batch processing differently than the web applications that we currently work on?

  • How? How do you implement a robust, scalable, distributed batch processing system using open-source frameworks?

Pro Spring Batch gives concrete examples of how each piece of functionality is used and why it would be used in a real-world application. It also shares tips that author Michael Minella learned in the "school of hard knocks" while working with Spring Batch. The book includes examples of I/O options that are not mentioned in the official user's guide, as well as performance tips, such as how to limit the impact of maintaining the state of your jobs.

The author also walks you through, from end to end, the design and implementation of a batch process based upon a theoretical real-world example. This includes basic project setup, implementation, testing, tuning and scaling for large volumes.

What you'll learn

  • Batch concepts and how they relate to the Spring Batch framework

  • How to use declarative I/O using the Spring Batch readers/writers

  • Data integrity techniques used by Spring Batch, including transactions and job state/restartability

  • How to scale batch jobs via distributed batch processing

  • How to test batch processes (unit and functional)

Who this book is for

  • Java developers with Spring experience

  • Java architects designing batch solutions

More specifically, this book is intended for those who have a solid foundation in the core Java platform. Batch processing covers a wide spectrum of topics, not all of which are covered in detail in this book. Java concepts that the reader should be comfortable with include file I/O, JDBC, and transactions. Spring Batch is built upon the open-source IoC container Spring, which is not covered in this book, so the reader is expected to be familiar with Spring's concepts and conventions. However, the reader is not expected to have any prior exposure to the Spring Batch framework itself. All concepts related to it are explained in detail, with working examples.

Table of Contents

  1. Copyright
  2. About the Author
  3. About the Technical Reviewer
  4. Acknowledgments
  5. 1. Batch and Spring
    1. 1.1. A History of Batch Processing
    2. 1.2. Batch Challenges
    3. 1.3. Why Do Batch Processing in Java?
    4. 1.4. Other Uses for Spring Batch
    5. 1.5. The Spring Batch Framework
      1. 1.5.1. Defining Jobs with Spring
      2. 1.5.2. Managing Jobs
      3. 1.5.3. Local and Remote Parallelization
      4. 1.5.4. Standardizing I/O
      5. 1.5.5. The Spring Batch Admin Project
      6. 1.5.6. And All the Features of Spring
    6. 1.6. How This Book Works
    7. 1.7. Summary
  6. 2. Spring Batch 101
    1. 2.1. The Architecture of Batch
      1. 2.1.1. Examining Jobs and Steps
      2. 2.1.2. Job Execution
      3. 2.1.3. Parallelization
        1. 2.1.3.1. Multithreaded Steps
        2. 2.1.3.2. Parallel Steps
        3. 2.1.3.3. Remote Chunking
        4. 2.1.3.4. Partitioning
      4. 2.1.4. Batch Administration
      5. 2.1.5. Documentation
    2. 2.2. Project Setup
      1. 2.2.1. Obtaining Spring Batch
        1. 2.2.1.1. Using the SpringSource Tool Suite
        2. 2.2.1.2. Downloading the Zip Distribution
        3. 2.2.1.3. Checking Out from Git
        4. 2.2.1.4. Configuring Maven
    3. 2.3. It's the Law: Hello, World!
    4. 2.4. Running Your Job
    5. 2.5. Exploring the JobRepository
      1. 2.5.1. Job Repository Configuration
      2. 2.5.2. The Job Repository Tables
        1. 2.5.2.1. BATCH_JOB_INSTANCE
        2. 2.5.2.2. BATCH_JOB_PARAMS
        3. 2.5.2.3. BATCH_JOB_EXECUTION and BATCH_STEP_EXECUTION
        4. 2.5.2.4. Job and Step Execution Context Tables
    6. 2.6. Summary
  7. 3. Sample Job
    1. 3.1. Understanding Agile Development
      1. 3.1.1. Capturing Requirements with User Stories
      2. 3.1.2. Capturing Design with Test-Driven Development
      3. 3.1.3. Using a Source-Control System
      4. 3.1.4. Working with a True Development Environment
    2. 3.2. Understanding the Requirements of the Statement Job
    3. 3.3. Designing a Batch Job
      1. 3.3.1. Job Description
        1. 3.3.1.1. Importing Customer Transaction Data
        2. 3.3.1.2. Retrieving Stock Closing Prices
        3. 3.3.1.3. Importing Stock Prices into Database
        4. 3.3.1.4. Calculating Transaction Fee Tiers
        5. 3.3.1.5. Calculating Transaction Fees
        6. 3.3.1.6. Generating Customer Monthly Statements
      2. 3.3.2. Understanding the Data Model
    4. 3.4. Summary
  8. 4. Understanding Jobs and Steps
    1. 4.1. Introducing a Job
      1. 4.1.1. Tracing a Job's Lifecycle
    2. 4.2. Configuring a Job
      1. 4.2.1. Basic Job Configuration
      2. 4.2.2. Job Inheritance
      3. 4.2.3. Job Parameters
        1. 4.2.3.1. Validating Job Parameters
        2. 4.2.3.2. Incrementing Job Parameters
      4. 4.2.4. Working with Job Listeners
      5. 4.2.5. ExecutionContext
      6. 4.2.6. Manipulating the ExecutionContext
        1. 4.2.6.1. ExecutionContext Persistence
    3. 4.3. Working with Steps
      1. 4.3.1. Chunk vs. Item Processing
      2. 4.3.2. Step Configuration
        1. 4.3.2.1. Basic Step
      3. 4.3.3. Understanding the Other Types of Tasklets
        1. 4.3.3.1. CallableTaskletAdapter
        2. 4.3.3.2. MethodInvokingTaskletAdapter
        3. 4.3.3.3. SystemCommandTasklet
        4. 4.3.3.4. Tasklet Step
        5. 4.3.3.5. Step Inheritance
        6. 4.3.3.6. Chunk-Size Configuration
        7. 4.3.3.7. Step Listeners
      4. 4.3.4. Step Flow
        1. 4.3.4.1. Conditional Logic
        2. 4.3.4.2. Ending a Job
        3. 4.3.4.3. Externalizing Flows
        4. 4.3.4.4. Parallelization of Flows
      5. 4.3.5. Item Error Handling
        1. 4.3.5.1. Item Retry
        2. 4.3.5.2. Item Skip
    4. 4.4. Summary
  9. 5. Job Repository and Metadata
    1. 5.1. Configuring the Job Repository
      1. 5.1.1. Using an In-Memory Job Repository
      2. 5.1.2. Database
        1. 5.1.2.1. Database Schema Configuration
        2. 5.1.2.2. Transaction Configuration
    2. 5.2. Using Job Metadata
      1. 5.2.1. The JobExplorer
      2. 5.2.2. The JobOperator
    3. 5.3. Summary
  10. 6. Running a Job
    1. 6.1. Starting a Job
      1. 6.1.1. Job Execution
      2. 6.1.2. Spring Batch Job Runners
        1. 6.1.2.1. CommandLineJobRunner
        2. 6.1.2.2. JobRegistryBackgroundJobRunner
      3. 6.1.3. Third-Party Integration
        1. 6.1.3.1. Scheduling with Quartz
        2. 6.1.3.2. Running in a Container
        3. 6.1.3.3. Launching with Spring Batch Admin
    2. 6.2. Stopping a Job
      1. 6.2.1. The Natural End
      2. 6.2.2. Programmatic Ending
        1. 6.2.2.1. Using the <stop> Tag
        2. 6.2.2.2. Stopping with StepExecution
      3. 6.2.3. External Stoppage
        1. 6.2.3.1. Stopping via Spring Batch Admin
        2. 6.2.3.2. Stopping Using CommandLineJobRunner
      4. 6.2.4. Error Handling
        1. 6.2.4.1. Stopping the Job
    3. 6.3. Controlling Restart
      1. 6.3.1. Preventing a Job from Being Rerun
      2. 6.3.2. Configuring the Number of Restarts
      3. 6.3.3. Rerunning a Complete Step
    4. 6.4. Summary
  11. 7. Readers
    1. 7.1. The ItemReader Interface
    2. 7.2. File Input
      1. 7.2.1. Flat Files
        1. 7.2.1.1. Fixed-Width Files
        2. 7.2.1.2. Delimited Files
        3. 7.2.1.3. Custom Record Parsing
        4. 7.2.1.4. Multiple Record Formats
        5. 7.2.1.5. Multiline Records
        6. 7.2.1.6. Multiple Sources
      2. 7.2.2. XML
    3. 7.3. Database Input
      1. 7.3.1. JDBC
        1. 7.3.1.1. JDBC Cursor Processing
        2. 7.3.1.2. JDBC Paged Processing
      2. 7.3.2. Hibernate
        1. 7.3.2.1. Cursor Processing with Hibernate
        2. 7.3.2.2. Paged Database Access with Hibernate
      3. 7.3.3. JPA
    4. 7.4. Existing Services
    5. 7.5. Custom Input
    6. 7.6. Error Handling
      1. 7.6.1. Skipping Records
      2. 7.6.2. Logging Invalid Records
      3. 7.6.3. Dealing with No Input
    7. 7.7. Summary
  12. 8. Item Processors
    1. 8.1. Introduction to ItemProcessors
    2. 8.2. Using Spring Batch's ItemProcessors
      1. 8.2.1. ValidatingItemProcessor
        1. 8.2.1.1. Input Validation
        2. 8.2.1.2. Subclassing the ValidatingItemProcessor
      2. 8.2.2. ItemProcessorAdapter
      3. 8.2.3. CompositeItemProcessor
    3. 8.3. Writing Your Own ItemProcessor
      1. 8.3.1. Filtering Items
    4. 8.4. Summary
  13. 9. Item Writers
    1. 9.1. Introduction to ItemWriters
    2. 9.2. File-Based ItemWriters
      1. 9.2.1. FlatFileItemWriter
        1. 9.2.1.1. Formatted Text Files
        2. 9.2.1.2. Delimited Files
        3. 9.2.1.3. File Creation Options
      2. 9.2.2. StaxEventItemWriter
    3. 9.3. Database-Based ItemWriters
      1. 9.3.1. JdbcBatchItemWriter
      2. 9.3.2. HibernateItemWriter
      3. 9.3.3. JpaItemWriter
    4. 9.4. Alternative Output Destination ItemWriters
      1. 9.4.1. ItemWriterAdapter
      2. 9.4.2. PropertyExtractingDelegatingItemWriter
      3. 9.4.3. JmsItemWriter
      4. 9.4.4. SimpleMailMessageItemWriter
    5. 9.5. Multipart ItemWriters
      1. 9.5.1. MultiResourceItemWriter
        1. 9.5.1.1. Header and Footer XML Fragments
        2. 9.5.1.2. Header and Footer Records in a Flat File
      2. 9.5.2. CompositeItemWriter
      3. 9.5.3. ClassifierCompositeItemWriter
        1. 9.5.3.1. The ItemStream Interface
    6. 9.6. Summary
  14. 10. Sample Application
    1. 10.1. Reviewing the Statement Job
    2. 10.2. Setting Up a New Project
    3. 10.3. Importing Customer and Transaction Data
      1. 10.3.1. Creating the Customer Transaction Reader
      2. 10.3.2. Looking Up Ids
      3. 10.3.3. Writing the Customer and Transaction Data
    4. 10.4. Downloading Current Stock Prices
      1. 10.4.1. Reading the Tickers
      2. 10.4.2. Writing the Stock File
    5. 10.5. Importing Current Stock Prices
      1. 10.5.1. Reading the Stock Price File
      2. 10.5.2. Writing Stock Prices to the Database
    6. 10.6. Calculating Pricing Tiers
      1. 10.6.1. Reading How Many Transactions the Customer Had
      2. 10.6.2. Calculating the Pricing Tier
      3. 10.6.3. Updating the Database with the Calculated Tier
    7. 10.7. Calculating Transaction Fees
      1. 10.7.1. Reading the Transactions
        1. 10.7.1.1. Calculating Transaction Prices
      2. 10.7.2. Saving Transaction Fees to the Database
    8. 10.8. Generating Monthly Statement
      1. 10.8.1. Reading the Statement Data
      2. 10.8.2. Writing the Statements
    9. 10.9. Summary
  15. 11. Scaling and Tuning
    1. 11.1. Profiling Your Batch Process
      1. 11.1.1. A Tour of VisualVM
      2. 11.1.2. Profiling Spring Batch Applications
        1. 11.1.2.1. CPU Profiling
        2. 11.1.2.2. Memory Profiling
    2. 11.2. Scaling a Job
      1. 11.2.1. Multithreaded Steps
      2. 11.2.2. Parallel Steps
        1. 11.2.2.1. Preloading Data for Processing
        2. 11.2.2.2. Loading Orders Into the Database
        3. 11.2.2.3. Configuring the Parallel Steps
        4. 11.2.2.4. Building the Picklists
      3. 11.2.3. Remote Chunking
      4. 11.2.4. Partitioning
    3. 11.3. Summary
  16. 12. Testing Batch Processes
    1. 12.1. Unit Tests with JUnit and Mockito
      1. 12.1.1. JUnit
        1. 12.1.1.1. JUnit Lifecycle
      2. 12.1.2. Mock Objects
      3. 12.1.3. Mockito
    2. 12.2. Integration Tests with Spring Classes
      1. 12.2.1. General Integration Testing with Spring
        1. 12.2.1.1. Configuring the Testing Environment
        2. 12.2.1.2. Writing an Integration Test
      2. 12.2.2. Testing Spring Batch
        1. 12.2.2.1. Testing Step-Scoped Beans
        2. 12.2.2.2. Testing a Step
        3. 12.2.2.3. Testing a Job
    3. 12.3. Summary