Spring Book – Chapter 21 – Spring Batch

Spring Batch is the first Java-based, lightweight, comprehensive framework for batch processing. Because it is built on top of the Spring Framework, it offers all the advantages of a productive, POJO-based development approach.

  “Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. Spring Batch builds upon the productivity, POJO-based development approach, and general ease of use capabilities people have come to know from the Spring Framework, while making it easy for developers to access and leverage more advanced enterprise services when necessary.”
  – Spring Batch documentation (http://static.springsource.org/spring-batch/)

In this chapter, we cover Spring Batch comprehensively, with the aim of letting you use it in your enterprise application with less effort. The initial sections introduce the core concepts on which Spring Batch is built; later sections delve into each concept in detail, along with appropriate code snippets.

Batch and Offline Processing

You have probably heard the terms batch and offline processing for quite some time. Every time such a requirement comes up in your application, you sit down in front of the computer and code a fresh solution from scratch. With Spring Batch you now have a framework that lets you do this in a much better and cleaner manner. Before going through Spring Batch in detail, I would like to spend some time explaining the usual business cases that Spring Batch addresses. Primary among these are batch and offline processing.

Batch Processing

Batch applications need to process high volumes of business-critical transactional data efficiently. A typical batch program does the following:

  • Reads a large amount of data from a database, file or queue, as the case may be.
  • Processes the data efficiently, according to the business requirement.
  • Writes the modified/processed data back to a database, file or queue, as the case may be.

A batch is a group of similar or identical items, and the pseudocode for processing one is shown in Figure 23-1 below.

 Figure 23-1. Pseudocode for a typical Batch
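
As a rough illustration of the read-process-write cycle described above, here is a minimal sketch in plain Java. It is not the Spring Batch API: the class name SimpleBatchSketch and the method runBatch are hypothetical, and an in-memory list stands in for the database, file or queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; this is plain Java, not the Spring Batch API.
final class SimpleBatchSketch {

    // Reads every item from the source, processes it, and collects the
    // results; the returned list stands in for the output destination.
    static <I, O> List<O> runBatch(Iterable<I> source, Function<I, O> processor) {
        List<O> output = new ArrayList<>();
        for (I item : source) {                   // read
            O processed = processor.apply(item);  // process
            output.add(processed);                // write
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> names = List.of("alice", "bob");
        List<String> upper = runBatch(names, s -> s.toUpperCase());
        System.out.println(upper); // prints [ALICE, BOB]
    }
}
```

A loop like this covers only the happy path. It has no notion of transactions, restarts or skipped records, which is where a framework starts to pay off.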

Offline Processing

Most modern applications need the capability to process client requests offline. Offline processing differs from online/real-time processing in the following respects:

  • It handles long-running processes, which often run outside usual office hours.
  • It is non-interactive in nature and therefore needs logic capable of handling errors and taking the necessary action, such as a restart in some cases.
  • It handles processes whose data volume is too large to fit into a single transaction.

Some common examples of batch and offline processing are given below, so that you can understand where Spring Batch is used and appreciate what it delivers out of the box.

  • Large-scale output jobs which need to run in a timely manner, for example a sales report spanning a whole month or even a whole year.
  • Import/export handling of data, for example ETL (Extract-Transform-Load) jobs, data synchronization jobs etc.
  • Various close-of-business jobs, for example a sales report spanning a day, business-level reporting etc.

  “The lack of standard, reusable batch architecture has resulted in the proliferation of many one-off, in-house solutions developed within client enterprise IT functions.”
  – Spring Batch documentation (http://static.springsource.org/spring-batch/)

Why a framework?

So why would you need a framework for implementing batch jobs? Why can’t we simply use a “for loop”? A framework must address not only running a piece of code in a loop but also other features, capabilities and business scenarios, as listed below (and as detailed in the Spring Batch documentation):

  • Commit batch processes periodically: the capability to commit processed data at intervals, for various business reasons
  • Staged, enterprise message-driven processing
  • Concurrent batch processing: parallel processing of a job
  • Massively parallel batch processing
  • Sequential processing of dependent steps, with extensions to workflow-driven batches
  • Manual or scheduled restart after failure
  • Whole-batch transaction: for cases with a small batch size or existing stored procedures/scripts
  • Partial processing: skip records (e.g. on rollback)
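
To make three of these points concrete (periodic commit, skipping bad records, and restart after failure), here is a minimal sketch in plain Java. It is an illustration under stated assumptions, not Spring Batch's implementation or API: the names ChunkedBatchSketch, Result, run, startIndex and commitInterval are all hypothetical, and "commit" simply means copying a chunk into an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; plain Java, not the Spring Batch API.
final class ChunkedBatchSketch {

    // Outcome of one run: the committed output, the number of skipped
    // items, and the index from which a later run could resume.
    static final class Result<O> {
        final List<O> committed = new ArrayList<>();
        int skipped = 0;
        int nextIndex = 0;
    }

    static <I, O> Result<O> run(List<I> items, int startIndex, int commitInterval,
                                Function<I, O> processor) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be >= 1");
        }
        Result<O> result = new Result<>();
        result.nextIndex = startIndex;
        List<O> chunk = new ArrayList<>();
        for (int i = startIndex; i < items.size(); i++) {
            try {
                chunk.add(processor.apply(items.get(i)));
            } catch (RuntimeException e) {
                result.skipped++; // partial processing: skip the bad record
            }
            // The interval counts items read in this run, skipped ones included.
            boolean chunkFull = (i - startIndex + 1) % commitInterval == 0;
            boolean lastItem = i == items.size() - 1;
            if (chunkFull || lastItem) {
                result.committed.addAll(chunk); // "commit" the current chunk
                chunk.clear();
                result.nextIndex = i + 1;       // restart point after this commit
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("1", "2", "x", "4", "5");
        Result<Integer> r = run(raw, 0, 2, s -> Integer.parseInt(s));
        System.out.println(r.committed + " skipped=" + r.skipped + " next=" + r.nextIndex);
    }
}
```

Even this small sketch needs bookkeeping that a bare "for loop" lacks, and it still ignores real transactions, persistence of the restart point, and parallelism. Spring Batch offers comparable behaviour through its chunk-oriented step configuration, so that this plumbing does not have to be hand-written for every job.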

Tomcy John

Blogger & Author at javacodebook
He is an Enterprise Java Specialist holding a degree in Engineering (B-Tech) with over 10 years of experience in several industries. He's currently working as Principal Architect at Emirates Group IT since 2005. Prior to this he has worked with Oracle Corporation and Ernst & Young. His main specialization is on various web technologies and acts as chief mentor and Architect to facilitate incorporating Spring as Corporate Standard in the organization.

2 thoughts on “Spring Book – Chapter 21 – Spring Batch”

  1. Thanks for sharing the information, but what’s the difference between CursorItemReader’s setFetchSize() and PagingItemReader’s setPageSize()? Aren’t they the same?

  2. Please provide an example for StoredProcedureItemReader which returns a cursor, and show how to process the cursor in the processor. I am new to Spring Batch, and as part of my work I need to invoke an Oracle stored procedure which takes one parameter as input and returns a result set, which I need to process in a Spring Batch processor.

    Thanks
    Laxmi
