Scaling and Parallel Processing
In any modern application, completing a job within a stipulated time and scaling well as load grows are both important to the application's overall functioning. Spring Batch provides several options for scaling and parallel processing, summarized below:
- Single process, multi-threaded step (Multi-Threaded Step)
- Single process, steps run in parallel (Parallel Step)
- Multiple processes, remote chunking of a step (Remote Chunking)
- Single or multiple processes, partitioning of a step (Partitioning)
The following sections look at each of these options in some detail, though not exhaustively.
Multi-Threaded Step
Figure 23-24 below depicts a multi-threaded step.
Figure 23-24. Multi-Threaded Step
Simply adding a TaskExecutor to the “tasklet” element in the Spring configuration file makes a step multi-threaded, as shown in Listing 23-30 below.
Listing 23-30. Making a step multi-threaded using TaskExecutor in the Spring configuration file
…
<step id="step2">
    <tasklet task-executor="taskExecutor" throttle-limit="20" … />
</step>
…
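To make the idea concrete, here is a minimal sketch in plain Java (no Spring Batch classes) of what a multi-threaded step does conceptually: chunks of items are handed to a thread pool, and each chunk is processed and written on its own thread. The class name, the `run` method and the doubling "processor" are hypothetical and exist only for illustration; this is not the framework's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: not a Spring Batch class.
public class MultiThreadedStepSketch {

    // Splits items into chunks, processes each chunk on a pool thread,
    // and "writes" the results into a thread-safe collection.
    public static List<Integer> run(List<Integer> items, int chunkSize, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        ConcurrentLinkedQueue<Integer> written = new ConcurrentLinkedQueue<>();

        for (int start = 0; start < items.size(); start += chunkSize) {
            List<Integer> chunk =
                    items.subList(start, Math.min(start + chunkSize, items.size()));
            pool.submit(() -> {
                for (Integer item : chunk) {
                    written.add(item * 2); // stand-in "processor": double each item
                }
            });
        }

        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("chunks did not finish in time");
        }
        return new ArrayList<>(written);
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> out = run(List.of(1, 2, 3, 4, 5, 6, 7), 3, 4);
        // The order of the written items may differ from run to run.
        System.out.println(out.size() + " items written: " + out);
    }
}
```

Because chunks complete in whatever order the threads finish, the output order is not guaranteed. For the same reason, the readers and writers used in a multi-threaded step need to be thread-safe.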
Parallel Step
Figure 23-25 below depicts parallel steps.
Figure 23-25. Parallel Step
Listing 23-31 below shows how parallel steps can be achieved purely through configuration in the Spring configuration file.
Listing 23-31. Parallel Step configuration of job in Spring configuration file
…
<job id="sampleJob">
    <split id="split1" task-executor="taskExecutor" next="step4">
        <flow>
            <step id="step1" next="step2"/>
            <step id="step2"/>
        </flow>
        <flow>
            <step id="step3"/>
        </flow>
    </split>
    <step id="step4"/>
</job>

<bean id="taskExecutor" class="…"/>
…
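The ordering that this split configuration implies can be sketched in plain Java as follows. This is only an analogy built on `CompletableFuture`, not how Spring Batch executes a split internally; the class name and the `runJob` method are hypothetical, and each "step" merely records its name.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only: not a Spring Batch class.
public class ParallelStepSketch {

    // Runs the "job" and returns the order in which the steps completed.
    public static List<String> runJob() {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService taskExecutor = Executors.newFixedThreadPool(2);
        try {
            // First flow: step1 then step2, sequential within the flow.
            CompletableFuture<Void> flow1 = CompletableFuture
                    .runAsync(() -> log.add("step1"), taskExecutor)
                    .thenRun(() -> log.add("step2"));
            // Second flow: step3 alone, concurrent with the first flow.
            CompletableFuture<Void> flow2 =
                    CompletableFuture.runAsync(() -> log.add("step3"), taskExecutor);

            // The split finishes only when every flow has finished.
            CompletableFuture.allOf(flow1, flow2).join();

            // Corresponds to next="step4": runs after the whole split.
            log.add("step4");
        } finally {
            taskExecutor.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        // step3 may appear before, between, or after step1 and step2.
        System.out.println(runJob());
    }
}
```

Two things hold on every run: step2 follows step1, and step4 comes last. The position of step3 relative to the first flow is not fixed.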
Remote Chunking
In Remote Chunking, the processing of a step is split across multiple processes that communicate with each other through some middleware. It is one approach to parallelization that spreads processing across multiple JVMs. Figure 23-26 below shows Remote Chunking in action.
Figure 23-26. Remote Chunking
The reader reads chunks and sends them off for remote processing. This requires durable middleware, such as JMS, for communication between the master and the slaves. In each remote slave, the ItemProcessor is configured as a message-driven POJO that processes the message and then sends the updated items back to the master for writing.
The Master component is a single process, and the Slaves are multiple remote processes. Spring Batch has a sister project Spring Batch Admin, which provides implementations of various patterns using Spring Integration. These are implemented in a module called Spring Batch Integration.
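The message flow described above can be sketched in plain Java. This is a simplified analogy under loud assumptions: the two in-memory `BlockingQueue`s stand in for the durable middleware, the workers (the slaves in the text) are threads in one JVM rather than remote processes, and the class name, the `run` method and the upper-casing "processor" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only: queues replace real middleware such as JMS.
public class RemoteChunkingSketch {

    public static List<String> run(List<String> items, int chunkSize, int workers)
            throws InterruptedException {
        BlockingQueue<List<String>> requests = new LinkedBlockingQueue<>();
        BlockingQueue<List<String>> replies = new LinkedBlockingQueue<>();

        // Workers: take a chunk, process every item, send the result back.
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        List<String> chunk = requests.take();
                        if (chunk.isEmpty()) {
                            return; // an empty chunk is the stop signal
                        }
                        List<String> processed = new ArrayList<>();
                        for (String item : chunk) {
                            processed.add(item.toUpperCase(Locale.ROOT));
                        }
                        replies.put(processed);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }

        // Master: read the input in chunks and send each chunk out.
        int sent = 0;
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            requests.put(new ArrayList<>(items.subList(start, end)));
            sent++;
        }
        for (int i = 0; i < workers; i++) {
            requests.put(new ArrayList<>()); // one stop signal per worker
        }

        // Master: collect one reply per chunk sent and "write" the items.
        List<String> written = new ArrayList<>();
        for (int i = 0; i < sent; i++) {
            written.addAll(replies.take());
        }
        for (Thread t : threads) {
            t.join();
        }
        return written;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of("a", "b", "c", "d", "e"), 2, 2));
    }
}
```

Note that reading and writing both stay on the master; only the processing is farmed out. This is why the approach pays off mainly when processing, rather than I/O, is the bottleneck.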
Partitioning
Partitioning is another approach to parallelization that uses a master/slave configuration. Here, however, you don’t need a durable method of communication, and the master serves only as a controller for a collection of slave steps, as shown in Figure 23-27 below.
Figure 23-27. Partitioning
Listing 23-32 below shows how this approach to parallelization can be achieved purely through step configuration in the Spring configuration file.
Listing 23-32. Parallelization using partitioning of step in Spring configuration file
…
<step id="step1.master">
    <partition step="step1" partitioner="partitioner">
        <handler grid-size="10" task-executor="taskExecutor"/>
    </partition>
</step>

<bean id="partitioner">
    <property name="dataSource" ref="…"/>
    <property name="table" value="…"/>
    <property name="column" value="…"/>
</bean>

<bean id="taskExecutor"/>
…
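The listing does not show the partitioner's class, but its dataSource, table and column properties suggest a partitioner that splits the values of a numeric column into ranges, one per slave step. A minimal sketch of that range arithmetic in plain Java might look as follows. It is an assumption about what such a partitioner could do, not the actual bean behind the listing; the class name, the `partition` method and the `partitionN` keys are hypothetical, and the minimum and maximum column values are passed in directly instead of being queried from a database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not the partitioner bean from the listing.
public class RangePartitionerSketch {

    // Splits the inclusive range [min, max] into at most gridSize
    // contiguous, non-overlapping ranges of roughly equal size.
    public static Map<String, long[]> partition(long min, long max, int gridSize) {
        long targetSize = (max - min) / gridSize + 1;
        Map<String, long[]> result = new LinkedHashMap<>();
        long start = min;
        int number = 0;
        while (start <= max) {
            long end = Math.min(start + targetSize - 1, max);
            // Each entry is the slice of column values one slave step would handle.
            result.put("partition" + number, new long[] {start, end});
            start = end + 1;
            number++;
        }
        return result;
    }

    public static void main(String[] args) {
        // grid-size="10" as in the listing; ids 1..100 are made-up sample values.
        partition(1, 100, 10).forEach((name, range) ->
                System.out.println(name + ": " + range[0] + " to " + range[1]));
    }
}
```

With ids 1 to 100 and a grid size of 10, this yields ten ranges of ten ids each. The master would then start one execution of the slave step per range, each restricted to its own slice of the data.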
Benefits of Using Spring Batch
The benefits of Spring Batch can be summarized as follows:
- Spring Batch is built entirely on the Spring Framework, so you automatically get all of its benefits, such as dependency injection and bean management based on simple POJOs.
- Developers well-versed in the Spring Framework will become productive in Spring Batch very quickly.
- The technical difficulties surrounding batch-based applications are already solved, letting developers concentrate on the actual business requirements rather than on creating the batch framework itself.
- By leveraging the additional functionality of Spring Integration through the Spring Batch Integration project, you can further increase scalability with more distributed processing.
Summary
Spring Batch is a new implementation of some very old ideas. It brings one of the oldest programming models in the software industry into the mainstream through the use of the Spring Framework. Being built on the Spring Framework, it enables a batch project to enjoy the same clean architecture and lightweight programming model, supported by industry-proven patterns, operations, templates, callbacks and other idioms. Spring Batch is an exciting initiative that offers the potential of standardizing batch architectures.
In this chapter, we started off with the various concepts found in a typical batch project and explored them in some detail. We haven’t covered each concept completely, as that is outside the scope of this book. The chapter was intended to introduce you to Spring Batch as a whole and to the various things you can do with it, without going into depth on any one of them.
After reading this chapter, you should have a clear idea of Spring Batch as a whole, and I am sure you will be able to map typical business use cases in your application to the appropriate Spring Batch features.


Tomcy John


Thanks for sharing the information but what’s the difference between CursorItemReader’s setFetchSize() and PagingItemReader’s setPageSize()? isn’t it the same?
Please provide an example for StoredProcedureItem Reader – which returns cursor and how to process cursor in processor – I am new to Spring Batch as part of my work I need to invoke Oracle Stored proc which takes one param as input and returns result set which I need to process in Spring Batch Processor.
Thanks
Laxmi