XML
Two XML parsing models are in common use: DOM and SAX. A DOM parser loads the entire document into memory, which makes it a poor fit for batch processing of large files. Spring Batch uses a StAX parser instead. Like SAX, StAX is event-based and streams through the document, but it is a pull parser: the application requests the next event instead of receiving callbacks. This allows sections of the document to be parsed independently, which suits item-by-item batch processing better than SAX.
How the parsing of XML works
The XML document is first divided into fragments. These fragments are then passed to Spring OXM, which maps each one to an object, as shown in Figure 23-12 below.
Figure 23-12. How parsing of an XML works
Spring Batch uses Object/XML Mapping (OXM) to bind fragments to objects. Spring OXM is used because it provides a uniform abstraction over the most popular OXM technologies.
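The fragment idea described above can be sketched with the JDK's own StAX API. This is only an illustration, not Spring Batch's implementation: the `FragmentReaderSketch` class, the `Sample` object, the `name` child element, and the hand-written mapping (standing in for an OXM unmarshaller) are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

final class FragmentReaderSketch {

    /** Hypothetical business object; not part of the original article. */
    static final class Sample {
        final String name;

        Sample(String name) {
            this.name = name;
        }
    }

    /** Streams through the XML and maps each fragment to one Sample. */
    static List<Sample> readSamples(String xml, String fragmentRootElementName) {
        List<Sample> samples = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLEventReader events = factory.createXMLEventReader(new StringReader(xml));
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                // Each start tag with the fragment root name opens a new fragment.
                if (isStart(event, fragmentRootElementName)) {
                    samples.add(mapFragment(events, fragmentRootElementName));
                }
            }
            events.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML input", e);
        }
        return samples;
    }

    /** Hand-written stand-in for an unmarshaller: consumes one fragment. */
    private static Sample mapFragment(XMLEventReader events, String fragmentRoot)
            throws XMLStreamException {
        String name = null;
        while (events.hasNext()) {
            XMLEvent event = events.nextEvent();
            if (isStart(event, "name")) {
                name = events.getElementText();
            } else if (event.isEndElement()
                    && event.asEndElement().getName().getLocalPart().equals(fragmentRoot)) {
                break; // end of this fragment; the caller resumes the outer loop
            }
        }
        return new Sample(name);
    }

    private static boolean isStart(XMLEvent event, String localName) {
        return event.isStartElement()
                && event.asStartElement().getName().getLocalPart().equals(localName);
    }

    public static void main(String[] args) {
        String xml = "<samples><sample><name>first</name></sample>"
                + "<sample><name>second</name></sample></samples>";
        for (Sample sample : readSamples(xml, "sample")) {
            System.out.println(sample.name);
        }
    }
}
```

Only one fragment is held as an object at a time, so memory use does not grow with the size of the document, which is the property that makes this style suitable for batch input.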
StaxEventItemReader
Spring Batch provides org.springframework.batch.item.xml.StaxEventItemReader to parse an XML input file. To use this ItemReader, you define a fragment root element name, which identifies the root element of each fragment, as shown in Figure 23-9 above. The other input you must supply is an org.springframework.oxm.Unmarshaller implementation, which converts each XML fragment into the appropriate business object of your application. Spring provides unmarshaller implementations based on Castor, JAXB, JiBX, XMLBeans, and XStream in its oxm package, as shown in Figure 23-9 above. Listing 23-9 below shows how to configure a StaxEventItemReader together with an appropriate unmarshaller.
Listing 23-9. Configuring StaxEventItemReader in the Spring configuration file
…
<bean id="sampleFile" class="org.springframework.core.io.FileSystemResource" scope="step">
    …
</bean>

<bean id="customerFileReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
    <property name="fragmentRootElementName" value="sample" />
    <property name="resource" ref="sampleFile" />
    <property name="unmarshaller" ref="sampleMarshaller" />
</bean>
…
StaxEventItemWriter
StaxEventItemWriter is the Streaming API for XML (StAX) implementation that lets Spring Batch write XML fragments as each chunk is processed. It generates the XML one chunk at a time and writes it to the file only after the local transaction has been committed. This avoids having to undo file output if an error occurs while a chunk is being written.
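The write-after-commit behaviour described above can be sketched with the JDK's StAX writer. This is a simplified illustration under stated assumptions, not the StaxEventItemWriter source: the `ChunkedXmlWriterSketch` class, its `commit()` and `rollback()` methods, the in-memory `file` buffer standing in for the output file, and the `sample`/`name` element names are all hypothetical.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class ChunkedXmlWriterSketch {

    // Stands in for the output file; only committed chunks ever reach it.
    private final StringBuilder file = new StringBuilder();
    // Holds the XML of the current chunk until the transaction outcome is known.
    private final StringWriter pending = new StringWriter();
    private final String rootTagName;

    ChunkedXmlWriterSketch(String rootTagName) {
        this.rootTagName = rootTagName;
        file.append('<').append(rootTagName).append('>');
    }

    /** Renders one chunk of items as XML fragments into the pending buffer. */
    void write(List<String> names) {
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(pending);
            for (String name : names) {
                xml.writeStartElement("sample");
                xml.writeStartElement("name");
                xml.writeCharacters(name);
                xml.writeEndElement();
                xml.writeEndElement();
            }
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML chunk", e);
        }
    }

    /** Transaction committed: the buffered chunk becomes part of the file. */
    void commit() {
        file.append(pending.getBuffer());
        pending.getBuffer().setLength(0);
    }

    /** Transaction rolled back: the buffered chunk is discarded. */
    void rollback() {
        pending.getBuffer().setLength(0);
    }

    /** Closes the root element and returns the finished document. */
    String close() {
        file.append("</").append(rootTagName).append('>');
        return file.toString();
    }

    public static void main(String[] args) {
        ChunkedXmlWriterSketch writer = new ChunkedXmlWriterSketch("samples");
        writer.write(Arrays.asList("first", "second"));
        writer.commit();
        writer.write(Arrays.asList("failed"));
        writer.rollback(); // this chunk never reaches the output
        System.out.println(writer.close());
    }
}
```

Because a failed chunk never touches the output, nothing has to be removed from the file when a chunk is rolled back and retried.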
The configuration of the StaxEventItemWriter consists of the resource to write to, the root tag name of the XML, and a marshaller that converts each object into an XML fragment, as shown in Listing 23-10 below.
Listing 23-10. Configuring StaxEventItemWriter in the Spring configuration file
...
<beans:bean id="outputFile" scope="step">
    …
</beans:bean>

<beans:bean id="xmlOutputWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
    <!-- Assumed: the writer targets the outputFile bean defined above -->
    <beans:property name="resource" ref="outputFile" />
    <beans:property name="marshaller" ref="sampleMarshaller" />
    <beans:property name="rootTagName" value="samples" />
</beans:bean>

<beans:bean id="sampleMarshaller" class="org.springframework.oxm.xstream.XStreamMarshaller">
    …
</beans:bean>
…
Tomcy John
Thanks for sharing the information, but what’s the difference between CursorItemReader’s setFetchSize() and PagingItemReader’s setPageSize()? Aren’t they the same?
Please provide an example of StoredProcedureItemReader that returns a cursor, and show how to process that cursor in a processor. I am new to Spring Batch, and as part of my work I need to invoke an Oracle stored procedure that takes one parameter as input and returns a result set, which I need to process in a Spring Batch processor.
Thanks
Laxmi