Spring Batch lists of lists and how to read them (Part I)

Spring Batch Lists of Lists

Sometimes a batch job has to process input where the final list of records depends on another list. For example, you might have a folder of zip files and need to process every file inside each zip, or a list of accounts in one database and need to process the customer records for those accounts in another database.

There are a number of ways to do this. Here are three ways I have found useful:

  • Create a custom reader
  • Create an intermediate list
  • Use a decider to loop back and process the next item in the high level list

This blog post covers the first two of these; the decider will be covered in another post.

Custom Reader

You could create a reader that handles both lists. The reader retrieves its account codes from another reader and then uses a Dao class to look up the customer records for each code.

[Figure: Spring Batch List of Lists 1]

Such a reader class could look like this:

[prism field=Lists_code_1 language=java]
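To make the idea concrete, here is a minimal sketch of such a reader in plain Java. It is an illustration under assumptions rather than the listing from the job itself: SimpleItemReader is a hypothetical stand-in for Spring Batch's ItemReader interface (which likewise returns null to signal the end of the input), and Customer, CustomerDao and findByAccountCode are assumed names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for Spring Batch's ItemReader: read() returns null at end of input.
interface SimpleItemReader<T> {
    T read();
}

// Assumed domain type; the fields are illustrative only.
class Customer {
    final String accountCode;
    final String name;

    Customer(String accountCode, String name) {
        this.accountCode = accountCode;
        this.name = name;
    }
}

// Assumed DAO; the method name is illustrative only.
interface CustomerDao {
    List<Customer> findByAccountCode(String accountCode);
}

// Reads account codes from a delegate reader and hands out their customers one at a time.
class AccountCustomerReader implements SimpleItemReader<Customer> {
    private final SimpleItemReader<String> accountCodeReader;
    private final CustomerDao customerDao;
    private final Deque<Customer> buffer = new ArrayDeque<>();

    AccountCustomerReader(SimpleItemReader<String> accountCodeReader, CustomerDao customerDao) {
        this.accountCodeReader = accountCodeReader;
        this.customerDao = customerDao;
    }

    @Override
    public Customer read() {
        // Refill the buffer from the next account until a customer turns up
        // or the account codes run out. Accounts with no customers are skipped.
        while (buffer.isEmpty()) {
            String accountCode = accountCodeReader.read();
            if (accountCode == null) {
                return null; // no more accounts, so no more customers
            }
            buffer.addAll(customerDao.findByAccountCode(accountCode));
        }
        return buffer.poll();
    }
}
```

The buffer is what flattens the list of lists: the step only ever sees individual Customer items, whichever account they came from.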

If you’re using XML to configure the job, then you will need to define a reader bean like this:

[prism field=Lists_code_2 language=java]

This example reader class extends the AbstractItemCountingItemStreamItemReader class. That base class includes logic to store, in the step execution context, how many records have been read. This means you can stop and restart the job, and processing will pick up from the last processed record.

This is fine if the data you are reading is not volatile. However, if the job is stopped and the data might have changed between the step first starting and later restarting, the saved count may no longer point at the same record, so records could be skipped or processed twice. If it is acceptable to process all the records again rather than pick up from somewhere in the middle of the list, an alternative is to create a reader that implements the ItemReader interface directly instead of extending the AbstractItemCountingItemStreamItemReader class.
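The effect of volatile data on a count-based restart can be shown with a small plain-Java model. This is a simplified sketch, not Spring Batch code: CountingListReader, the Map standing in for the step execution context, and the READ_COUNT key are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a count-based restartable reader. The Map is a stand-in
// for the step execution context; READ_COUNT is a hypothetical key name.
class CountingListReader {
    static final String READ_COUNT = "read.count";

    private final List<String> source;
    private int position;

    CountingListReader(List<String> source) {
        this.source = source;
    }

    // On (re)start, skip the number of items recorded by the previous run.
    void open(Map<String, Integer> executionContext) {
        position = executionContext.getOrDefault(READ_COUNT, 0);
    }

    // Returns the next item, or null at the end of the list.
    String read() {
        return position < source.size() ? source.get(position++) : null;
    }

    // Record how many items have been read so far.
    void update(Map<String, Integer> executionContext) {
        executionContext.put(READ_COUNT, position);
    }
}

class RestartDemo {
    // Reads two items, "stops", then restarts against restartData
    // and returns the first item read after the restart.
    static String nextItemAfterRestart(List<String> firstRunData, List<String> restartData) {
        Map<String, Integer> context = new HashMap<>();
        CountingListReader firstRun = new CountingListReader(firstRunData);
        firstRun.open(context);
        firstRun.read();
        firstRun.read();
        firstRun.update(context); // the job stops here with a saved count of 2

        CountingListReader restart = new CountingListReader(restartData);
        restart.open(context);
        return restart.read();
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("cust-1", "cust-2", "cust-3", "cust-4");
        // Stable data: the restart resumes at the third record.
        System.out.println(nextItemAfterRestart(original, original)); // prints cust-3

        // Volatile data: cust-1 was deleted before the restart, so cust-3 is skipped.
        List<String> changed = Arrays.asList("cust-2", "cust-3", "cust-4");
        System.out.println(nextItemAfterRestart(original, changed)); // prints cust-4
    }
}
```

In the second call the saved count of 2 is applied to a list that has shifted by one, so cust-3 is never processed. That is the kind of problem to weigh up before relying on a count-based restart.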

Step to create an intermediate list

Another way would be to have one step to convert the initial list into the final list of items to process. Then a second step would read the full list and process the items.

Let’s take the previous example of processing the customer records for a list of account codes.

[Figure: Spring Batch Lists Diagram 2]

In this scenario, the job steps would be defined like this:

[prism field=Lists_code_3 language=java]

In the first step, the reader supplies the account codes to a simple processor that calls the Dao method to retrieve the customers for each account code. Something like this:

[prism field=Lists_code_4 language=java]
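As a rough sketch of that processor in plain Java: SimpleItemProcessor is a hypothetical stand-in for Spring Batch's ItemProcessor interface, and Customer, CustomerDao and findByAccountCode are assumed names, repeated here so the example stands alone. Returning null for an account with no customers is my own design choice for the sketch; in Spring Batch a processor that returns null filters the item out, so nothing reaches the writer for that account.

```java
import java.util.List;

// Hypothetical stand-in for Spring Batch's ItemProcessor.
interface SimpleItemProcessor<I, O> {
    O process(I item);
}

// Assumed domain type; the fields are illustrative only.
class Customer {
    final String accountCode;
    final String name;

    Customer(String accountCode, String name) {
        this.accountCode = accountCode;
        this.name = name;
    }
}

// Assumed DAO; the method name is illustrative only.
interface CustomerDao {
    List<Customer> findByAccountCode(String accountCode);
}

// Turns one account code into the list of customers for that account.
class AccountCustomersProcessor implements SimpleItemProcessor<String, List<Customer>> {
    private final CustomerDao customerDao;

    AccountCustomersProcessor(CustomerDao customerDao) {
        this.customerDao = customerDao;
    }

    @Override
    public List<Customer> process(String accountCode) {
        List<Customer> customers = customerDao.findByAccountCode(accountCode);
        // Assumption: an account with no customers is dropped rather than
        // passed on to the writer as an empty list.
        return customers.isEmpty() ? null : customers;
    }
}
```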

The tempRecordsWriter would store each list in some form of temporary storage, such as a CSV file or temporary database table. As the output of the processor is a List, the input to the writer will be a List of Lists.

I have found it useful to utilise a writer that takes a list of lists and passes each of the nested lists to another writer to perform the actual writing:

[prism field=Lists_code_5 language=java]
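A writer of this kind might be sketched as follows. SimpleItemWriter is a hypothetical stand-in that mirrors the List-based write signature of Spring Batch 4 and earlier, and ListUnpackingWriter is an assumed name; the real writer would implement Spring Batch's ItemWriter instead.

```java
import java.util.List;

// Hypothetical stand-in for Spring Batch's ItemWriter (List-based signature).
interface SimpleItemWriter<T> {
    void write(List<? extends T> items);
}

// Receives a chunk in which every item is itself a list,
// and passes each nested list to the delegate that does the actual writing.
class ListUnpackingWriter<T> implements SimpleItemWriter<List<T>> {
    private final SimpleItemWriter<T> delegate;

    ListUnpackingWriter(SimpleItemWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(List<? extends List<T>> lists) {
        for (List<T> list : lists) {
            delegate.write(list); // one delegate call per nested list
        }
    }
}
```

Because the delegate only ever sees a flat list of items, any ordinary item writer can be plugged in without knowing that the processor produced lists.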

If you’re using XML to configure the job, the writer definition would look like this:

[prism field=Lists_code_6 language=java]

The tempCustomerWriter bean could be a FlatFileItemWriter, or another of your choosing.

The tempRecordsReader would be a simple reader to retrieve the customer records for use in the subsequent processing.
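The two steps can be pictured end to end with a small plain-Java model that uses a temporary file as the intermediate list. This is only a sketch under assumptions: IntermediateListJob and its methods are hypothetical names, customers are reduced to strings, and a Function stands in for the Dao lookup. It is not how Spring Batch itself wires steps together.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class IntermediateListJob {
    // Step 1: expand each account code into its customers and write them,
    // one "accountCode,customer" line each, to the temporary file.
    static void expandToTempFile(List<String> accountCodes,
                                 Function<String, List<String>> customerLookup,
                                 Path tempFile) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String code : accountCodes) {
            for (String customer : customerLookup.apply(code)) {
                lines.add(code + "," + customer);
            }
        }
        Files.write(tempFile, lines, StandardCharsets.UTF_8);
    }

    // Step 2: read the stored list back; no further lookups are needed.
    static List<String> readTempFile(Path tempFile) throws IOException {
        return Files.readAllLines(tempFile, StandardCharsets.UTF_8);
    }

    // Runs both steps in order and returns the records step 2 would process.
    static List<String> run(List<String> accountCodes,
                            Function<String, List<String>> customerLookup) {
        try {
            Path tempFile = Files.createTempFile("customers", ".csv");
            try {
                expandToTempFile(accountCodes, customerLookup, tempFile);
                return readTempFile(tempFile);
            } finally {
                Files.deleteIfExists(tempFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once step 1 has finished, step 2 depends only on the contents of the file, so later changes in the source data do not alter what it reads.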

This option is ideal if the list is volatile and the job may have to be stopped and restarted, because the second step works from the stored list rather than querying the source data again.

Conclusion

Of these two ways to process lists of lists, the appropriate choice for me depends on whether the job needs to be restartable and how volatile the data is.

I generally go for the intermediate list unless there is a reason not to. Using this method with a flat file as the storage medium also results in a record of what was actually processed in the job, which is useful for audit purposes.

In the next post I’ll describe how to process lists of lists using a decider to control the flow of processing.

For a description of the Spring Batch framework please take a look here.

By Jeremy Yearron
30 October 2017
Java, The Good Systems Blog
