design-patterns: Spring Batch best practice regarding data between steps

Tuesday, 22 March 2016

Spring Batch best practice regarding data between steps

We have an old codebase which I think could be improved a lot by adopting a good batch processing framework.

I've started experimenting with Spring Batch and decided (after reading the docs, several books and forum threads about it) that the best way to get a feel for it is to actually use it, so I've redeveloped some of our existing apps with Spring Batch.

Here is the simple app I've redeveloped:

Read lines from a table and parse each line to a POJO
If a given criterion is met, exclude the item from the list
Write a file from the parsed list
Upload the file to an FTP server
Mark the elements of the parsed list as processed in the db if the FTP upload is successful
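
To make the data dependency explicit, here is a minimal framework-free sketch of that flow in plain Java. Everything in it is hypothetical and only for illustration: the class and method names, the "id;payload" row format, and the Uploader interface standing in for the FTP client. The point it shows is that the last step needs the same filtered list that the first step produced.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class ExportFlowSketch {

    /** Hypothetical POJO parsed from one table row. */
    static final class ParsedLine {
        final long id;
        final String payload;

        ParsedLine(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for the FTP upload; returns true on success. */
    interface Uploader {
        boolean upload(Path file);
    }

    /** Parse raw rows (assumed format "id;payload") and drop the excluded ones. */
    static List<ParsedLine> readAndFilter(List<String> rows, Predicate<ParsedLine> excluded) {
        List<ParsedLine> kept = new ArrayList<>();
        for (String row : rows) {
            String[] parts = row.split(";", 2);
            ParsedLine line = new ParsedLine(Long.parseLong(parts[0]), parts[1]);
            if (!excluded.test(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    /** Write one output line per kept item to a temporary file. */
    static Path writeFile(List<ParsedLine> lines) {
        try {
            Path file = Files.createTempFile("export", ".txt");
            List<String> payloads = lines.stream().map(l -> l.payload).collect(Collectors.toList());
            return Files.write(file, payloads);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Run the whole flow; returns the ids that would be marked as processed in the db. */
    static Set<Long> run(List<String> rows, Predicate<ParsedLine> excluded, Uploader uploader) {
        List<ParsedLine> kept = readAndFilter(rows, excluded);
        Path file = writeFile(kept);
        Set<Long> processed = new HashSet<>();
        // The list built at the start is needed again here, after the upload.
        if (uploader.upload(file)) {
            for (ParsedLine line : kept) {
                processed.add(line.id);
            }
        }
        return processed;
    }
}
```

In a single method the list is simply a local variable; the difficulty only appears once the flow is split into separate batch steps.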

Now, what I've done is create a single job with 3 steps:

Step1: read lines to POJOs (reader, JDBC-based in this case), exclude items (processor), write to file (writer). Easy. :)

Step2: a tasklet to upload the file to FTP

Step3: now, this is where I'm in trouble. I need to reuse the POJO list from Step1.

My understanding is that I have two options:

1) Either use the step-level ExecutionContext: save the List there and use an ExecutionContextPromotionListener to pass the list from Step1 to Step2 and from Step2 to Step3.

2) Or reuse the same reader and processor, this time with a different writer.
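
For reference, option 1 could look roughly like the sketch below. It is not a definitive implementation: it assumes the Spring Batch 3.x/4.x ItemWriter signature (write(List)), and the ParsedLine class, the "parsedLines" key and all class names are made up for illustration. It also assumes the writer is registered on Step1 so that its @BeforeStep method gets called, and that promotionListener() is added as a listener on Step1. Note that once a key is promoted to the job-level context, any later step should be able to read it, so a second promotion from Step2 should not be necessary.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.core.listener.ExecutionContextPromotionListener;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.repeat.RepeatStatus;

public class InterStepDataSketch {

    // Hypothetical key under which the list is stored.
    static final String KEY = "parsedLines";

    // Hypothetical POJO; Serializable because the context is persisted by the job repository.
    public static class ParsedLine implements Serializable {
        public long id;
        public String payload;
    }

    // Step1 writer: writes the file (omitted) and accumulates the items in the step context.
    public static class ExportWriter implements ItemWriter<ParsedLine> {
        private StepExecution stepExecution;

        @BeforeStep
        public void saveStepExecution(StepExecution stepExecution) {
            this.stepExecution = stepExecution;
        }

        @Override
        @SuppressWarnings("unchecked")
        public void write(List<? extends ParsedLine> items) throws Exception {
            // File output omitted here; write() is called once per chunk,
            // so the list has to be appended to rather than overwritten.
            ExecutionContext stepContext = stepExecution.getExecutionContext();
            ArrayList<ParsedLine> soFar = (ArrayList<ParsedLine>) stepContext.get(KEY);
            if (soFar == null) {
                soFar = new ArrayList<>();
            }
            soFar.addAll(items);
            stepContext.put(KEY, soFar);
        }
    }

    // Listener for Step1: copies KEY from the step context to the job context when the step ends.
    public static ExecutionContextPromotionListener promotionListener() {
        ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
        listener.setKeys(new String[] { KEY });
        return listener;
    }

    // Step3 tasklet: reads the promoted list back from the job context.
    public static class MarkProcessedTasklet implements Tasklet {
        @Override
        @SuppressWarnings("unchecked")
        public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
            List<ParsedLine> lines = (List<ParsedLine>) chunkContext.getStepContext()
                    .getStepExecution().getJobExecution().getExecutionContext().get(KEY);
            // The db update marking each line as processed is omitted.
            return RepeatStatus.FINISHED;
        }
    }
}
```

The whole list ends up serialized into the job repository with the execution context, which is where the size concern comes from.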

I don't really like either approach:

1) seems messy, and I've read in multiple places that it is not good practice to put anything large in this context (my list would hold 5-10000 objects).

2) seems a waste of resources and, again, bad practice. In this scenario I could get away with it, but with a more complex reader/processor it would be a pretty bad duplication of work.

What is the best way to do what I'm trying to do? Am I using Spring Batch correctly here?
