Tuesday, 22 March 2016

Spring Batch best practice for passing data between steps

We have an old codebase which I think could be improved a lot by adopting a good batch processing framework.

I've started experimenting with Spring Batch and decided (after reading the docs, several books and forums about it) that the best way to get a feel for it is to actually use it, so I've redeveloped some of our existing apps on Spring Batch.

Here is the simple app I've redeveloped:

  • Read lines from a table and parse each line into a POJO
  • If a given criterion is met, exclude the item from the list
  • Write a file from the parsed list
  • Upload the file to an FTP server
  • Mark the elements of the parsed list as processed in the DB if the FTP upload is successful

What I've done is create a single job with 3 steps:

Step1: read lines into POJOs (reader, JDBC-based in this case), exclude items (processor), write to a file (writer). Easy. :)

Step2: a tasklet that uploads the file to the FTP server.

Step3: now, this is where I'm in trouble. I need to reuse the POJO list from Step1.

My understanding is that I have two options:

1) either use the step's ExecutionContext: save the List there and use an ExecutionContextPromotionListener to promote it to the job's ExecutionContext, where Step3 can read it,

or

2) reuse the same reader and processor in Step3, with a different writer this time.
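For illustration, here is a minimal framework-free sketch of the mechanism behind option 1. It models the step and job contexts as plain maps. The `promote` helper is a hypothetical stand-in for what ExecutionContextPromotionListener does after a step, which is to copy the listed keys from the step context into the job context. The key name `parsedItems` and the sample items are assumptions, and none of the types below are Spring Batch classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of option 1; the maps stand in for the execution contexts.
class PromotionSketch {

    // Hypothetical stand-in for the promotion listener: copy only the listed
    // keys from the step-scoped context into the job-scoped context.
    static void promote(Map<String, Object> stepContext,
                        Map<String, Object> jobContext,
                        String... keys) {
        for (String key : keys) {
            if (stepContext.containsKey(key)) {
                jobContext.put(key, stepContext.get(key));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> jobContext = new HashMap<>();

        // Step1: the writer stores the items it wrote in its own step context,
        // and the promotion runs once the step has finished.
        Map<String, Object> step1Context = new HashMap<>();
        List<String> parsedItems = new ArrayList<>(Arrays.asList("item-1", "item-2"));
        step1Context.put("parsedItems", parsedItems);
        promote(step1Context, jobContext, "parsedItems");

        // Step2 (the FTP upload) does not need to touch or relay the list.

        // Step3: read the list back from the job-scoped context.
        @SuppressWarnings("unchecked")
        List<String> toMark = (List<String>) jobContext.get("parsedItems");
        System.out.println("Step3 would mark " + toMark.size() + " items as processed");
    }
}
```

In this model the list is promoted once, after Step1, and stays visible to every later step of the same job, so nothing has to be handed over a second time between Step2 and Step3.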

I don't really like either approach:

1) seems messy, and I've read in multiple places that it is not good practice to put anything large in this context (my list would hold 5-10000 objects).

2) seems a waste of resources and, again, bad practice. In this scenario I could get away with it, but with a more complex reader/processor it would mean duplicating a lot of work.
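To put a rough number on the size concern with option 1, the sketch below measures how many bytes a list of small POJOs takes under plain Java serialization. Spring Batch persists the execution context to the job repository, but the storage format it uses depends on the Spring Batch version and configuration, so this is only an approximation. The `ParsedLine` class and its two fields are hypothetical, not the real POJO.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

class ContextSizeSketch {

    // Hypothetical stand-in for the parsed POJO from Step1.
    static class ParsedLine implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String payload;

        ParsedLine(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Builds a list of 'count' small sample items.
    static ArrayList<ParsedLine> sampleItems(int count) {
        ArrayList<ParsedLine> items = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            items.add(new ParsedLine(i, "line-" + i));
        }
        return items;
    }

    // Returns the number of bytes produced by plain Java serialization.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        ArrayList<ParsedLine> items = sampleItems(10_000);
        System.out.println(serializedSize(items) + " bytes for " + items.size() + " items");
    }
}
```

Whatever the exact figure, a payload of this kind would be written to the repository along with the context, which is the cost the warnings about large context entries refer to.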

What is the best way to do what I'm trying to do? Am I using Spring Batch correctly here?
