I'm looking for some best practices on how to implement the following pattern using (micro) services and a service bus. In this particular case Service Fabric services and an Azure service bus instance, but from a pattern point of view that might not even be that important.
Suppose a scenario in which we get a work package with a number of files in it. For processing, we want each individual file to be processed in parallel so that we can easily scale this operation onto a number of services. Once all processing completes, we can continue our business process.
So it's a fan-out, following by a fan-in to gather all results and continue. For example sake, let's say that we have a ZIP file, unzip the file and have each file processed and once all are done we can continue.
The fan-out bit is easy. Unzip the file, for n files post n messages onto a service bus queue and have a number of services handle those in parallel. But now how do we know that these services have all completed their work?
A number of options I'm considering:
- Next to sending a service bus message for each file, we also store the files in a table of sorts, along with the name of the originating ZIP file. Once a worker is done processing, it will remove that file from the table again and check if that was the last one. When it was, we can post a message to indicate the entire ZIP was now processed.
- Similar to 1., but instead we have the worker reply that it's done and the ZIP processing service will then check if there is any work left. Little bit cleaner as the responsibility for that table is now clearly with the ZIP processing service.
- Have the ZIP processing service actively wait for all the reply messages in separate threads, but just typing this already makes my head hurt a bit.
- Introduce a specific orchestrator service which takes the n messages and takes care of the fan-out / fan-in pattern. This would still require solution 2 as well, but it's now located in a separate service so we don't have any of this logic (+ storage) in the ZIP processing service itself.
I looked into how service bus might already have a feature of some sort to support this pattern, but could not find anything suitable. Durable functions seems to support a scenario like this, but we're not using functions within this project and I'd rather not start doing so just to implement this one pattern.
We're surely not the first ones to implement such a thing, so I really was hoping I could find some real world advise as to what works and what should be avoided at all cost.
Aucun commentaire:
Enregistrer un commentaire