Essentially what I'm trying to design is "batch processing" using queues.
I have two Spring projects. The first acts as a message Producer and does the following:
- Take a CSV file and, for each row in the CSV file:
A. construct a custom JSON payload and publish the message asynchronously to a queue called "inbound"
B. the number of messages could be in the thousands, potentially tens of thousands, published very quickly by a fast CSV parser
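The per-row publish step can be sketched roughly as below. This is a minimal illustration, not the actual implementation: the `Row` record and the JSON field names are hypothetical, and in a real Spring AMQP setup the send would typically go through `RabbitTemplate.convertAndSend` rather than `System.out`.

```java
import java.util.UUID;

// Sketch of building one message per CSV row. The Row fields and the JSON
// shape are hypothetical stand-ins for whatever the CSV actually contains.
public class RowPublisher {

    // Hypothetical CSV row: column values already split by the CSV parser.
    public record Row(String id, String value) {}

    // Build the JSON payload for one row, tagging it with a correlation ID
    // so the eventual reply can be matched back to its source CSV.
    public static String toPayload(Row row, UUID correlationId) {
        return String.format(
            "{\"correlationId\":\"%s\",\"id\":\"%s\",\"value\":\"%s\"}",
            correlationId, row.id(), row.value());
    }

    public static void main(String[] args) {
        UUID cid = UUID.randomUUID();
        // In Spring AMQP this would be something like:
        // rabbitTemplate.convertAndSend(queueName, toPayload(row, cid));
        System.out.println(toPayload(new Row("42", "hello"), cid));
    }
}
```

Building the payload as a pure function of the row plus a caller-supplied correlation ID keeps the publish loop trivially parallelizable, which matters when tens of thousands of rows are parsed quickly.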
The second Spring project can be considered the "processor" and will be scaled up as Kubernetes pods based on the queue size.
- This pool of scaled-up Consumers will read from "inbound", process the payload, and publish the output to a queue called "outbound"
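The consumer side can be sketched as a single transformation function. In Spring AMQP this logic would typically live in a `@RabbitListener(queues = "inbound")` method whose return value is sent on to "outbound"; the `process` body here is a hypothetical placeholder for the real work:

```java
// Sketch of the stateless consumer: it neither knows nor cares where a
// message came from, it only transforms the payload and returns a response.
public class PayloadProcessor {

    // Hypothetical processing step; the real implementation would do the
    // actual business logic on the JSON payload.
    public static String process(String payload) {
        return payload.toUpperCase(); // placeholder transformation
    }

    public static void main(String[] args) {
        System.out.println(process("row-data"));
    }
}
```

Keeping the processor stateless like this is what makes queue-depth-based autoscaling in Kubernetes straightforward: any pod can take any message.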
The aforementioned Producer will also be listening on "outbound" for a response, and will then take that response data and store it in a database.
The Consumers in this case don't really care where a message comes from; they just process the payload and send a response to the queue. The Producers, however, do care about the response: they need to store it in a specific database table depending on which CSV is being processed.
How can the Producers, once receiving the reply, know which database to insert the response data into?
I'm thinking that when the Producer sends a message, it can set the correlation ID to a UUID that can be cross-referenced with the name of the database the reply should be written to.
Then, when the Producer receives a reply, it can use the correlation ID, along with the reply message data, to open a JDBC connection and insert the data.
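That cross-reference can be a simple concurrent map kept by the Producer, sketched below. The table name and method names are hypothetical; note this in-memory version only works if the same Producer instance that published the message also receives the reply, so a shared store (e.g. a database or Redis) would be needed if replies can land on a different instance:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of reply routing: before publishing, the Producer records which
// table a correlation ID belongs to; when the reply arrives, it looks the
// table up and removes the entry.
public class ReplyRouter {

    private final Map<UUID, String> correlationToTable = new ConcurrentHashMap<>();

    // Called when a row is published: remember where its reply must be stored.
    public void register(UUID correlationId, String tableName) {
        correlationToTable.put(correlationId, tableName);
    }

    // Called when a reply arrives: resolve the target table and clean up,
    // so the map does not grow without bound across large CSV batches.
    public String resolveAndRemove(UUID correlationId) {
        return correlationToTable.remove(correlationId);
    }

    public static void main(String[] args) {
        ReplyRouter router = new ReplyRouter();
        UUID cid = UUID.randomUUID();
        router.register(cid, "csv_import_results"); // hypothetical table name
        System.out.println(router.resolveAndRemove(cid)); // csv_import_results
        System.out.println(router.resolveAndRemove(cid)); // null (already consumed)
    }
}
```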
Below is a high-level diagram of what I'm trying to achieve.