Thursday, June 17, 2021

How to keep order when consuming async messages (such as SQS or any other messaging service)

I've run into this problem a few times and now wonder what the industry best practice is. The context: we have a data store that aggregates pieces of information taken from multiple microservices. The data reaches us through messages that each source broadcasts whenever something changes.

The problem is how to guarantee that our data will be eventually consistent and that updates are applied in the order they were meant to be received. For example, let's say we have an entity User:

User {
   display_name: String,
   email: String,
   bio: String
}

We listen for changes to those users to keep "display_name" up to date in our data store. The messages come in a format such as:

{
     event: "UserCreated",
     id: 1000,
     display_name: "MyNewUser"
}

{
     event: "UserChanged",
     id: 1000,
     display_name: "MyNewUser2"
}

There is a scenario where "UserChanged" reaches our listeners before "UserCreated". Our code then can't find the user with id 1000, so both transactions fail. This is where a mechanism to put the two events in order is needed. We have considered:

  • Timestamps: although we know when we last read an update, we don't know how many events happened between the last event we saw and the one we are currently processing.
  • Sequence numbers: this is slightly better, but if a sequence number is lost we will never update our storage unless we relax the rules a little. For example, if a sequence number hasn't been seen after some time, we could proceed with the rest of the operations.
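
To make the sequence-number option concrete, here is a minimal Python sketch of a per-entity buffer that holds out-of-order events and gives up on a missing number after a timeout. It rests on several assumptions that are not part of our actual messages: a hypothetical "seq" field that each producer increments per entity starting at 1, a hypothetical apply callback that writes to the data store, and an arbitrary 30-second gap timeout. It is an illustration of the idea, not a recommended implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Resequencer:
    """Applies events for ONE entity in "seq" order, skipping a gap after a timeout."""

    apply: Callable[[dict], None]  # hypothetical handler that updates the data store
    gap_timeout: float = 30.0      # assumed: seconds to wait for a missing seq
    next_seq: int = 1              # assumed: producers number events 1, 2, 3, ...
    pending: Dict[int, dict] = field(default_factory=dict)
    gap_since: Optional[float] = None  # when we started waiting for next_seq

    def receive(self, event: dict, now: float) -> None:
        seq = event["seq"]  # "seq" is an assumed field, not in the original messages
        if seq < self.next_seq:
            return  # duplicate, or a late event whose gap was already skipped
        self.pending[seq] = event
        self._drain(now)

    def tick(self, now: float) -> None:
        """Call periodically so a gap can time out even if no new event arrives."""
        self._drain(now)

    def _drain(self, now: float) -> None:
        while self.pending:
            if self.next_seq in self.pending:
                # The expected event is here: apply it and move on.
                self.apply(self.pending.pop(self.next_seq))
                self.next_seq += 1
                self.gap_since = None
            elif self.gap_since is None:
                self.gap_since = now  # first time we notice the gap: start waiting
                return
            elif now - self.gap_since >= self.gap_timeout:
                # Relax the rule: give up on the missing number and jump ahead.
                self.next_seq = min(self.pending)
                self.gap_since = None
            else:
                return  # still within the waiting window


# Usage: "UserChanged" (seq 2) arrives before "UserCreated" (seq 1).
applied_events = []
user_1000 = Resequencer(apply=applied_events.append)
user_1000.receive({"seq": 2, "event": "UserChanged", "id": 1000,
                   "display_name": "MyNewUser2"}, now=0.0)
user_1000.receive({"seq": 1, "event": "UserCreated", "id": 1000,
                   "display_name": "MyNewUser"}, now=1.0)
```

In this sketch one Resequencer would be kept per user id, and the clock is passed in explicitly so the timeout behaviour is easy to test. The trade-off is the one described above: once a gap is skipped, a late event with the missing number is dropped rather than applied out of order.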

If anyone knows common design patterns that tackle this sort of issue, it would be great to hear about them. I'm also open to suggestions on data modeling, etc. Bottom line: I'm pretty sure this is a common software problem that has been solved many times before.

Thanks a lot for the help!
