Sunday, September 23, 2018

Is it better to subscribe to changes or a full data set?

Let's say that I have two services, A and B. Service A has a set of items like [ "foo", "bar" ] and is the service responsible for bringing the items into the system. Service B needs to subscribe to this data set. There are two ways it could do that:

  1. Subscribe to the full data set
  2. Subscribe to the changes

To give a concrete example, let's say the data set evolves through these states:

  • [ "foo", "bar" ]
  • [ "foo", "bar", "baz" ]
  • [ "foo", "baz" ]
  • [ "foo", "baz", "buz" ]

The messages sent to service B with each change could be exactly the states above, or they could be deltas like

  • { add: [ "foo", "bar" ], subtract: [] }
  • { add: [ "baz" ], subtract: [] }
  • { add: [], subtract: [ "bar" ] }
  • { add: [ "buz" ], subtract: [] }
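To make the relationship between the two message formats concrete, here is a minimal Python sketch that derives the delta messages from the full data sets. The function names `diff` and `to_deltas` are made up for illustration, and the sketch assumes the items are unique strings and that the data set starts out empty.

```python
def diff(previous, current):
    """Return the delta that turns `previous` into `current`."""
    prev, curr = set(previous), set(current)
    return {
        # Items present now but not before, in their current order.
        "add": [item for item in current if item not in prev],
        # Items present before but gone now, in their previous order.
        "subtract": [item for item in previous if item not in curr],
    }


def to_deltas(snapshots):
    """Convert a sequence of full data sets into a sequence of deltas."""
    deltas = []
    previous = []  # assumption: the subscriber starts with an empty set
    for current in snapshots:
        deltas.append(diff(previous, current))
        previous = current
    return deltas


snapshots = [
    ["foo", "bar"],
    ["foo", "bar", "baz"],
    ["foo", "baz"],
    ["foo", "baz", "buz"],
]

for delta in to_deltas(snapshots):
    print(delta)
```

Whichever service runs `diff`, the same comparison between the old and new data set has to happen somewhere.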

The question is which would be better. My view is that the first is better, for the following reasons:

  • Eventual consistency. In the first design, if a message gets lost or some other failure keeps B from processing a message, B's state is always corrected when it successfully processes a later message. In the second, a lot of complexity would have to be added to the system to correct these failure scenarios automatically.
  • Idempotency. Similar to the above point, if service A sends duplicate messages by accident, that's totally fine in the first design, whereas in the second design that case would require extra complexity to handle.
  • Delta is not really an optimization. Initially it sounds like an optimization to say "The rest of our system will only process the changes," but someone has to compute the changes, so why should A do it rather than B, and how does this reduce the total number of operations?
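The first two points can be illustrated with a small simulation. This is only a sketch under stated assumptions: B keeps its state as a Python set, messages arrive in the order listed, and the helpers `apply_snapshot`, `apply_delta`, `replay`, `drop`, and `redeliver_late` are hypothetical names invented for the example. With set semantics a duplicate delta that arrives immediately happens to be harmless, so the duplicate case below uses a late redelivery, which is where the delta design goes wrong.

```python
def apply_snapshot(state, message):
    # Design 1: the message is the full data set, so it replaces B's state.
    return set(message)


def apply_delta(state, message):
    # Design 2: the message only describes what changed.
    return (state | set(message["add"])) - set(message["subtract"])


def replay(apply, messages):
    """Feed messages to B one by one and return B's final state."""
    state = set()
    for message in messages:
        state = apply(state, message)
    return state


def drop(messages, index):
    """Simulate a lost message."""
    return messages[:index] + messages[index + 1:]


def redeliver_late(messages, index, position):
    """Simulate an old message being delivered again at a later position."""
    return messages[:position] + [messages[index]] + messages[position:]


snapshots = [
    ["foo", "bar"],
    ["foo", "bar", "baz"],
    ["foo", "baz"],
    ["foo", "baz", "buz"],
]
deltas = [
    {"add": ["foo", "bar"], "subtract": []},
    {"add": ["baz"], "subtract": []},
    {"add": [], "subtract": ["bar"]},
    {"add": ["buz"], "subtract": []},
]
expected = {"foo", "baz", "buz"}

# Lost message: the third message (index 2) never reaches B.
lost_snapshot = replay(apply_snapshot, drop(snapshots, 2))
lost_delta = replay(apply_delta, drop(deltas, 2))
print(lost_snapshot == expected)  # True: the last snapshot corrects B
print(lost_delta == expected)     # False: "bar" is never removed

# Duplicate: the first message is delivered again before the last one.
dup_snapshot = replay(apply_snapshot, redeliver_late(snapshots, 0, 3))
dup_delta = replay(apply_delta, redeliver_late(deltas, 0, 3))
print(dup_snapshot == expected)   # True: the last snapshot corrects B
print(dup_delta == expected)      # False: "bar" is added back
```

In the snapshot runs B is briefly out of date but converges on the next message it processes. In the delta runs B stays wrong until something extra, such as sequence numbers or a periodic full resync, is added to detect and repair the gap.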
