design-patterns: How to prevent a timing issue in an event-based system?

Monday, 11 March 2019

I have an API (createOrUpdateItem) that is called to create or update a record in a NoSQL database for an item (item A). Assume that one call takes 10 units of time and that the record is written to the database at t=9. Assume also that item A does not exist yet, so the call creates a new record. While building the record, the API makes several external API calls to determine the final status of the record, which is either ONLINE or OFFLINE. For example:

Scenario 1:

t=0: createOrUpdateItem is called and processing starts
t=1: some computation
... 
... 
t=7: external API call is made to determine the status (ONLINE)
t=8: record creation starts
t=9: record created.

Now add the fact that the system also reacts to events by listening to a queue. When an event arrives for an item, the system queries the table, and if a matching record is found it calls createOrUpdateItem to update the status. For example, at t=15 we receive a message for item A. Since a record for A exists, createOrUpdateItem is called and the status is updated to OFFLINE.

Now consider the case where createOrUpdateItem is called (event 1, which will create the item with ONLINE status) and a message for item A is received while event 1 is still being processed:

Scenario 2:

t=0: createOrUpdateItem is called and processing starts
t=1: some computation
... 
... 
t=7: external API call is made to determine the status (status=ONLINE)
t=8: record creation starts
t=8.1: message is received
t=8.2: database is queried and there is no record, so createOrUpdateItem is not called again
t=9: record created with ONLINE status

As you can see, the record is created with ONLINE status instead of OFFLINE, because the message arrived before the record existed and was therefore ignored.
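To make the race easy to reproduce, scenario 2 can be replayed in order in a few lines. This is only a sketch: the dict stands in for the NoSQL table, and `handle_message` and `replay_scenario_2` are hypothetical names, not part of the real system.

```python
def handle_message(records, item_id, new_status):
    """Event path: update the status only if a matching record already exists."""
    if item_id not in records:
        return False  # t=8.2: no record found, so the message is dropped
    records[item_id] = new_status
    return True


def replay_scenario_2():
    """Replay the timeline step by step; returns (message_applied, final_status)."""
    records = {}                                        # stand-in for the table
    status = "ONLINE"                                   # t=7: external API says ONLINE
    applied = handle_message(records, "A", "OFFLINE")   # t=8.1-8.2: message arrives early
    records["A"] = status                               # t=9: record written, now stale
    return applied, records["A"]


print(replay_scenario_2())  # prints (False, 'ONLINE')
```

The message is lost because the existence check and the record write are not coordinated in any way.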

My questions are:

1. How can this timing issue be prevented? (Changing the current architecture of the system is not feasible.)
2. How can the issue be fixed if it does happen? One idea is a daemon that periodically (say every 10 minutes) looks at each record and calls the external API to check whether the status has changed; it can then modify the record directly. Another possibility is to use a second table as a lock. At t=0, when processing starts, the first step is to create a record in the lock table. If scenario 2 happens, the message handler queries the lock table, and if a corresponding lock exists, it waits before processing the message.
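The lock-table idea could look roughly like the sketch below. It is a single-process illustration under several assumptions: the class `ItemStore` and its methods are hypothetical, in-memory collections stand in for the record table and the lock table, and "waiting" is modelled by parking the message and replaying it once the record is written. In a real system the lock write would presumably need to be atomic, and the lock would need an expiry in case the creating process crashes.

```python
class ItemStore:
    """In-memory stand-in for the record table plus a lock table (both hypothetical)."""

    def __init__(self):
        self.records = {}    # item_id -> status
        self.locks = set()   # item ids whose creation is still in flight
        self.deferred = []   # (item_id, status) messages parked behind a lock

    def begin_create(self, item_id):
        # t=0: write the lock row before any other work starts.
        self.locks.add(item_id)

    def finish_create(self, item_id, status):
        # t=9: write the record, release the lock, then replay parked messages.
        self.records[item_id] = status
        self.locks.discard(item_id)
        parked = [m for m in self.deferred if m[0] == item_id]
        self.deferred = [m for m in self.deferred if m[0] != item_id]
        for _, new_status in parked:
            self.handle_message(item_id, new_status)

    def handle_message(self, item_id, new_status):
        if item_id in self.locks:
            # Creation in progress: park the message instead of dropping it.
            self.deferred.append((item_id, new_status))
            return "deferred"
        if item_id not in self.records:
            return "ignored"  # original behaviour: no record, nothing to update
        self.records[item_id] = new_status
        return "applied"


store = ItemStore()
store.begin_create("A")                       # t=0
print(store.handle_message("A", "OFFLINE"))   # t=8.1: prints deferred
store.finish_create("A", "ONLINE")            # t=9
print(store.records["A"])                     # prints OFFLINE
```

With the lock in place, the message from scenario 2 is applied after the record is created, so the final status is OFFLINE.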

Any thoughts/comments will be greatly appreciated. Thanks
