magda
magda copied to clipboard
Should remove racing condition in hooks service when hook is in async mode
Describe the bug
When a hook (e.g. a minion) is operating in async mode, the hooks service might process messages in wrong order, causing the hook's iswaitingforresponse
always tested to be true
therefore stopping the minion from processing records.
The expected message flow should be
- When minion interested records are created or updated,
WebHookProcessor
will send those records to the minion and wait for minion's response. - If the minion is operating in async mode, the
WebHookProcessor
will receivedeferResponse: true
response from the minion. When processing this response, it will setiswaitingforresponse
totrue
. - When the minion finishes processing the received records, it will send
ACK
request to the hooks service. Then the hooks service will immediately setiswaitingforresponse
tofalse
and start sending next batch of records to the minion if there are any.
However, the implementation of the above message processing has a pitfall: iswaitingforresponse
might be set in a reversed order because the deferResponse: true
response is processed via a message queue while ACK
is not. If WebHookProcessor
does not process deferResponse: true
fast enough, the ACK
message might be processed before it, causing iswaitingforresponse
to be incorrectly set to true
forever.
To Reproduce Steps to reproduce the behavior:
- Create a minion and set it to async mode.
- Simulate slow processing (10 seconds): Before https://github.com/magda-io/magda/blob/7a55fe884a77842210c96e848f97bc9001a19975/magda-registry-api/src/main/scala/au/csiro/data61/magda/registry/WebHookProcessor.scala#L259, insert
Thread.sleep(10000)
. - Create some new records that will be automatically sent to the minion, make sure a batch of records can be processed by the minion in less than 10 seconds.
- The registry log will report that the minion is busy. The hooks table will show that
iswaitingforresponse
istrue
and never change. The rest of records will not be processed.
Expected behavior All records should be processed eventually.
Technical Note
A possible solution is to process ACK
message in the same message queue as deferResponse: true
, which guarantees deferResponse: true
will be processed before ACK
. Therefore iswaitingforresponse
can be set correctly.