sematic
Detect and recover from run read-modify-write race conditions
There are a few places that do read-modify-write with runs:
- resolver jobs
- the server (for updating runs from external jobs)
- the worker (for storing outputs/exceptions)
- eventually the UI (adding tags and stuff)
Suppose you have two entities, E1 and E2, whose operations on the same run interleave as follows: E1.read, E2.read, E1.modify, E2.modify, E1.write, E2.write.
In this case, the modification from E1 will be silently overwritten by the modification from E2, because E2's write is based on the state it read before E1 wrote. There are ways to detect this situation so the second writer can re-read the run and retry its modification. We should do that to prevent weird bugs!
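One common way to detect this is optimistic concurrency control with a version counter. Below is a minimal sketch of that idea, not the actual Sematic code: `Run`, `RunStore`, `StaleWriteError`, and `update_run` are all hypothetical names, and the in-memory dict stands in for the database. Each write is accepted only if the run's version still matches the version that was read; otherwise the writer re-reads and retries.

```python
import dataclasses
import threading
from typing import Callable, Dict, Tuple


class StaleWriteError(Exception):
    """Raised when a write is based on an outdated read (hypothetical)."""


@dataclasses.dataclass(frozen=True)
class Run:
    # Hypothetical, simplified stand-in for a run record.
    id: str
    tags: Tuple[str, ...] = ()
    version: int = 0


class RunStore:
    """In-memory stand-in for the runs table (assumption, for illustration)."""

    def __init__(self) -> None:
        self._runs: Dict[str, Run] = {}
        self._lock = threading.Lock()

    def create(self, run: Run) -> None:
        with self._lock:
            self._runs[run.id] = run

    def read(self, run_id: str) -> Run:
        with self._lock:
            return self._runs[run_id]

    def write(self, run: Run) -> Run:
        # Compare-and-swap: accept the write only if nobody else has
        # written since this caller read the run.
        with self._lock:
            current = self._runs[run.id]
            if current.version != run.version:
                raise StaleWriteError(
                    f"run {run.id}: read version {run.version}, "
                    f"stored version {current.version}"
                )
            saved = dataclasses.replace(run, version=run.version + 1)
            self._runs[run.id] = saved
            return saved


def update_run(
    store: RunStore,
    run_id: str,
    modify: Callable[[Run], Run],
    max_attempts: int = 5,
) -> Run:
    """Read-modify-write that re-reads and retries on a detected conflict."""
    for _ in range(max_attempts):
        run = store.read(run_id)
        try:
            return store.write(modify(run))
        except StaleWriteError:
            continue  # Someone else wrote first; start over from a fresh read.
    raise StaleWriteError(f"run {run_id}: gave up after {max_attempts} attempts")
```

With a SQL backend, the same check is usually expressed as an `UPDATE ... WHERE id = ? AND version = ?` statement, treating zero affected rows as a conflict. Whether that fits the existing run schema is an open question for this issue.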