cloudbreak
CB-15763 OptimisticLockException in StackUpdater caused termination flow interrupt
In some cases flows run in parallel. When that happens, two flows can update the stack status at the same time, which causes an OptimisticLockException or a StaleObjectStateException. This commit adds retry logic around reading the Stack, updating it, and saving it.
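To make the read, update, save retry concrete, here is a minimal sketch. It is not the code from this commit: `StackStore`, `StackRow`, `StaleRowException` and `updateStatusWithRetry` are hypothetical names, and an in-memory version check stands in for the JPA/Hibernate optimistic lock. The important detail is that the read happens inside the retry loop, so every attempt works on a fresh version.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-ins for illustration; none of these names come from the Cloudbreak code base.
final class OptimisticRetrySketch {

    /** Immutable snapshot of a stack row with a version column. */
    static final class StackRow {
        final long version;
        final String status;

        StackRow(long version, String status) {
            this.version = version;
            this.status = status;
        }
    }

    /** Stand-in for OptimisticLockException / StaleObjectStateException. */
    static final class StaleRowException extends RuntimeException {
        StaleRowException(String message) {
            super(message);
        }
    }

    /** In-memory store that saves only if the row is unchanged since it was read. */
    static final class StackStore {
        private final AtomicReference<StackRow> row;

        StackStore(String initialStatus) {
            row = new AtomicReference<>(new StackRow(0, initialStatus));
        }

        StackRow read() {
            return row.get();
        }

        void save(StackRow readRow, String newStatus) {
            StackRow next = new StackRow(readRow.version + 1, newStatus);
            // compareAndSet fails if another writer replaced the row after readRow was read
            if (!row.compareAndSet(readRow, next)) {
                throw new StaleRowException("row changed since version " + readRow.version);
            }
        }
    }

    /**
     * Reads, updates and saves, retrying the whole sequence on a stale row.
     * Returns the number of attempts used. beforeSave is only a test hook
     * for simulating a concurrent writer between read and save.
     */
    static int updateStatusWithRetry(StackStore store, String newStatus, int maxAttempts, Runnable beforeSave) {
        for (int attempt = 1; ; attempt++) {
            StackRow current = store.read(); // re-read on every attempt
            beforeSave.run();
            try {
                store.save(current, newStatus);
                return attempt;
            } catch (StaleRowException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the conflict
                }
            }
        }
    }
}
```

With a single concurrent writer, the first attempt fails and the second succeeds, because the second attempt re-reads the row the other writer saved.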
See detailed description in the commit message.
I think you saw this during a termination, and I think the conclusion was that it should be retried only when setting the status to terminated. With this change it's possible that the stack won't end up in terminated state if the other update comes later, isn't it? Also, since both would use the same backoff strategy, the collision could reoccur, so a random factor would be nice in that case.
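The random factor mentioned here is usually implemented as jitter on the backoff delay. A minimal sketch, assuming an exponential backoff with "full jitter"; `JitteredBackoffSketch`, `delayMillis` and the parameter names are hypothetical and not part of this PR:

```java
import java.util.Random;

// Hypothetical helper for illustration; not taken from the Cloudbreak code base.
final class JitteredBackoffSketch {

    /**
     * Returns a random delay in [0, ceiling], where
     * ceiling = min(capMillis, baseMillis * 2^(attempt - 1)) and attempt starts at 1.
     * Two colliding flows draw different delays, so they are unlikely to retry in lockstep.
     */
    static long delayMillis(int attempt, long baseMillis, long capMillis, Random random) {
        int shift = Math.min(Math.max(attempt - 1, 0), 20); // bound the shift to avoid overflow
        long ceiling = Math.min(capMillis, baseMillis << shift);
        // nextDouble() is in [0, 1), so the result is an integer in [0, ceiling]
        return (long) (random.nextDouble() * (ceiling + 1));
    }
}
```

A caller would sleep for `delayMillis(attempt, 100, 1000, random)` before each retry, for example; the concrete base and cap values are placeholders.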
Yes, this is for termination. I thought of making termination the winner over other flows, but in the end decided against it. I can modify it so that termination wins, though.
For the randomized backoff: if there are 2 parallel updates, then one will win, so only the other one will be retried, thus the next time there should be no collision. I set the retry count to 4 for the situation when there are more parallel flow steps, for example when the non-termination flow is progressing fast.
> then one will win, so only the other one will be retried,
Yes, you are right, but I think termination should be the winner. This error should not happen in any other case, and if it does, I think it would be best that we are aware of it instead of handling it with an automatic retry. At least for now.
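One way to read this proposal is: retry only when the target status is the terminated one, and let a conflict in any other update fail on the first attempt so that it stays visible. A minimal sketch under that assumption; `TerminationOnlyRetrySketch`, the `"TERMINATED"` label and the use of `IllegalStateException` as a stand-in for the optimistic lock failure are all hypothetical:

```java
import java.util.function.Supplier;

// Hypothetical policy sketch; the status label and exception type are placeholders.
final class TerminationOnlyRetrySketch {

    /**
     * Runs the update, retrying on conflict only when the target status is "TERMINATED".
     * Any other target status gets exactly one attempt, so the conflict propagates.
     */
    static <T> T run(String targetStatus, int maxAttempts, Supplier<T> update) {
        int allowed = "TERMINATED".equals(targetStatus) ? maxAttempts : 1;
        for (int attempt = 1; ; attempt++) {
            try {
                return update.get();
            } catch (IllegalStateException conflict) { // stand-in for OptimisticLockException
                if (attempt >= allowed) {
                    throw conflict;
                }
            }
        }
    }
}
```

The update passed in should include the read, so that each retry works on the latest version of the stack.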
Please check the integration test.