CB-15763 Optimisticlockexception in StackUpdater caused termination flow interrupt

Open gergopapi2 opened this issue 3 years ago • 3 comments

In some cases flows run in parallel. When they do, two of them can update the stack status at the same time, which causes an OptimisticLockException or a StaleObjectStateException. This commit adds retry logic around reading the Stack, updating it, and saving it.

See detailed description in the commit message.
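The read, update, save retry described above can be sketched roughly as follows. This is only an illustration, not the code in this commit: `VersionConflictException` and `OptimisticRetry` are hypothetical names standing in for the persistence-layer exception and for whatever helper StackUpdater actually uses, and the status strings in the usage note are made up.

```java
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for OptimisticLockException / StaleObjectStateException.
class VersionConflictException extends RuntimeException {
    VersionConflictException(String message) {
        super(message);
    }
}

// Hypothetical helper, not Cloudbreak's actual implementation.
final class OptimisticRetry {
    private OptimisticRetry() {
    }

    // Re-reads the entity on every attempt so the update is applied to the latest version.
    static <T> T readUpdateSave(Supplier<T> read, UnaryOperator<T> update,
            UnaryOperator<T> save, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        VersionConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                T fresh = read.get();                    // read the current state
                return save.apply(update.apply(fresh));  // update it and try to save
            } catch (VersionConflictException e) {
                last = e;                                // another flow won, try again
            }
        }
        throw last; // every attempt collided, surface the last conflict
    }
}
```

A caller would pass the repository read, the status change, and the repository save as the three functions, for example `OptimisticRetry.readUpdateSave(() -> load(id), s -> s.withStatus("TERMINATED"), repo::save, 4)`, where `load`, `withStatus`, and `repo` are again placeholders.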

gergopapi2 • Feb 10 '22 20:02

I think you have seen this during a termination, and I think the conclusion was that it should be retried only when setting the status to terminated. With this change it's possible that the stack won't end up in terminated state if the other update comes later, isn't it? Also, as both would use the same backoff strategy, the collision could reoccur, so a random factor would be nice in that case.

Yes, this is for termination. I thought of making termination the winner over other flows, but in the end rejected it. I can modify it so that termination wins, though.

For the randomized backoff: if there are 2 parallel updates, then one will win, so only the other one will be retried, thus the next time there should be no collision. I set the retry count to 4 for the situation where there are more parallel flow steps, for example when the non-termination flow is progressing fast.
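For the case with more than two parallel flow steps, the random factor suggested above could look something like the sketch below. It is not part of this change: `JitteredBackoff` and its parameters are hypothetical, and it uses a plain "exponential ceiling, uniformly random delay below it" scheme as one possible choice.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, shown only to illustrate a randomized backoff.
final class JitteredBackoff {
    private JitteredBackoff() {
    }

    // Returns a random delay in [1, ceiling], where ceiling doubles with each
    // attempt (starting from baseMillis) and never exceeds maxMillis.
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 1 || baseMillis < 1 || maxMillis < 1) {
            throw new IllegalArgumentException("attempt, baseMillis and maxMillis must be positive");
        }
        long ceiling = Math.min(baseMillis, maxMillis);
        for (int i = 1; i < attempt && ceiling < maxMillis; i++) {
            // Double the ceiling without overflowing and without passing maxMillis.
            ceiling = ceiling > maxMillis / 2 ? maxMillis : ceiling * 2;
        }
        // Two colliding retries are unlikely to pick the same delay.
        return ThreadLocalRandom.current().nextLong(ceiling) + 1;
    }
}
```

With `baseMillis = 100` and `maxMillis = 1000`, the ceilings for attempts 1 to 4 are 100, 200, 400 and 800 ms, and each retry sleeps for a random time up to that ceiling.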

gergopapi2 • Feb 16 '22 10:02

then one will win, so only the other one will be retried,

Yes, you are right, but I think in case of termination it should be the winner. No other case of this error should happen, and if one does, I think it would be best that we are aware of it instead of handling it with an automatic retry. At least for now.
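To make the proposal concrete, a rough sketch of "retry only when setting terminated" might look like this. All names here (`StaleStackException`, `TerminationWinsPolicy`, the `"TERMINATED"` string) are placeholders chosen for the example, not identifiers from the codebase.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the optimistic locking exception.
class StaleStackException extends RuntimeException {
    StaleStackException(String message) {
        super(message);
    }
}

// Hypothetical policy: only a termination update is retried on conflict.
final class TerminationWinsPolicy {
    static final String TERMINATED = "TERMINATED"; // illustrative status name

    private TerminationWinsPolicy() {
    }

    // readUpdateSave is expected to re-read the stack, set the status and save it.
    static <T> T apply(String targetStatus, Supplier<T> readUpdateSave, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        // Any other status gets a single attempt, so the conflict stays visible.
        int attempts = TERMINATED.equals(targetStatus) ? maxAttempts : 1;
        StaleStackException last = null;
        for (int attempt = 1; attempt <= attempts; attempt++) {
            try {
                return readUpdateSave.get();
            } catch (StaleStackException e) {
                last = e;
            }
        }
        throw last;
    }
}
```

This way a conflicting non-termination update still fails loudly on the first collision, while the termination update keeps retrying until it is the last writer or runs out of attempts.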

lacikaaa • Feb 16 '22 10:02

please check the integration test

doktoric • Mar 29 '22 05:03