CURATOR-696. Fix double leader for LeaderLatch
A possible fix to CURATOR-696.
- Refer to https://github.com/apache/druid/issues/16411.
- Refer to #430.
I'm still considering whether this has more corner cases ...
cc @gianm @kezhuw @eolivelli
Fri, 10 May 2024 08:03:01 GMT [INFO] Running org.apache.curator.framework.recipes.leader.TestLeaderLatch
Fri, 10 May 2024 10:33:49 GMT Error: The operation was canceled.
The tests time out now. It seems we introduced new edge cases. Investigating ...
@kezhuw Thanks for your review.
a watcher for ourPath to gain ability to "perceive" external deletion
Curator's basic assumption is that we own the node (TN-7). This patch fixes a bug where we don't properly manage the node's ephemeral owner (a.k.a. session id). We may not need to handle "external deletion".
Such a watcher also doesn't fix our issue here, because it can take some time for the deletion to reach the client before it gives up leadership.
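To illustrate what I mean by managing the ephemeral owner, here is a minimal sketch. It is not the code in this patch: `EphemeralOwnerSketch`, `LatchNode`, `ownsNode` and `isLeader` are hypothetical names, and the model is plain Java without Curator. In real ZooKeeper the owner would come from `Stat.getEphemeralOwner()` and the session id from `ZooKeeper.getSessionId()`. The idea is that holding the lowest-sequence path is not enough; the node must also have been created by our current session.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model; none of these types are Curator classes.
final class EphemeralOwnerSketch {

    /** A latch participant: its path, sequence number, and the id of the session that created it. */
    static final class LatchNode {
        final String path;
        final int sequence;
        final long ephemeralOwner;

        LatchNode(String path, int sequence, long ephemeralOwner) {
            this.path = path;
            this.sequence = sequence;
            this.ephemeralOwner = ephemeralOwner;
        }
    }

    /** True only if the node was created by the given session. */
    static boolean ownsNode(LatchNode node, long currentSessionId) {
        return node.ephemeralOwner == currentSessionId;
    }

    /**
     * We lead only if the lowest-sequence node is at our path AND belongs to our
     * current session. A node left behind by an older session of ours fails the
     * second check, so we would not claim leadership through it.
     */
    static boolean isLeader(List<LatchNode> participants, String ourPath, long currentSessionId) {
        Optional<LatchNode> first =
                participants.stream().min(Comparator.comparingInt((LatchNode n) -> n.sequence));
        return first.isPresent()
                && first.get().path.equals(ourPath)
                && ownsNode(first.get(), currentSessionId);
    }
}
```

With this check, a client whose session id changed after a reconnect would not treat the stale node as its own, regardless of when any deletion event reaches it.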
@eolivelli Pushed. I verified that the new test fails without this patch and passes with it.
Reproduced with a handier test. The test pushed above can already prevent regressions.
I'll merge this patch later this week if there is no further input.
I'd still like to look for ways to reduce the test time, since the run now takes 140+ minutes to finish ...
@tisonkun do you know when the 5.7.0 release will be made available?
@razinbouzar I'll start the discussion today and hopefully vote in two to three weeks. You can keep an eye on the dev mailing list ([email protected]).
@razinbouzar Curator 5.7.0 has been released.