curator icon indicating copy to clipboard operation
curator copied to clipboard

[CURATOR-478] LeaderLatch accumulates additional watcher handlers

Open jira-importer opened this issue 7 years ago • 5 comments

In the event of a connection reconnect, LeaderLatch calls reset():

https://github.com/apache/curator/blob/9a03ea93937af047e8ad13c2e3e3559520abfb0a/curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderLatch.java#L613

Ultimately, this results in another call to getChildren(), which calls checkLeadership(), which registers another getData watch for the ephemeral leader record preceding our new leader record. However, the watch in place from before reset() is in place, and will trigger yet another watch in the event that the record it is watching gets deleted.

As such, the number of pending watchers (at least client side) will continue to increase each time the connection fails over.

Marked as trivial because I think it's unlikely these accumulate to the point that it's an issue, but it seems like it should at least be called out.


Originally reported by timcharper, imported from: LeaderLatch accumulates additional watcher handlers
  • assignee: randgalt
  • status: Open
  • priority: Trivial
  • resolution: Unresolved
  • imported: 2025-01-21

jira-importer avatar Sep 05 '18 19:09 jira-importer

randgalt:

I took a quick look and you're probably correct. A simple solution is to allocate the watcher as a field of the class. ZooKeeper guarantees that if you use the same watcher for the same path is only registered once.

jira-importer avatar Sep 07 '18 17:09 jira-importer

timcharper:

Okay, seems straightforward I'll take a stab at this

jira-importer avatar Sep 12 '18 03:09 jira-importer

timcharper:

Hmm, seems like we could also run into an issue with BackgroundCallback.

What would you think if we used an atomicInteger and incremented it each time, then cancelled if the atomic int value didn't match the value at the time we registered the callback?

jira-importer avatar Sep 12 '18 03:09 jira-importer

timcharper:

I'm honestly not sure how I would add a unit test for this. It would be really hard to reliably create the conditions leading to this issue.

jira-importer avatar Sep 12 '18 04:09 jira-importer