[CURATOR-478] LeaderLatch accumulates additional watcher handlers
When the connection is re-established after a disconnect, LeaderLatch calls reset():
Ultimately, this results in another call to getChildren(), which calls checkLeadership(), which registers another getData() watch on the ephemeral leader node that precedes our new latch node. However, the watch registered before reset() remains in place, and if the node it was watching is deleted it will fire and cause yet another watch to be set.
As such, the number of pending watchers (at least on the client side) will continue to increase each time the connection fails over.
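To make the mechanism concrete, here is a minimal Java sketch of the pattern described above (class and method names are hypothetical; this is not the actual LeaderLatch source). Because a brand-new anonymous Watcher is allocated on every pass, ZooKeeper's same-watcher/same-path de-duplication never applies, and each reset() cycle adds one more client-side registration:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.api.BackgroundCallback;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

// Illustrative sketch only -- not the real LeaderLatch code.
class WatchAccumulationSketch
{
    private final CuratorFramework client;
    private final BackgroundCallback callback;

    WatchAccumulationSketch(CuratorFramework client, BackgroundCallback callback)
    {
        this.client = client;
        this.callback = callback;
    }

    // Called again on every reset() -> getChildren() -> checkLeadership() pass
    void watchPredecessor(String predecessorPath) throws Exception
    {
        Watcher watcher = new Watcher()   // new instance each call => new registration
        {
            @Override
            public void process(WatchedEvent event)
            {
                // re-check leadership when the predecessor node is deleted
            }
        };
        client.getData().usingWatcher(watcher).inBackground(callback).forPath(predecessorPath);
    }
}
```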
Marked as trivial because I think it's unlikely these will accumulate to the point that they become a real problem, but it seems like it should at least be called out.
Originally reported by timcharper, imported from: LeaderLatch accumulates additional watcher handlers
- assignee: randgalt
- status: Open
- priority: Trivial
- resolution: Unresolved
- imported: 2025-01-21
I took a quick look and you're probably correct. A simple solution is to allocate the watcher as a field of the class. ZooKeeper guarantees that the same watcher used for the same path is only registered once.
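A rough sketch of that suggestion, assuming the watcher is held as a single long-lived field and reused for every registration (names are illustrative, not an actual patch). Re-registering the same Watcher object on the same path does not add a second entry, so repeated reset() cycles no longer grow the watcher table:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

// Illustrative sketch of the proposed fix.
class SingleWatcherSketch
{
    private final CuratorFramework client;

    // one Watcher instance for the lifetime of the latch
    private final Watcher watcher = new Watcher()
    {
        @Override
        public void process(WatchedEvent event)
        {
            // re-check leadership on NodeDeleted, etc.
        }
    };

    SingleWatcherSketch(CuratorFramework client)
    {
        this.client = client;
    }

    void watchPredecessor(String predecessorPath) throws Exception
    {
        // same watcher + same path => only one registration, however often this runs
        client.getData().usingWatcher(watcher).inBackground().forPath(predecessorPath);
    }
}
```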
Hmm, seems like we could also run into an issue with BackgroundCallback.
What would you think if we used an AtomicInteger and incremented it each time, then cancelled if its value didn't match the value at the time we registered the callback?
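A rough sketch of that idea, assuming a per-latch AtomicInteger that is bumped on every reset() and checked inside the BackgroundCallback (class, field, and method names are illustrative only):

```java
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.api.BackgroundCallback;
import org.apache.curator.framework.api.CuratorEvent;

// Illustrative sketch of the generation-counter idea.
class StaleCallbackGuardSketch
{
    private final CuratorFramework client;
    private final AtomicInteger resetCount = new AtomicInteger(0);

    StaleCallbackGuardSketch(CuratorFramework client)
    {
        this.client = client;
    }

    void reset()
    {
        resetCount.incrementAndGet();
        // ... delete/recreate the latch node, call getChildren(), etc. ...
    }

    void getChildrenInBackground(String latchPath) throws Exception
    {
        final int localResetCount = resetCount.get();   // value at registration time
        BackgroundCallback callback = new BackgroundCallback()
        {
            @Override
            public void processResult(CuratorFramework client, CuratorEvent event)
            {
                if ( localResetCount != resetCount.get() )
                {
                    return;     // a newer reset() superseded this request; ignore it
                }
                // ... otherwise process the children and check leadership ...
            }
        };
        client.getChildren().inBackground(callback).forPath(latchPath);
    }
}
```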
I'm honestly not sure how I would add a unit test for this. It would be really hard to reliably create the conditions leading to this issue.