
LeaderSelector mutex resurrection

Open wrmsr opened this issue 10 years ago • 3 comments

I am not alone in experiencing situations in which my LeaderSelectors wind up indefinitely without a leader (relevant JIRA issues are listed below). This problem had been occurring nearly daily in our AWS environments, which regularly experience transient network issues. I believe I have solved this issue. This may not be the most elegant approach but, at the very least, our deployments have behaved correctly since its activation.

This branch allows an InterProcessMutex to optionally reuse an existing acquisition. This of course breaks the re-entrance contract stated by the InterProcessLock interface, but it is not enabled by default and is used only by the LeaderSelector (which is the only thing I am interested in using it for). I have a test that reliably (though hackily) reproduces this issue, but it is written in terms of an internal project, and as I am unfamiliar with your test code I haven't ported it yet. All existing tests pass. The term "resurrect" probably isn't the best but hey, it works :p
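To make the idea of reusing an existing acquisition concrete, here is a minimal in-memory sketch. It is not the code from this branch and does not use Curator's API: `ResurrectSketch`, `LockPath`, `tryAcquire`, and the `reuseExisting` flag are all hypothetical names, and the model assumes the failure mode is a leftover lock node from the same participant sitting at the head of the queue.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, no ZooKeeper or Curator involved.
public class ResurrectSketch {
    /** One lock node: a sequence number plus the id of the participant that created it. */
    static final class Node {
        final int seq;
        final String owner;
        Node(int seq, String owner) { this.seq = seq; this.owner = owner; }
    }

    /** In-memory stand-in for the children of a lock path, kept in creation order. */
    static final class LockPath {
        private final List<Node> nodes = new ArrayList<>();
        private int nextSeq = 0;

        Node create(String owner) {
            Node n = new Node(nextSeq++, owner);
            nodes.add(n);
            return n;
        }

        /** Returns the oldest node created by this owner, or null if there is none. */
        Node findOwnedBy(String owner) {
            for (Node n : nodes) {
                if (n.owner.equals(owner)) return n;
            }
            return null;
        }

        /** The lock is held by whichever node is first in the queue. */
        boolean isHead(Node n) {
            return !nodes.isEmpty() && nodes.get(0) == n;
        }
    }

    /**
     * Returns true if the caller holds the lock after this attempt.
     * With reuseExisting=false a fresh node is always created (assumed default behaviour).
     * With reuseExisting=true a node already owned by the caller is adopted instead.
     */
    static boolean tryAcquire(LockPath path, String owner, boolean reuseExisting) {
        Node mine = reuseExisting ? path.findOwnedBy(owner) : null;
        if (mine == null) {
            mine = path.create(owner);
        }
        return path.isHead(mine);
    }

    public static void main(String[] args) {
        LockPath strict = new LockPath();
        strict.create("client-A"); // leftover node the client no longer tracks
        System.out.println(tryAcquire(strict, "client-A", false)); // prints "false"

        LockPath reuse = new LockPath();
        reuse.create("client-A"); // same leftover node
        System.out.println(tryAcquire(reuse, "client-A", true)); // prints "true"
    }
}
```

In the strict case the new node queues behind the participant's own leftover node, so nobody ever becomes leader; in the reuse case the leftover node is adopted and the participant leads immediately. A real implementation would also have to decide how a node is attributed to a participant, which this sketch glosses over with a plain string id.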

Issues possibly fixed by this branch:
https://issues.apache.org/jira/browse/CURATOR-3
https://issues.apache.org/jira/browse/CURATOR-171
https://issues.apache.org/jira/browse/CURATOR-188
https://issues.apache.org/jira/browse/CURATOR-202
https://issues.apache.org/jira/browse/CURATOR-205

wrmsr, Apr 28 '15 00:04

Any update on this? It seems like it might fix a lot of really critical bugs?

jolynch, May 29 '15 22:05

We believe that Curator 2.8.0 fixes the problems with Leader Selector. Have you tested with it?

Randgalt, May 29 '15 22:05

Have you tried Curator 2.8.0?

Randgalt, Jun 20 '15 14:06

Closed as stale. There have been many changes since this patch was first made. Feel free to resubmit it if it's still relevant.

tisonkun, Sep 06 '22 13:09