ZOOKEEPER-4475: Fix NodeChildrenChanged delivered to recursive watcher
Cause:
- Events are delivered to recursive watcher unconditionally.
- Client could receive
NodeChildrenChangeddue to child watcher on recursive watching node's descendants.
Changes: Filter out NodeChildrenChanged events for recursive watcher in client side.
Ping @eolivelli @Randgalt for review.
Another ping @eolivelli @Randgalt @symat @maoling for review.
@eolivelli @Randgalt @symat @maoling @arshadmohammad @anmolnar Anyone have time capacity to review this pr ? I think @Randgalt has confirmed this as bug in ZOOKEEPER-4466.
Some other concerns:
- The description in this patch is different from JIRA's content:
This requires server to never deliver NodeChildrenChanged for node's descendants.
In this patch we filter event on the client side, which has nothing to do with server side logics.
- Do you have a real world workload needs this patch? I'm glad to see a concrete use case for PersistentRecursiveWatch and how the behavior changed here affects the application logic.
@tisonkun Thank you for reviews. I have pushed fixup commit to resolve your comments. Regarding your concerns, I try to reply them here:
- The next sentence after the quoted one is "It can not be true.". From server's perspective, it is valid to attach standard child watches on descendants of node which is being watched in persistent recursive mode. So, basically, server is innocent to this. It has to be solved in client side. I explains this in code comments in fixup commit also.
- No. I observed this in writing tests for zookeeper-client-rust. In the first place, I observed quirk behaviors in situations where standard watches and persistent watches are attached on overlapping paths(ZOOKEEPER-4466). After discussion with @Randgalt in ZOOKEEPER-4466, I split out ZOOKEEPER-4475 as a bug report and ZOOKEEPER-4466 remains more like a feature request.
@tisonkun would you have time to test if Curator still works well with this change ? like running all the tests of Curator
I'm glad to see a concrete use case for PersistentRecursiveWatch @tisonkun for instance Apache Pulsar relies heavily on PersistentRecursiveWatch since version 2.8.0 you can take a look to Pulsar code
@eolivelli I'll try to do a local test this week. Ping me if I miss it :)
@eolivelli since Curator doesn't support ZK 3.7+, I'm trying to pick this commit to 3.6.4-SNAPSHOT and run tests. However, I get a compile error when build with 3.6.4:
[ERROR] Bundle org.apache.zookeeper:zookeeper-jute:jar:3.6.4-SNAPSHOT : The default package '.' is not permitted by the Import-Package syntax.
This can be caused by compile errors in Eclipse because Eclipse creates
valid class files regardless of compile errors.
The following package(s) import from the default package [org.apache.zookeeper.data, org.apache.zookeeper.proto, org.apache.zookeeper.server.persistence, org.apache.zookeeper.server.quorum, org.apache.zookeeper.txn]
I have no idea whether this failure comes from.
Hi @eolivelli, I have done a test based on #1859(which is a superset of this pr) and apache/curator#426 on branch-3.7. Follow is the verify progress copied from my shell. From my observation, there are flakes, but not failures.
PS: How the community share test report ? (I have deleted the test report from this reply as it is too long. You could see it from initial edit.)
:warning: 52 God Classes were detected by Lift in this project. Visit the Lift web console for more details.
cc @eolivelli @Randgalt @symat @maoling @arshadmohammad @anmolnar for reviews.
cc @eolivelli @Randgalt @symat @maoling @arshadmohammad @anmolnar @ztzg for reviews.
Pending to merge if no more objection this week.
@tisonkun in ZooKeeper we do not allow to merge this kind of changes if there are not at least 2 sponsors.
Unfortunately currently there are not many active reviewers, but please always wait for a second approval before merging non trivial patches
@eolivelli OK.
@eolivelli I agree that it changes the event delivered and am OK to deliver it on 3.9.0, while it can be regarded as a bug fix since in 3.6.0 release notes we write:
Note that NodeChildrenChanged events are not triggered for persistent recursive watches as it would be redundant.