[CURATOR-339] CuratorFramework misses session expired events
I was using CuratorFramework and was unable to see some expected session expiration events.
Here's a simpler demonstration of the missed events, using CuratorZookeeperClient directly instead of CuratorFramework.
Code that works:
This code doesn't miss any events. Note that here I hold on to one particular zk instance throughout.
    import org.apache.curator.CuratorZookeeperClient;
    import org.apache.curator.retry.RetryOneTime;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class CuClient implements Watcher {
        public static void main(String[] args) throws Exception {
            CuClient zkc = new CuClient();
            zkc.connect(args[0]);
        }

        private void connect(String zkConnect) throws Exception {
            CuratorZookeeperClient czc = new CuratorZookeeperClient(
                    zkConnect, 4000, 1000, this, new RetryOneTime(1000));
            czc.start();

            // Fetch the ZooKeeper handle once and reuse it for every create.
            ZooKeeper zk = czc.getZooKeeper();
            for (int i = 0; i < 100; i++) {
                Thread.sleep(1000);
                System.out.println("creating " + i);
                try {
                    zk.create("/rollup/" + i, null, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                } catch (KeeperException ke) {
                    System.out.println(ke.getMessage());
                }
            }
            czc.close();
        }

        // Print every connection-state event delivered to this watcher.
        @Override
        public void process(WatchedEvent watchedEvent) {
            System.out.println(watchedEvent);
        }
    }
I caused a network partition after node 6 was created and healed it after the session had expired:
    WatchedEvent state:SyncConnected type:None path:null
    creating 0
    creating 1
    creating 2
    creating 3
    creating 4
    creating 5
    creating 6
    WatchedEvent state:Disconnected type:None path:null
    creating 7
    KeeperErrorCode = ConnectionLoss for /rollup/7
    creating 8
    KeeperErrorCode = ConnectionLoss for /rollup/8
    creating 9
    KeeperErrorCode = ConnectionLoss for /rollup/9
    creating 10
    KeeperErrorCode = ConnectionLoss for /rollup/10
    creating 11
    WatchedEvent state:Expired type:None path:null
    WatchedEvent state:SyncConnected type:None path:null
    KeeperErrorCode = Session expired for /rollup/11
    creating 12
    KeeperErrorCode = Session expired for /rollup/12
    creating 13
    KeeperErrorCode = Session expired for /rollup/13
    creating 14
    KeeperErrorCode = Session expired for /rollup/14
    ^C
Code that doesn't work:
This is the same code, except that instead of holding on to one ZooKeeper instance, I call getZooKeeper() again on every iteration.
    import org.apache.curator.CuratorZookeeperClient;
    import org.apache.curator.retry.RetryOneTime;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class CuClient implements Watcher {
        public static void main(String[] args) throws Exception {
            CuClient zkc = new CuClient();
            zkc.connect(args[0]);
        }

        private void connect(String zkConnect) throws Exception {
            CuratorZookeeperClient czc = new CuratorZookeeperClient(
                    zkConnect, 4000, 1000, this, new RetryOneTime(1000));
            czc.start();

            for (int i = 0; i < 100; i++) {
                Thread.sleep(1000);
                System.out.println("creating " + i);
                try {
                    // Ask the client for its current ZooKeeper handle on every iteration.
                    czc.getZooKeeper().create("/rollup/" + i, null, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                } catch (KeeperException ke) {
                    System.out.println(ke.getMessage());
                }
            }
            czc.close();
        }

        // Print every connection-state event delivered to this watcher.
        @Override
        public void process(WatchedEvent watchedEvent) {
            System.out.println(watchedEvent);
        }
    }
Again, I partitioned the network, this time after node 5, and healed it once the session had expired. Note the missing 'WatchedEvent state:Expired type:None path:null' line, and that creates 12 onwards succeed without any error.
    WatchedEvent state:SyncConnected type:None path:null
    creating 0
    creating 1
    creating 2
    creating 3
    creating 4
    creating 5
    WatchedEvent state:Disconnected type:None path:null
    creating 6
    KeeperErrorCode = ConnectionLoss for /rollup/6
    creating 7
    KeeperErrorCode = ConnectionLoss
    creating 8
    KeeperErrorCode = ConnectionLoss
    creating 9
    KeeperErrorCode = ConnectionLoss for /rollup/9
    creating 10
    KeeperErrorCode = ConnectionLoss
    creating 11
    KeeperErrorCode = ConnectionLoss
    WatchedEvent state:SyncConnected type:None path:null
    creating 12
    creating 13
    creating 14
    creating 15
    creating 16
    creating 17
    ^C
The same happens when I use CuratorFramework: I get SyncConnected events but no Expired events. Am I doing something wrong? I want to do all the operations in one session (because I am creating some ephemeral nodes, etc.), and I want a reliable way of getting notified when the session expires.
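As a possible stopgap while the missing Expired event is unexplained, the session loss could be inferred after the fact by watching for a change in the ZooKeeper session id. The sketch below is only an illustration: SessionChangeDetector is a hypothetical helper, not part of Curator or ZooKeeper, and it assumes that a session id of 0 means the handle has not established a session yet. It does not recover the Expired event itself; it only notices that the client has ended up on a different session.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not a Curator or ZooKeeper class): remembers the last
 * session id it was shown and reports when a different one appears.
 */
final class SessionChangeDetector {
    // 0 is treated as "no session seen yet" (assumption, see lead-in).
    private long currentSessionId = 0L;
    private final List<Long> replacedSessionIds = new ArrayList<>();

    /**
     * Feed in the session id of the handle currently in use.
     * Returns true only when an earlier, different session id had been seen,
     * i.e. the earlier session has been replaced.
     */
    synchronized boolean observe(long sessionId) {
        if (sessionId == 0L) {
            return false; // not connected yet, nothing to compare
        }
        boolean replaced = currentSessionId != 0L && currentSessionId != sessionId;
        if (replaced) {
            replacedSessionIds.add(currentSessionId);
        }
        currentSessionId = sessionId;
        return replaced;
    }

    /** Session ids that were seen earlier and later replaced, oldest first. */
    synchronized List<Long> replacedSessionIds() {
        return new ArrayList<>(replacedSessionIds);
    }
}
```

In the second reproducer this might be used inside the loop, for example by calling `detector.observe(czc.getZooKeeper().getSessionId())` before each create and treating a `true` result as "the old session and its ephemeral nodes are gone". Whether that check is frequent enough for a given application is a judgment call.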
Originally reported by ragarwal, imported from: CuratorFramework misses session expired events
- assignee: randgalt
- status: Open
- priority: Major
- resolution: Unresolved
- imported: 2025-01-21
[Originally related to: CURATOR-338]