ZOOKEEPER-2789 Reassign `ZXID` for solving 32bit overflow problem
The original PR: https://github.com/apache/zookeeper/pull/262. That PR does not support rolling upgrades, so this PR adds that capability. The goal is to change the counter bit length from 32 to 40 and the epoch bit length from 32 to 24, applied through a rolling upgrade.
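For readers new to the layout, here is a minimal sketch of how a 64-bit zxid could be split with a 40-bit counter in the low bits and the epoch in the remaining high bits. The class, constants, and method names are hypothetical and for illustration only; they are not taken from this patch or from ZooKeeper's own utilities.

```java
// Illustrative sketch only: hypothetical names, not the code in this PR.
final class ZxidLayoutSketch {
    // Assumed layout: low 40 bits hold the counter, remaining high bits hold the epoch.
    static final int COUNTER_BITS = 40;
    static final int EPOCH_BITS = Long.SIZE - COUNTER_BITS;
    static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    // Pack an epoch and a counter into a single 64-bit zxid.
    static long makeZxid(long epoch, long counter) {
        return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK);
    }

    // Extract the epoch; the unsigned shift keeps the result non-negative
    // even when the top bit of the zxid is set.
    static long epochOf(long zxid) {
        return zxid >>> COUNTER_BITS;
    }

    // Extract the counter from the low bits.
    static long counterOf(long zxid) {
        return zxid & COUNTER_MASK;
    }

    public static void main(String[] args) {
        long zxid = makeZxid(3L, 123_456_789_012L);
        System.out.println("epoch bits = " + EPOCH_BITS);
        System.out.println("epoch      = " + epochOf(zxid));
        System.out.println("counter    = " + counterOf(zxid));
    }
}
```

The example counter value is larger than $2^{32}$, so it would not fit in the old 32-bit counter field but round-trips under the assumed 40-bit layout.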
@kezhuw @eolivelli @li4wang @ztzg
I've just come across this one and started to review for 3.9.3, but need more eyeballs. Could you please review?
This PR will change the server-side data format. I don't think it fits a patch release.
If it is 1k/s ops, then the ZXID will be exhausted after $2^{32} / (86400 \times 1000) \approx 49.7$ days. https://github.com/apache/zookeeper/pull/262#issue-230567070
Thinking about some abnormal situations, maybe a 24-bit epoch and a 40-bit counter is a better choice: $\text{Math.min}(2^{24} / (24 \times 365),\ 2^{40} / (86400 \times 1000 \times 365)) \approx \text{Math.min}(1915.2,\ 34.9) = 34.9$ years. https://github.com/apache/zookeeper/pull/262#issuecomment-303276951
So I offered a better solution in my second comment: a 24-bit epoch. Even if a leader election happens once every hour, the epoch will not overflow for 1915.2 years. https://github.com/apache/zookeeper/pull/262#issuecomment-351886573
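The figures quoted above can be reproduced with a short calculation. The sketch below is only a sanity check of that arithmetic under the same assumptions stated in the quotes (1k ops per second, one leader election per hour, 365-day years); the class and method names are made up for illustration.

```java
import java.util.Locale;

// Illustrative arithmetic check; names are hypothetical.
final class ZxidRolloverEstimate {
    static final double SECONDS_PER_DAY = 86_400.0;

    // Days until a counter of the given width is exhausted at a steady write rate.
    static double counterDays(int counterBits, double opsPerSecond) {
        return Math.pow(2, counterBits) / (SECONDS_PER_DAY * opsPerSecond);
    }

    // Same estimate expressed in 365-day years.
    static double counterYears(int counterBits, double opsPerSecond) {
        return counterDays(counterBits, opsPerSecond) / 365.0;
    }

    // Years until an epoch of the given width is exhausted at a steady election rate.
    static double epochYears(int epochBits, double electionsPerHour) {
        return Math.pow(2, epochBits) / (24.0 * electionsPerHour * 365.0);
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT,
            "32-bit counter at 1k ops/s: %.1f days", counterDays(32, 1000)));
        System.out.println(String.format(Locale.ROOT,
            "40-bit counter at 1k ops/s: %.1f years", counterYears(40, 1000)));
        System.out.println(String.format(Locale.ROOT,
            "24-bit epoch at 1 election/hour: %.1f years", epochYears(24, 1)));
    }
}
```

The smaller of the two limits for the 40/24 split is the counter, which is where the 34.9-year figure comes from.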
Given the above, I think it is promising. It extends the rollover interval from 49.7 days to 34.9 years, assuming 1k/s ops. Best of all, it demands no protocol change, at the price of a zxid format change.
Before finalizing this path, I'd like to try out whether leadership inheritance is feasible.
@zichen-gan You need to close / re-open PR or force push to trigger another CI run.
Given the above, I think it is promising. It extends the rollover interval from 49.7 days to 34.9 years, assuming 1k/s ops. Best of all, it demands no protocol change, at the price of a zxid format change. Before finalizing this path, I'd like to try out whether leadership inheritance is feasible.
Sure, I'll wait for your review. The strange thing is that, as you outlined, it doesn't require a protocol change, but the patch still has to increase the protocol version.
as you outlined, it doesn't require a protocol change, but the patch still has to increase the protocol version.
My fault! By "no protocol change", I mean we don't need to prove its correctness in ZAB.
Hi~ @anmolnar @kezhuw I would like to verify this feature in our production environment. After running `mvn clean package -DskipTests` on version 3.4.14, in which directory of ZooKeeper is the complete installation package located?
on version 3.4.14, in which directory of ZooKeeper is the complete installation package located
It is `zookeeper-assembly/target/apache-zookeeper-3.10.0-SNAPSHOT-bin.tar.gz` in master.
I would like to verify this feature in our production environment.
Please back up your data. This implementation is a one-way ticket: it changes the data storage format and probably cannot be downgraded. I presented an alternative, #2208 (ZOOKEEPER-4883), before.