zookeeper
ZOOKEEPER-3318: [CLI way] Add a complete backup mechanism for ZooKeeper internal
- The takeSnapshot API is just like sync: it only supports asynchronous calls.
For non-realtime backup: write a new tool/shell script, zkBackup.sh, which is the reverse process of zkCleanup.sh.
- This approach is neither mainstream nor elegant: it just uses transferTo to copy the snapshots in the dataDir, so I deleted the related code.
- Snapshot on a path with no permission:
[zk: 127.0.0.1:2180(CONNECTED) 3] snapshot /data/forbidden_dir
Snapshot has failed. rc=-124
2019-05-12 19:02:49,839 [myid:] - WARN [SyncThread:0:FinalRequestProcessor@462] - Unexpected exception when taking the snapshot in the directory:/data/forbidden_dir
java.io.FileNotFoundException: /data/forbidden_dir/version-2/snapshot.fa0100018955 (Permission denied)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
    at org.apache.zookeeper.server.persistence.SnapStream.getOutputStream(SnapStream.java:129)
    at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:214)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:435)
    at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:415)
    at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshotExternal(ZooKeeperServer.java:386)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:460)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:154)
- more details in ZOOKEEPER-3318
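For reference, the transferTo-based copy mentioned in the description could look roughly like the following. This is only a minimal sketch, not the code that was removed from the PR: the class name SnapshotCopy and the method copy are hypothetical, and it copies a single file with the standard java.nio FileChannel.transferTo call.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of ZooKeeper: copies one snapshot file.
class SnapshotCopy {
    /** Copies source to target with FileChannel.transferTo; returns the bytes copied. */
    static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // nothing more could be transferred; stop instead of spinning
                }
                position += moved;
            }
            return position;
        }
    }
}
```

A backup tool built this way would still have to decide which snapshot files to copy and whether to copy the matching transaction logs as well.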
Directed snapshots seem useful to me; I have reservations about snapshot taking being triggered by the ZooKeeper client. If it's just a way to freeze the FinalRequestProcessor and take a synchronous snapshot then is there a way to achieve that while using the admin server or jmx as the entry point?
Let me take a close look at PR-180. After that, I will answer all the concerns together.
If it's just a way to freeze the FinalRequestProcessor and take a synchronous snapshot?
@enixon
- A good insight. We can do the following in the FinalRequestProcessor:
case OpCode.takeSnapshot: {
    // new ZooKeeperThread("Client Snapshot Thread") {
    //     public void run() {
    //         try {
    //             zks.takeSnapshotExternal(dir);
    //         } catch (IOException e) {
    //         }
    //     }
    // }.start();
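To make the difference concrete, here is a small self-contained sketch of the two modes. It is an illustration only: SnapshotModeSketch and runSnapshot are hypothetical names, and the "snapshot" just records which thread executed it. In the synchronous mode the work runs on the caller's thread (the request-processing thread in the server), so nothing else is processed until it returns; in the asynchronous mode it runs on a separate helper thread.

```java
// Hypothetical demo class, not ZooKeeper code.
class SnapshotModeSketch {
    /** Runs a stand-in snapshot task inline or on a helper thread; returns the executing thread's name. */
    static String runSnapshot(boolean async) throws InterruptedException {
        final String[] ranOn = new String[1];
        // Stand-in for the real snapshot work: remember which thread did it.
        Runnable snapshot = () -> ranOn[0] = Thread.currentThread().getName();
        if (async) {
            Thread helper = new Thread(snapshot, "Client Snapshot Thread");
            helper.start();
            helper.join(); // joined only so this demo can read the result safely
        } else {
            snapshot.run(); // the caller is blocked until the snapshot finishes
        }
        return ranOn[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sync ran on:  " + runSnapshot(false));
        System.out.println("async ran on: " + runSnapshot(true));
    }
}
```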
- The real headache is the security issue of implementing it with the CLI:
Since this PR is targeting master, I suggest considering the option of adding a snap API to ZooKeeperAdmin, which was recently introduced to harden security around dynamic reconfiguration. ZooKeeperAdmin supports all the authentication schemes built into ZK, and we can extend it so that only an admin (or any user who has explicitly been granted admin access to the cluster) can issue the snap command.
@hanm ZooKeeperAdmin may not currently cover the authentication issue. As far as I can see, authorization for the CLI reconfig command depends on write permission on the node /zookeeper/config.
- Overall, let us do this with the admin server first.
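As a rough illustration of the reconfig-style check described above, the sketch below gates a command on the WRITE bit of a caller's permissions on a guard node. The class SnapshotAuthSketch and the method maySnapshot are hypothetical, and applying this check to a snapshot command is an assumption, not something this PR implements. The bit values mirror ZooKeeper's ZooDefs.Perms constants.

```java
// Hypothetical sketch, not ZooKeeper code.
class SnapshotAuthSketch {
    // Same bit layout as org.apache.zookeeper.ZooDefs.Perms.
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;

    /**
     * Assumed policy: allow the command only if the caller holds WRITE
     * on the guard node (for reconfig, that node is /zookeeper/config).
     */
    static boolean maySnapshot(int permsOnGuardNode) {
        return (permsOnGuardNode & WRITE) != 0;
    }

    public static void main(String[] args) {
        System.out.println(maySnapshot(READ));         // false
        System.out.println(maySnapshot(READ | WRITE)); // true
    }
}
```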
@maoling @enixon I have similar thoughts. Triggering the backup through the admin server as the entry point is a great idea. However, taking a snapshot of the in-memory data tree without also backing up the transaction logs does not seem sufficient, because the snapshot is fuzzy and may not represent the state of the data tree at any single point in time. If the transaction logs are not backed up, we may have data loss or data integrity issues in the quorum-loss case.
I wonder if there are any thoughts or discussions on providing a complete backup solution without the risk of losing data?
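The fuzzy-snapshot concern can be shown with a toy model. The sketch below is a simplification under stated assumptions: the data tree is a plain map, every transaction is an idempotent "set node to value", and all names (FuzzySnapshotDemo, Txn, takeFuzzySnapshot, replay) are hypothetical. Writes land while the snapshot is being copied, so the snapshot alone holds a state that never existed; replaying the logged transactions with a zxid greater than the one recorded at snapshot start brings it back in line with the live tree.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not ZooKeeper code.
class FuzzySnapshotDemo {
    /** Simplified idempotent transaction: set one node to a value. */
    static final class Txn {
        final long zxid;
        final String path;
        final String value;

        Txn(long zxid, String path, String value) {
            this.zxid = zxid;
            this.path = path;
            this.value = value;
        }
    }

    /** Zxid of the last transaction applied before the snapshot below starts. */
    static final long SNAPSHOT_START_ZXID = 2;

    static void apply(Map<String, String> tree, List<Txn> log, Txn txn) {
        tree.put(txn.path, txn.value);
        log.add(txn);
    }

    /** Copies the tree node by node while writes continue in between. */
    static Map<String, String> takeFuzzySnapshot(Map<String, String> live, List<Txn> log) {
        apply(live, log, new Txn(1, "/a", "1"));
        apply(live, log, new Txn(2, "/b", "1"));
        Map<String, String> snapshot = new LinkedHashMap<>();
        snapshot.put("/a", live.get("/a"));      // copies /a=1
        apply(live, log, new Txn(3, "/a", "2")); // write lands mid-snapshot
        apply(live, log, new Txn(4, "/b", "3")); // another write mid-snapshot
        snapshot.put("/b", live.get("/b"));      // copies /b=3
        return snapshot;                         // {/a=1, /b=3}: never a real state
    }

    /** Replays every logged transaction newer than the snapshot's start zxid. */
    static Map<String, String> replay(Map<String, String> snapshot, List<Txn> log, long startZxid) {
        Map<String, String> tree = new LinkedHashMap<>(snapshot);
        for (Txn txn : log) {
            if (txn.zxid > startZxid) {
                tree.put(txn.path, txn.value);
            }
        }
        return tree;
    }

    public static void main(String[] args) {
        Map<String, String> live = new LinkedHashMap<>();
        List<Txn> log = new ArrayList<>();
        Map<String, String> snapshot = takeFuzzySnapshot(live, log);
        System.out.println("fuzzy snapshot: " + snapshot);
        System.out.println("after replay:   " + replay(snapshot, log, SNAPSHOT_START_ZXID));
        System.out.println("live tree:      " + live);
    }
}
```

In this model a backup that keeps only the snapshot restores /a=1 with /b=3, which is why the transaction logs need to be backed up alongside it.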