bees
Using bees in combination with btrfs send/receive -p <parent>
Is it safe to use bees in combination with btrfs send/receive -p <parent>?
I know that I must not run bees and btrfs send/receive at the same time.
The documentation says: "You must not specify clone sources unless you guarantee that these snapshots are exactly in the same state on both sides--both for the sender and the receiver."
Suppose I use the following procedure:
- I have a filesystem FS1 and on it a subvolume with data named /FS1/DATA. I also have a filesystem FS2 available as /FS2.
- I take a snapshot: btrfs subvolume snapshot -r /FS1/DATA /FS1/DATAs1
- I send it to FS2: btrfs send /FS1/DATAs1 | btrfs receive /FS2/
- Now: a) I let bees run on FS1, b) or I let bees run on FS2, c) or I let bees run on both FS1 and FS2. In the meantime, I continue to use /FS1/DATA.
- After a while, I stop bees and take the next snapshot: btrfs subvolume snapshot -r /FS1/DATA /FS1/DATAs2
Can I now safely transfer the new snapshot to FS2 with the command: btrfs send -p /FS1/DATAs1 /FS1/DATAs2 | btrfs receive /FS2/ ?
I know that I must not run bees and btrfs send/receive at the same time.
Why not? bees v0.11-rc3 will pause automatically while btrfs send is running. btrfs send may fail with EAGAIN if an active dedupe operation occurs at exactly the same time, but you can simply retry the send in that case.
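A minimal sketch of "simply retry the send", assuming a POSIX shell. The retry function and the send_once wrapper are hypothetical helpers, not part of bees or btrfs-progs, and the stream path is a placeholder. The sketch retries on any failure, not only EAGAIN, and writes the stream to a file first so that a failed attempt does not leave a partially received subvolume on /FS2.

```shell
# Hypothetical helper: run a command, retrying up to MAX times
# with DELAY seconds between attempts.
# Usage: retry MAX DELAY command [args...]
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while :; do
        # Run the command; stop as soon as it succeeds.
        "$@" && return 0
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            echo "giving up after $retry_attempt attempts" >&2
            return 1
        fi
        echo "attempt $retry_attempt failed; retrying in ${retry_delay}s" >&2
        retry_attempt=$((retry_attempt + 1))
        sleep "$retry_delay"
    done
}

# Example use with the paths from the question (not run here;
# /path/to/stream is a placeholder for a file with enough free space):
#   send_once() {
#       rm -f /path/to/stream
#       btrfs send -p /FS1/DATAs1 -f /path/to/stream /FS1/DATAs2
#   }
#   retry 5 10 send_once && btrfs receive -f /path/to/stream /FS2/
```

If staging the stream in a file is not practical, the same function can wrap the send/receive pipeline, but then any partially received /FS2/DATAs2 would have to be deleted before each new attempt.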
Can I now safely transfer the new snapshot [...] to /FS2
This depends on what you mean by "safely."
- If you mean "will the data on /FS2 match the data from /FS1?": Yes. btrfs send/receive should produce an identical copy of the subvolume’s contents. Files and metadata should match the source snapshot.
- If you mean "will the storage space used on /FS2 be the same as on /FS1?" or "will deduplication effects be preserved or transferred?" or "will network transfers become unexpectedly larger?": Not necessarily. The physical space or network traffic required may be more or less than expected, especially if deduplication has been used on the sending filesystem. Since the receiver does not interact with the sender, deduplication on the receiving filesystem cannot affect the behavior of the sending filesystem during transfer.
btrfs send does not preserve the exact block-level topology: some reflinks (shared blocks) may be replaced with full copies, and vice versa. This is true whether or not deduplication (with bees or anything else) is in use, but deduplication can increase the variation in space usage and network traffic. Storage on /FS2 may differ from /FS1, even if file contents are identical.
If you run bees on both filesystems, the amount of space required should be minimized on both sides; however, due to quirks in btrfs’s storage and block allocation, you should not expect the same storage usage on both filesystems, even after deduplication.
If by "safe" you mean "no practical risk of data corruption," verify the received data after transfer. For a full check--including detecting extra or missing files--compare checksums after sorting:
(cd /FS1/DATAs2 && find . -type f -exec sha256sum {} + | sort) > /tmp/sums1
(cd /FS2/DATAs2 && find . -type f -exec sha256sum {} + | sort) > /tmp/sums2
diff -u /tmp/sums1 /tmp/sums2
You can use a direct pipe for a quicker check that doesn't require temporary files (note this will not detect extra files on /FS2):
(cd /FS1/DATAs2 && find . -type f -exec sha256sum {} +) | (cd /FS2/DATAs2 && sha256sum --quiet -c -)
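For repeated use, the full manifest comparison could be wrapped in a small function. This is only a sketch: compare_trees is a hypothetical name, it assumes sha256sum, mktemp and diff are available, and it checks regular file contents only (not ownership, permissions, xattrs, symlinks or empty directories).

```shell
# Hypothetical helper: compare two directory trees by SHA-256 manifest.
# Returns 0 if both trees hold the same regular files with the same
# contents, 1 if they differ, 2 on a setup error.
compare_trees() {
    # Refuse to compare if either argument is not a directory; otherwise
    # two empty manifests would look like a match.
    [ -d "$1" ] && [ -d "$2" ] || return 2
    ct_a=$(mktemp) || return 2
    ct_b=$(mktemp) || { rm -f "$ct_a"; return 2; }
    # Sort by file name (field 2) so the manifests line up in diff.
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2) > "$ct_a"
    (cd "$2" && find . -type f -exec sha256sum {} + | sort -k 2) > "$ct_b"
    diff -u "$ct_a" "$ct_b"
    ct_status=$?
    rm -f "$ct_a" "$ct_b"
    return "$ct_status"
}

# Example use with the snapshots from the question (not run here):
#   compare_trees /FS1/DATAs2 /FS2/DATAs2 && echo "contents match"
```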