Using bees in combination with btrfs send/receive -p <parent>

Open • art-latal opened this issue 6 months ago • 1 comment

Is it safe to use bees in combination with btrfs send/receive -p <parent>?

I know that I must not run bees and btrfs send/receive at the same time.

The documentation says: "You must not specify clone sources unless you guarantee that these snapshots are exactly in the same state on both sides--both for the sender and the receiver."

Suppose I use the following procedure:

  1. I have a filesystem FS1 and on it a subvolume with data named /FS1/DATA. I also have a filesystem FS2 available as /FS2.

  2. I take a snapshot: btrfs subvolume snapshot -r /FS1/DATA /FS1/DATAs1

  3. I send it to FS2: btrfs send /FS1/DATAs1 | btrfs receive /FS2/

  4. Now either a) I let bees run on FS1, b) I let bees run on FS2, or c) I let bees run on both FS1 and FS2. In the meantime, I continue to use /FS1/DATA.

  5. After a while, I stop bees and take the next snapshot: btrfs subvolume snapshot -r /FS1/DATA /FS1/DATAs2

Can I now safely transfer the new snapshot to FS2 with the command: btrfs send -p /FS1/DATAs1 /FS1/DATAs2 | btrfs receive /FS2/ ?

art-latal, May 27 '25 11:05

> I know that I must not run bees and btrfs send/receive at the same time.

Why not? bees v0.11-rc3 will pause automatically while btrfs send is running. btrfs send may fail with EAGAIN if an active dedupe operation occurs at exactly the same time, but you can simply retry the send in that case.
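
For the "simply retry" part, a minimal sketch of a retry wrapper in plain sh might look like the following. The function name retry and its two numeric arguments are made up for illustration; they are not part of bees or btrfs-progs. The sketch also assumes that any non-zero exit status is worth retrying, since it does not try to tell EAGAIN apart from other failures.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# Assumption: every non-zero exit status is treated as retryable.
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while true; do
        retry_status=0
        "$@" || retry_status=$?
        if [ "$retry_status" -eq 0 ]; then
            return 0
        fi
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            echo "retry: giving up after $retry_attempt attempts (exit status $retry_status)" >&2
            return "$retry_status"
        fi
        echo "retry: attempt $retry_attempt failed (exit status $retry_status), retrying in ${retry_delay}s" >&2
        retry_attempt=$((retry_attempt + 1))
        sleep "$retry_delay"
    done
}

# Example with the paths from the question (commented out; needs root and btrfs):
# retry 5 10 btrfs send -p /FS1/DATAs1 -f /tmp/DATAs2.stream /FS1/DATAs2
```

The commented example writes the stream to a file with -f so that the exit status seen by retry is the one from btrfs send. If you wrap the whole send | receive pipeline instead, keep in mind that plain sh reports only the last command's status, and that a failed receive may leave a partial subvolume on /FS2 that has to be deleted before the next attempt.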

> Can I now safely transfer the new snapshot [...] to /FS2

This depends on what you mean by "safely."

  • If you mean "will the data on /FS2 match the data from /FS1?": Yes. btrfs send/receive should produce an identical copy of the subvolume’s contents. Files and metadata should match the source snapshot.

  • If you mean "will the storage space used on /FS2 be the same as on /FS1?", "will deduplication effects be preserved or transferred?", or "will network transfers become unexpectedly larger?": Not necessarily. btrfs send does not preserve the exact block-level topology: some reflinks (shared blocks) may be replaced with full copies, and vice versa. This is true whether or not deduplication (with bees or anything else) is in use, but deduplication on the sending filesystem can increase the variation, so the space used on /FS2 and the network traffic may be more or less than expected even though file contents are identical. Deduplication on the receiving filesystem has no such effect on the transfer, because the receiver does not interact with the sender.

If you run bees on both filesystems, the amount of space required should be minimized on both sides; however, due to quirks in btrfs’s storage and block allocation, you should not expect the same storage usage on both filesystems, even after deduplication.

If by "safe" you mean "no practical risk of data corruption," verify the received data after transfer. For a full check--including detecting extra or missing files--compare checksums after sorting:

(cd /FS1/DATAs2 && find . -type f -exec sha256sum {} + | sort) > /tmp/sums1
(cd /FS2/DATAs2 && find . -type f -exec sha256sum {} + | sort) > /tmp/sums2
diff -u /tmp/sums1 /tmp/sums2

You can use a direct pipe for a quicker check that doesn't require temporary files (note this will not detect extra files on /FS2):

(cd /FS1/DATAs2 && find . -type f -exec sha256sum {} +) | (cd /FS2/DATAs2 && sha256sum --quiet -c -)

Zygo, May 28 '25 02:05