bees
Option to ignore snapshots
Background:
When incremental backup is used, changing the layout of the snapshot partition causes send to fail because of the changed extents.
example error: Send: inconsistent snapshot, found updated extent for inode 20930799 without updated inode item, send root is 3024, parent root is 3007
A full send/receive works without any errors.
(bees was disabled before backup started)
Solution: Option to disable/enable deduplication for specific subvolumes (i.e. make some subvolumes immutable). This would mean that disk space isn't freed until the snapshot is removed, but in this case we are only keeping a single (daily) snapshot to be able to do incremental backups to a different hard drive. So it is still beneficial to wait one day for the extra space, and after the initial deduplication pass, further dedup shouldn't gain much more free space.
Other observations: After the initial deduplication was done and a full backup was performed, the next incremental backup also failed for the same reason. -> bees seems to sometimes replace old data with new duplicated data.
bees can identify read-only snapshots by reading btrfs properties, so it doesn't need an option to identify them (though an option for an admin to include/exclude subvols in general might be useful for other purposes). bees could accept an option which controls whether read-only snapshots are mutated or not. The former would be incompatible with btrfs send, the latter would become the bees default.
I have considered using the read-only status of an extent ref to preferentially select extents that belong to read-only snapshots, i.e. if a read-only snapshot reference to an extent exists, bees will not try to remove the extent, and bees will prefer to use read-only extents to remove duplicates from read-write subvols if all other extent selection metrics are equal. I can add that requirement to the next rewrite of the crawler.
A simpler approach is to completely ignore extent refs from read-only snapshots. Assuming that read-only snapshots are always short-lived, bees could pretend they don't exist at all (no dedup, not even scanning, just fail silently in open_root), and dedup only the extents that (also) belong to mutable subvols. As you point out, this will delay recovery of free space until after the read-only snapshots are deleted; however, it may also provide a considerable speedup, as bees will not be wasting time scanning read-only subvols that will likely be deleted before the scan is finished.
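As a rough illustration of the "ignore read-only snapshots" idea, the sketch below filters a list of subvolume paths by their `ro` property before scanning. It is not how bees implements this (bees is not written in Python); it shells out to `btrfs property get -ts <path> ro`, and the helper names `parse_ro_property`, `is_read_only_subvol` and `subvols_to_scan` are made up for this example.

```python
import subprocess


def parse_ro_property(output: str) -> bool:
    """Parse the output of `btrfs property get -ts <path> ro` ("ro=true" or "ro=false")."""
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "ro":
            return value == "true"
    raise ValueError(f"no ro property in output: {output!r}")


def is_read_only_subvol(path: str) -> bool:
    """Ask btrfs whether the subvolume at `path` is read-only (needs a btrfs filesystem)."""
    result = subprocess.run(
        ["btrfs", "property", "get", "-ts", path, "ro"],
        check=True, capture_output=True, text=True,
    )
    return parse_ro_property(result.stdout)


def subvols_to_scan(paths, is_read_only=is_read_only_subvol):
    """Return only the mutable subvolumes; read-only ones are skipped entirely."""
    return [p for p in paths if not is_read_only(p)]
```

The `is_read_only` parameter exists only so the filter can be exercised without a real btrfs filesystem.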
bees does create a temporary copy of data when an extent contains both duplicate and unique data. In btrfs, the only way to delete an extent is to delete every reference to every block of the extent. When the duplicate part of the extent is removed, the remaining reference to the unique part keeps btrfs from deleting the duplicate part. bees solves this by making a separate copy of the unique part so the entire extent can be removed as a duplicate. This removes all references to the original extent so it will be deleted from the subvol. If there are snapshots this must be done separately on each snapshot subvol before the duplicate extent is finally removed from the filesystem.
I think the second solution is the most valuable for everyone regardless of use case, but if it is highly complex, the third option would be enough in the meantime.
I'd prefer an option to make bees use read-only extents as the base extent for dedup. Am I guessing right that this wouldn't mutate read-only snapshots then? OTOH, I myself am currently not a user of snapshots, so for now I don't really care too much about how bees behaves here. But maybe later I would prefer this behavior, or alternatively have both options: "prefer read-only extents" and "ignore read-only snapshots".
"Prefer read-only extents" is hard because bees currently only looks at one extent ref at any time (hence the name "best effort" as opposed to "best efficiency" or "best effectiveness"). We can only implement that once bees is considering multiple extents and selecting from them.
"Ignore read-only snapshots" could be implemented in bees right now.
My long-term plan (aka "roadmap") is to have bees examine each extent and act according to selection criteria leading to one of these outcomes:
- Replace extent, use as dst to remove duplicates, copy remaining data to new extents, combine with data from other extents to make larger extents, do not add to DDT hash index
  - Extent is too short (128K compressed, 512K uncompressed) and logically adjacent to any extent which is also too short (combine these into a single extent for defrag)
  - Extent has blocks which cannot be reached through any file in the filesystem trees
  - Extent has blocks filled entirely with zero bytes (to be replaced with hole)
  - Extent is compressed with wrong algorithm (e.g. zlib instead of zstd)
- Abandon extent entirely, use as neither src nor dst, do not add to DDT hash index
  - Too many references (toxic, can't build an extent ref map with LOGICAL_INO_V2)
  - Error encountered processing extent (e.g. EIO, ENOSPC)
- Keep extent, use as src to remove duplicates, place in DDT hash index
  - None of the criteria for Replace or Abandon are met
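The three outcomes above can be sketched as a small classifier. This is only an illustration of the roadmap, not existing bees code: the `ExtentInfo` fields, the `classify` function, reading "too short" as "shorter than the stated size", and checking Abandon before Replace are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    REPLACE = "replace"
    ABANDON = "abandon"
    KEEP = "keep"


@dataclass
class ExtentInfo:
    # Hypothetical summary of one extent; field names are illustrative only.
    length: int                       # extent length in bytes
    compressed: bool
    adjacent_to_short_extent: bool = False
    has_unreachable_blocks: bool = False
    has_zero_blocks: bool = False
    wrong_compression: bool = False
    toxic: bool = False               # too many references to map
    had_error: bool = False           # e.g. EIO or ENOSPC while processing


SHORT_COMPRESSED = 128 * 1024
SHORT_UNCOMPRESSED = 512 * 1024


def is_too_short(extent: ExtentInfo) -> bool:
    # Assumption: "too short" means shorter than the size given in the list above.
    limit = SHORT_COMPRESSED if extent.compressed else SHORT_UNCOMPRESSED
    return extent.length < limit


def classify(extent: ExtentInfo) -> Outcome:
    # Assumption: Abandon criteria take precedence over Replace criteria.
    if extent.toxic or extent.had_error:
        return Outcome.ABANDON
    if (
        (is_too_short(extent) and extent.adjacent_to_short_extent)
        or extent.has_unreachable_blocks
        or extent.has_zero_blocks
        or extent.wrong_compression
    ):
        return Outcome.REPLACE
    # None of the Replace or Abandon criteria matched.
    return Outcome.KEEP
```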
If a duplicate is found for data appearing in two or more Keep extents, sorting metrics are used to select the "best" Keep extent. This extent is used as dedup src, all non-selected extents become Replace extents used as dedup dst.
The Keep-extent sorting metrics are:
- extent length (keep longer)
- extent reference count (keep more-referenced)
- compressed vs uncompressed extent size (keep smaller)
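A minimal sketch of how these metrics could pick the dedup src is a tuple sort key. Treating the list order as the priority order is an assumption, and `KeepCandidate`, `keep_sort_key` and `select_src` are hypothetical names used only here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepCandidate:
    # Hypothetical description of one Keep extent holding the duplicate data.
    name: str
    length: int      # logical extent length in bytes
    ref_count: int   # number of references to the extent
    disk_size: int   # bytes occupied on disk (smaller when compressed)


def keep_sort_key(c: KeepCandidate):
    # Longer first, then more-referenced first, then smaller on-disk size first.
    return (-c.length, -c.ref_count, c.disk_size)


def select_src(candidates):
    """Return (src, dst_list): the best Keep extent and the extents to replace."""
    ranked = sorted(candidates, key=keep_sort_key)
    return ranked[0], ranked[1:]
```

Given candidates with equal length, the one with more references becomes src and every other candidate is demoted to a Replace extent used as dst.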
For Replace extents, bees will search for a Keep extent containing duplicate data, and use the Keep extent as src to dedup the Replace extent out of existence. If no duplicate Keep-able extent is found, then bees will copy the data to new extents that will satisfy Keep criteria, and use these to replace existing Replace extents. Multiple Replace extents can be combined to implement defrag, so all Replace extents go onto a queue where they are further extended before any dedup commands are sent to btrfs.
We can add read-only snapshots to the above by making these changes (Plan A):
- Add new sorting metrics:
  - number of read-only extent refs (if all other metrics are equal)
- Add new criteria for Abandon:
  - Extent meets one or more criteria for Replace, but has at least one read-only reference
The last change makes it impossible to remove any duplicate extent referenced by a read-only snapshot, but chooses read-only snapshot extents to remove duplicate extents from read-write snapshots.
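One way to read Plan A in code is shown below, as a sketch only: the read-only ref count becomes the final tie-breaker in the sort key, and a separate predicate turns would-be Replace extents into Abandon extents when a read-only reference exists. Preferring more read-only refs, and all names (`Candidate`, `plan_a_sort_key`, `plan_a_must_abandon`), are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical extent summary; not a bees data structure.
    name: str
    length: int
    ref_count: int
    disk_size: int
    ro_ref_count: int                    # refs coming from read-only snapshots
    meets_replace_criteria: bool = False


def plan_a_sort_key(c: Candidate):
    # Existing metrics first; read-only ref count only breaks exact ties.
    return (-c.length, -c.ref_count, c.disk_size, -c.ro_ref_count)


def plan_a_must_abandon(c: Candidate) -> bool:
    # New Abandon rule: never replace an extent that a read-only snapshot references.
    return c.meets_replace_criteria and c.ro_ref_count > 0


def plan_a_select_src(candidates):
    """Pick the extent to keep as dedup src under the Plan A ordering."""
    return min(candidates, key=plan_a_sort_key)
```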
Instead of the last change, we could do Plan B:
- Add new sorting metrics:
  - number of read-only extent refs (if all other metrics are equal)
- Add new rule for Replace:
  - when evaluating block reachability, ignore extent refs in read-only subvols
  - when replacing extents, skip extent refs in read-only subvols
This has two effects:
- Any block referenced only by read-only snapshots would be considered unreachable and its extent references in read-write subvols would be replaced, leaving the references in the read-only snapshots untouched. This would mean any block in a read-only snapshot that was removed from the origin subvol would trigger replacement of the extent if there are still blocks within that extent reachable from any read-write subvol. The original extents would be removed when the read-only subvol is deleted, but there would be extra space in use in the meantime.
- Any extent that was marked to be replaced for any reason (i.e. duplicate blocks, defrag, etc) would remain in the filesystem until the read-only snapshot was deleted, since the Replace operation would not touch the read-only references.
If we include read-only references when determining block reachability but exclude them when replacing extents, we end up with a lot of unreachable data taking up space on the filesystem: the data was reachable while the read-only snapshot exists, but then bees makes extra references to the data without considering which blocks become unreachable when the snapshot is deleted. The unreachable blocks would then be quite difficult to remove (essentially only brute force search would work) as their extent references would be scattered through the read-write origin subvols.
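The difference between the two reachability rules can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not bees code: an extent is a run of numbered blocks, each `ExtentRef` covers a contiguous block range, and `plan_b_decision` is a hypothetical helper applying the Plan B rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtentRef:
    # Hypothetical reference from a subvol to a block range of one extent.
    subvol: str
    read_only: bool
    first_block: int
    block_count: int


def reachable_blocks(refs, include_read_only: bool) -> set:
    """Blocks of the extent reachable through the given references."""
    blocks = set()
    for ref in refs:
        if ref.read_only and not include_read_only:
            continue  # Plan B: read-only refs do not count for reachability
        blocks.update(range(ref.first_block, ref.first_block + ref.block_count))
    return blocks


def plan_b_decision(extent_blocks: int, refs):
    """Return (needs_replace, refs_to_rewrite) under the Plan B rules."""
    rw_reachable = reachable_blocks(refs, include_read_only=False)
    unreachable = set(range(extent_blocks)) - rw_reachable
    # Replace only if read-write subvols still use part of the extent
    # while some other part is unreachable from them.
    needs_replace = bool(rw_reachable) and bool(unreachable)
    # Read-only refs are never rewritten, so the snapshot stays untouched.
    refs_to_rewrite = [r for r in refs if not r.read_only] if needs_replace else []
    return needs_replace, refs_to_rewrite
```

For an 8-block extent fully referenced by a read-only snapshot but only half referenced by the origin subvol, Plan B replaces the origin's reference and leaves the snapshot's reference alone, so the original extent stays on disk until the snapshot is deleted.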
While there's a lot of high-quality bikeshedding here, the original issue (immutable snapshots were being mutated, breaking btrfs send) is covered by the fix for #79. There is now an option to ignore read-only snapshots.