Proposal for rotation of raw incremental backups

Open sbrudenell opened this issue 2 years ago • 18 comments

I propose a related set of features to enable automatic rotation (deletion) of incremental raw backups.

  1. Honor target_preserve, additionally preserving ancestors. EDIT: already implemented. ~~btrbk should preserve backups as specified by the user, as well as the chain of backups required to construct the desired preserved backups. btrbk can walk the tree of RECEIVED_PARENT_UUID relationships given by *.info files to determine which additional backups need to be preserved. AFAICT btrbk already has some logic to determine this, but it's not used (?), since the preserve mechanism for raw targets isn't used.~~
  2. Provide a way to balance the ancestor tree, e.g. an explicit schedule configuration in place of incremental_prefs. Currently the parent of each backup is usually the previous backup, so the ancestor tree is just a long chain, and preserving ancestors would mean preserving everything. We need to shorten and balance this tree to have meaningful backup rotation. Tree management is a broad topic, but it would be useful to start with some initial approach (of course, the one that works for my use case :smile:) and expand later.
  3. Already implemented, but we should document that incremental raw backups only use a single parent (-p, but not -c). It would be nice to document the trade-off here: multiple parents would increase risk that a broken backup would break more descendant backups, and would make backups harder to understand, and the only benefit of -c is saving data storage in the case of cross-snapshot reflinks, which are probably rare.
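
The ancestor walk described in point 1 can be sketched roughly as follows. This is an illustration only, not btrbk's actual code: `parse_info` and `with_ancestors` are hypothetical helper names, and the only assumption about the sidecar format is that `*.info` files contain `KEY=VALUE` lines with `FILE`, `RECEIVED_UUID` and (for incrementals) `RECEIVED_PARENT_UUID`.

```python
from typing import Dict, Iterable, Optional, Set


def parse_info(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines of a .info sidecar, skipping everything else."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key] = value
    return fields


def with_ancestors(infos: Iterable[Dict[str, str]], wanted: Set[str]) -> Set[str]:
    """Return the wanted file names plus every ancestor needed to restore them."""
    by_uuid = {info["RECEIVED_UUID"]: info for info in infos}
    by_file = {info["FILE"]: info for info in by_uuid.values()}
    keep: Set[str] = set()
    for name in wanted:
        info: Optional[Dict[str, str]] = by_file.get(name)
        # Follow RECEIVED_PARENT_UUID until a full backup (no parent) or a
        # file that is already marked for preservation is reached.
        while info is not None and info["FILE"] not in keep:
            keep.add(info["FILE"])
            parent_uuid = info.get("RECEIVED_PARENT_UUID")
            info = by_uuid.get(parent_uuid) if parent_uuid else None
    return keep
```

Everything outside the returned set would be a candidate for deletion. With a plain chain (each backup's parent is the previous backup), the returned set is the whole chain, which is why point 2 matters.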

As a starting point for tree management, I suggest a schedule config for incremental parents, similar to retention policies. The concept is that each level of the ancestor tree corresponds to one of the yearly/monthly/weekly/daily/hourly schedule concepts, which are already well-defined in btrbk.

incremental_schedule 1y 1m 1d could mean that:

  • A yearly backup has no parent (full backup)
  • A monthly backup uses the latest yearly backup as its parent
  • A daily backup uses the latest monthly backup as its parent
  • All other backups (hourly, ad-hoc, etc) default to using the latest backup as their parent
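
As a rough sketch of how such a schedule could pick parents (the names `Backup`, `classify` and `plan_backup` are made up for illustration, and I'm assuming that a higher-level backup also counts for the levels below it, e.g. the yearly backup doubles as January's monthly):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

YEARLY, MONTHLY, DAILY, OTHER = 0, 1, 2, 3


@dataclass(frozen=True)
class Backup:
    when: datetime
    level: int
    parent: Optional[datetime]  # None means full backup


def classify(when: datetime, existing: List[Backup]) -> int:
    """A backup is yearly/monthly/daily if it is the first one in that period."""
    if not any(b.when.year == when.year for b in existing):
        return YEARLY
    if not any((b.when.year, b.when.month) == (when.year, when.month) for b in existing):
        return MONTHLY
    if not any(b.when.date() == when.date() for b in existing):
        return DAILY
    return OTHER


def plan_backup(when: datetime, existing: List[Backup]) -> Backup:
    """Choose the parent for a new backup; `existing` is sorted oldest first."""
    level = classify(when, existing)
    if level == YEARLY:
        parent = None                      # full backup
    elif level == OTHER:
        parent = existing[-1].when         # default: latest backup
    else:
        # monthly -> latest yearly; daily -> latest monthly (or yearly)
        parent = max((b.when for b in existing if b.level < level), default=None)
    return Backup(when, level, parent)
```

With this shape the ancestor tree is at most three levels deep below the "other" backups, so preserving ancestors costs at most one yearly and one monthly backup per preserved daily.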

My use case

I have a large (300gb compressed) homedir in which I do regular daily work, but most of the data rarely changes. I back up to Backblaze B2 via s3fs. I use a 90-day object retention policy for safety.

I don't want to do full backups very often, to minimize disk/network/cpu usage, and because object retention means retaining a large backup as well as its nearly-identical replacement, whenever I do a full backup.

I want to back up as frequently as possible.

(5-second backup intervals would be ideal. Why? Because it's $YEAR and I expect my work to always be backed up (think Google Docs). Because we as a species have already done the hard work to develop btrfs, which supports arbitrarily many atomic snapshots, onchange semantics, and easy deltas. Because backblaze doesn't charge per-object fees. The hard parts are all solved, and backup software shouldn't stop me from making use of them. That said, s3fs' caching is poor and issues HeadObject requests too often, so my backups are more infrequent to lower active object counts, and my dreams are once more dashed by one piece of bad software)

I would use a config like this:

volume /mnt/btr_pool
subvolume home
target raw /s3
incremental_schedule 1y 1m 1d  # proposed new config
target_preserve 30d
target_preserve_min no
snapshot_preserve 24h
snapshot_preserve_min latest
  • The first backup of the year will be a full backup
  • The first backup of each month will use the yearly backup as a parent
  • The first backup of each day will use the last monthly backup as its parent
  • The last 30 daily backups will be preserved, but ancestor preservation means the yearly backups and snapshots will be preserved too. In particular:
    • On 2023-01-15, we'll preserve:
      • Backups from 2022-12-16 through 2023-01-15, by target_preserve
      • Backups for 2022-12-01 and 2022-01-01, by ancestor preservation
      • The snapshot for 2023-01-01, by incremental_schedule, as we'll need it as the parent of tomorrow's daily backup
    • On 2023-06-15, we'll preserve:
      • Backups from 2023-05-16 through 2023-06-15, by target_preserve
      • Backups for 2023-05-01 and 2023-01-01, by ancestor preservation
      • The snapshot for 2023-06-01, by incremental_schedule, as we'll need it as the parent of tomorrow's daily backup
      • The snapshot for 2023-01-01, by incremental_schedule, as we'll need it as the parent of next month's monthly backup
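
The two dates above can be checked with a small simulation. This is only a sketch under simplifying assumptions: exactly one backup per day, the proposed `incremental_schedule 1y 1m 1d` semantics, and a `target_preserve 30d` window counted as today plus the 30 days before it (matching the 2022-12-16 through 2023-01-15 window above). `parent_of` and `extra_ancestors` are hypothetical names.

```python
from datetime import date, timedelta
from typing import Optional, Set


def parent_of(day: date) -> Optional[date]:
    """Parent of the backup taken on `day`, with one backup per day."""
    if day.month == 1 and day.day == 1:
        return None                        # yearly: full backup
    if day.day == 1:
        return date(day.year, 1, 1)        # monthly: parent is the yearly
    return date(day.year, day.month, 1)    # daily: parent is the monthly


def extra_ancestors(today: date, keep_days: int = 30) -> Set[date]:
    """Backups outside the retention window that must be kept as ancestors."""
    window = {today - timedelta(days=n) for n in range(keep_days + 1)}
    keep: Set[date] = set()
    for day in window:
        current: Optional[date] = day
        while current is not None and current not in keep:
            keep.add(current)
            current = parent_of(current)
    return keep - window


print(sorted(d.isoformat() for d in extra_ancestors(date(2023, 1, 15))))
# ['2022-01-01', '2022-12-01']
print(sorted(d.isoformat() for d in extra_ancestors(date(2023, 6, 15))))
# ['2023-01-01', '2023-05-01']
```

So at any time only two backups beyond the retention window are held back, one of them a full backup.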

sbrudenell, May 05 '22 13:05

If I understand your points (1) and (2) correctly, then I'd very much like to see that.

My use case is that I want to back up filesystems from servers to some remote locations (possibly 1-n for redundancy).

The remote locations cannot be (fully) trusted, so I can only work with raw backup files, which need to be encrypted on the origin server before being transferred.

What I'd like to see is that e.g. every 7 days a new full backup is made, and the next 6 days only incremental backups.
The idea here is not only to get rid of any removed data that would still be in the chain of incremental backup files, but also to add some resilience should one of the previous backup files (the full or an incremental) become corrupted.

Of course btrbk would need to be smart enough: if I e.g. want to rotate 10 days of daily backups, with a new full one every 7 days, it needs to keep, after the 7th day, not just 3 more incremental backup files, but the full chain back to the preceding full backup.
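
To put numbers on that, here is a minimal sketch under my own simplifying assumptions: days are numbered from 0, day `d` gets a full backup when `d % 7 == 0`, and every other day is an incremental on top of the previous day. `files_to_keep` is a hypothetical helper, not anything btrbk provides.

```python
def files_to_keep(today: int, keep_days: int = 10, full_every: int = 7) -> range:
    """Days whose backup files must stay on the target on day `today`."""
    oldest_wanted = max(today - keep_days + 1, 0)
    # Walk back from the oldest wanted backup to the full backup it depends on.
    chain_start = oldest_wanted - oldest_wanted % full_every
    return range(chain_start, today + 1)


print(len(files_to_keep(16)))  # 10: the oldest wanted backup (day 7) is itself a full one
print(len(files_to_keep(12)))  # 13: days 3..12 are wanted, days 0..2 are kept as ancestors
print(len(files_to_keep(22)))  # 16: worst case, 6 extra files for the chain
```

In this model the target holds between 10 and 16 files at any time, never an unbounded chain.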

calestyo, May 06 '22 04:05

Apparently btrbk does delete incremental raw files today, and does preserve ancestors.

I see the following debug output when incremental raw backups would be deleted per target_preserve:

Found parent/child partners, forcing preserve of: "/target/@home.20220508T165631+0000.btrfs.gz", "/target/@home.20220508T171336+0000.btrfs.gz"
Found parent/child partners, forcing preserve of: "/target/@home.20220508T171336+0000.btrfs.gz", "/target/@home.20220508T195415+0000.btrfs.gz"

This contradicts the docs:

Note that the target preserve mechanism is currently disabled for incremental raw backups (btrbk does not delete any incremental raw files)!

It's nice when out-of-date documentation actually means there are surprise features!

sbrudenell, May 08 '22 20:05

If that's the case, the warning at the bottom of https://github.com/digint/btrbk#example-encrypted-backup-to-non-btrfs-target would likely also be obsolete?

Also the line:

There is currently no support for rotation of incremental backups: if incremental is set, a full backup must be triggered manually from time to time in order to be able to delete old backups.

in the btrbk.conf(5) manpage?

calestyo, Nov 10 '22 05:11

@sbrudenell I've just tried the above, but in my setup, btrbk does not delete old ancestors:

ACTION  HOST         SUBVOLUME                                                         SCHEME  REASON
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T021654+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T022516+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T040053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T050053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T060042+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T070053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T080053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T090053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T100053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T110053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T120053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T130053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T140053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T150053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T160053+0100.btrfs  7h      preserve hourly: first of hour, 7 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T170000+0100.btrfs  7h      preserve hourly: first of hour, 6 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T180053+0100.btrfs  7h      preserve hourly: first of hour, 5 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T190053+0100.btrfs  7h      preserve hourly: first of hour, 4 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T200053+0100.btrfs  7h      preserve hourly: first of hour, 3 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T210053+0100.btrfs  7h      preserve hourly: first of hour, 2 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T220053+0100.btrfs  7h      preserve hourly: first of hour, 1 hours ago
-       example.org  /var/local/lcg-backup/data/btrbk/data.20221112T230053+0100.btrfs  7h      preserve hourly: first of hour, 0 hours ago

Neither does it automagically create new full dumps.

Did you somehow do this manually?

calestyo, Nov 12 '22 22:11

I tried to solve this now by having two sets of systemd timers and services: the usual btrbk.timer/btrbk.service (which in the test example runs every 4th hour and makes a non-incremental send) and a btrbk-incremental.timer/btrbk-incremental.service (which runs every non-4th hour and makes an incremental one).

They use the same config file, but the former calls btrbk with --override incremental=no.
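
The same decision could also be made by a single wrapper instead of two timer/service pairs. The following is only a sketch of that alternative, not what I actually run: `btrbk_command` is a made-up helper, and I'm assuming `btrbk run` as the command with full sends at hours divisible by 4.

```python
from datetime import datetime
from typing import List


def btrbk_command(now: datetime, full_every_hours: int = 4) -> List[str]:
    """Build the btrbk command line for the run starting at `now`."""
    cmd = ["btrbk"]
    if now.hour % full_every_hours == 0:
        # Every 4th hour: force a non-incremental (full) send.
        cmd += ["--override", "incremental=no"]
    cmd.append("run")
    return cmd


print(btrbk_command(datetime(2022, 11, 13, 4, 0)))
# ['btrbk', '--override', 'incremental=no', 'run']
print(btrbk_command(datetime(2022, 11, 13, 5, 0)))
# ['btrbk', 'run']
```

The resulting list could be handed to `subprocess.run` from one hourly timer.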

On the remote backup host's side this works "fine":

-rw-r--r-- 1 root       root       6,5G Nov 12 02:23 data.20221112T021654+0100.btrfs
-rw-r--r-- 1 root       root        143 Nov 12 02:23 data.20221112T021654+0100.btrfs.info
-rw-r--r-- 1 root       root        87M Nov 12 02:25 data.20221112T022516+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 02:25 data.20221112T022516+0100.btrfs.info
-rw-r--r-- 1 root       root        96M Nov 12 04:00 data.20221112T040053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 04:00 data.20221112T040053+0100.btrfs.info
-rw-r--r-- 1 root       root        96M Nov 12 05:00 data.20221112T050053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 05:00 data.20221112T050053+0100.btrfs.info
-rw-r--r-- 1 root       root        95M Nov 12 06:00 data.20221112T060042+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 06:00 data.20221112T060042+0100.btrfs.info
-rw-r--r-- 1 root       root        93M Nov 12 07:00 data.20221112T070053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 07:00 data.20221112T070053+0100.btrfs.info
-rw-r--r-- 1 root       root        95M Nov 12 08:00 data.20221112T080053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 08:00 data.20221112T080053+0100.btrfs.info
-rw-r--r-- 1 root       root        94M Nov 12 09:00 data.20221112T090053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 09:00 data.20221112T090053+0100.btrfs.info
-rw-r--r-- 1 root       root        93M Nov 12 10:00 data.20221112T100053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 10:00 data.20221112T100053+0100.btrfs.info
-rw-r--r-- 1 root       root        96M Nov 12 12:00 data.20221112T110053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 12:00 data.20221112T110053+0100.btrfs.info
-rw-r--r-- 1 root       root        93M Nov 12 12:01 data.20221112T120053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 12:01 data.20221112T120053+0100.btrfs.info
-rw-r--r-- 1 root       root        95M Nov 12 13:00 data.20221112T130053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 13:00 data.20221112T130053+0100.btrfs.info
-rw-r--r-- 1 root       root        93M Nov 12 14:00 data.20221112T140053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 14:00 data.20221112T140053+0100.btrfs.info
-rw-r--r-- 1 root       root        98M Nov 12 15:00 data.20221112T150053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 15:00 data.20221112T150053+0100.btrfs.info
-rw-r--r-- 1 root       root        94M Nov 12 16:00 data.20221112T160053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 16:00 data.20221112T160053+0100.btrfs.info
-rw-r--r-- 1 root       root        92M Nov 12 17:00 data.20221112T170000+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 17:00 data.20221112T170000+0100.btrfs.info
-rw-r--r-- 1 root       root        93M Nov 12 18:01 data.20221112T180053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 18:01 data.20221112T180053+0100.btrfs.info
-rw-r--r-- 1 root       root        94M Nov 12 19:00 data.20221112T190053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 19:00 data.20221112T190053+0100.btrfs.info
-rw-r--r-- 1 root       root        93M Nov 12 20:01 data.20221112T200053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 20:01 data.20221112T200053+0100.btrfs.info
-rw-r--r-- 1 root       root        92M Nov 12 21:00 data.20221112T210053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 21:00 data.20221112T210053+0100.btrfs.info
-rw-r--r-- 1 root       root        92M Nov 12 22:00 data.20221112T220053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 22:00 data.20221112T220053+0100.btrfs.info
-rw-r--r-- 1 root       root        92M Nov 12 23:00 data.20221112T230053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 12 23:00 data.20221112T230053+0100.btrfs.info
-rw-r--r-- 1 root       root       6,5G Nov 13 00:09 data.20221113T000031+0100.btrfs
-rw-r--r-- 1 root       root        143 Nov 13 00:09 data.20221113T000031+0100.btrfs.info
-rw-r--r-- 1 root       root        94M Nov 13 01:00 data.20221113T010021+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 01:00 data.20221113T010021+0100.btrfs.info
-rw-r--r-- 1 root       root        91M Nov 13 02:00 data.20221113T020053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 02:00 data.20221113T020053+0100.btrfs.info
-rw-r--r-- 1 root       root        91M Nov 13 03:00 data.20221113T030053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 03:00 data.20221113T030053+0100.btrfs.info
-rw-r--r-- 1 root       root       6,5G Nov 13 04:09 data.20221113T040053+0100.btrfs
-rw-r--r-- 1 root       root        143 Nov 13 04:09 data.20221113T040053+0100.btrfs.info
-rw-r--r-- 1 root       root        90M Nov 13 05:00 data.20221113T050053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 05:00 data.20221113T050053+0100.btrfs.info
-rw-r--r-- 1 root       root        91M Nov 13 06:00 data.20221113T060053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 06:00 data.20221113T060053+0100.btrfs.info
-rw-r--r-- 1 root       root        91M Nov 13 07:00 data.20221113T070043+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 07:00 data.20221113T070043+0100.btrfs.info
-rw-r--r-- 1 root       root       6,5G Nov 13 08:08 data.20221113T080031+0100.btrfs
-rw-r--r-- 1 root       root        143 Nov 13 08:08 data.20221113T080031+0100.btrfs.info
-rw-r--r-- 1 root       root        91M Nov 13 09:00 data.20221113T090053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 09:00 data.20221113T090053+0100.btrfs.info
-rw-r--r-- 1 root       root        92M Nov 13 10:00 data.20221113T100031+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 10:00 data.20221113T100031+0100.btrfs.info
-rw-r--r-- 1 root       root        91M Nov 13 11:00 data.20221113T110053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 11:00 data.20221113T110053+0100.btrfs.info
-rw-r--r-- 1 root       root       6,5G Nov 13 12:07 data.20221113T120053+0100.btrfs
-rw-r--r-- 1 root       root        143 Nov 13 12:07 data.20221113T120053+0100.btrfs.info
-rw-r--r-- 1 root       root        90M Nov 13 13:00 data.20221113T130012+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 13:00 data.20221113T130012+0100.btrfs.info
-rw-r--r-- 1 root       root        90M Nov 13 14:00 data.20221113T140053+0100.btrfs
-rw-r--r-- 1 root       root        201 Nov 13 14:00 data.20221113T140053+0100.btrfs.info

The listing starts with many incremental sends from before I had set up this scheme. From 20221113T00 onwards I had it in place, and that was also the first fresh non-incremental send (all the 6,5G files are non-incremental; the smaller ones are incremental).

The problem now is that the old ones don't get cleaned up:

# btrbk -S dryrun
SNAPSHOT SCHEDULE
action  host  subvol                                                  scheme                     reason
------  ----  ------------------------------------------------------  -------------------------  --------------------------------------------------------
-       -     /data/system/snapshots/btrbk/data.20221112T021654+0100  2d+ 7h 1d (sunday, 00:00)  preserve daily: first of day, 1 days ago, 2h after 00:00
-       -     /data/system/snapshots/btrbk/data.20221112T022516+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T040053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T050053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T060042+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T070053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T080053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T090053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T100053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T110053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T120053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T130053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T140053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T150053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T160053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T170000+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T180053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T190053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T200053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T210053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T220053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221112T230053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T000031+0100  2d+ 7h 1d (sunday, 00:00)  preserve daily: first of day, 0 days ago, at 00:00
-       -     /data/system/snapshots/btrbk/data.20221113T010021+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T020053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T030053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T040053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T050053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T060053+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T070043+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 0 days ago
-       -     /data/system/snapshots/btrbk/data.20221113T080031+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 7 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T090053+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 6 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T100031+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 5 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T110053+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 4 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T120053+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 3 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T130012+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 2 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T140053+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 1 hours ago
-       -     /data/system/snapshots/btrbk/data.20221113T154047+0100  2d+ 7h 1d (sunday, 00:00)  preserve hourly: first of hour, 0 hours ago

BACKUP SCHEDULE
action  host                      subvol                                                            scheme  reason
------  ------------------------  ----------------------------------------------------------------  ------  ---------------------------------------------
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T021654+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T022516+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T040053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T050053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T060042+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T070053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T080053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T090053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T100053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T110053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T120053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T130053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T140053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T150053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T160053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T170000+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T180053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T190053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T200053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T210053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T220053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221112T230053+0100.btrfs  7h      preserve forced: child of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T000031+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T010021+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T020053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T030053+0100.btrfs  7h      preserve forced: child of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T040053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T050053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T060053+0100.btrfs  7h      preserve forced: parent of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T070043+0100.btrfs  7h      preserve forced: child of another raw target
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T080031+0100.btrfs  7h      preserve hourly: first of hour, 7 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T090053+0100.btrfs  7h      preserve hourly: first of hour, 6 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T100031+0100.btrfs  7h      preserve hourly: first of hour, 5 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T110053+0100.btrfs  7h      preserve hourly: first of hour, 4 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T120053+0100.btrfs  7h      preserve hourly: first of hour, 3 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T130012+0100.btrfs  7h      preserve hourly: first of hour, 2 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T140053+0100.btrfs  7h      preserve hourly: first of hour, 1 hours ago
-       example.org               /var/local/lcg-backup/data/btrbk/data.20221113T154047+0100.btrfs  7h      preserve hourly: first of hour, 0 hours ago

--------------------------------------------------------------------------------
Backup Summary (btrbk command line client, version 0.27.1)

    Date:   Sun Nov 13 15:40:47 2022
    Config: /etc/btrbk/btrbk.conf
    Dryrun: YES

Legend:
    ===  up-to-date subvolume (source snapshot)
    +++  created subvolume (source snapshot)
    ---  deleted subvolume
    ***  received subvolume (non-incremental)
    >>>  received subvolume (incremental)
--------------------------------------------------------------------------------
/data/system/data
+++ /data/system/snapshots/btrbk/data.20221113T154047+0100
>>> example.org:/var/local/lcg-backup/data/btrbk/data.20221113T154047+0100.btrfs

NOTE: Dryrun was active, none of the operations above were actually executed!

That was with 0.27.1 from Debian stable, but I also tried with 0.32.5-1, no difference.

I don't quite understand why the old ones don't get dropped... there should be no dependencies because of the fresh full sends, and my backup retention policy is (for testing) only 7h.

Any ideas?

@digint ?

calestyo, Nov 13 '22 14:11

These are the .info sidecars. AFAICS, the chain is correct, and it restarts whenever there has been a full send:

-e -n #btrbk-v0.27.1
#Do not edit this file
TYPE=raw
FILE=data.20221112T021654+0100.btrfs
RECEIVED_UUID=87b82fc0-0097-1e4b-b43a-d7518ddcf60e

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T022516+0100.btrfs
RECEIVED_UUID=bff6da7e-545e-fb4e-ab2f-90a1c816ba59
RECEIVED_PARENT_UUID=87b82fc0-0097-1e4b-b43a-d7518ddcf60e

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T040053+0100.btrfs
RECEIVED_UUID=ededb2e8-0e3a-3a47-8a84-c7caf25e88d1
RECEIVED_PARENT_UUID=bff6da7e-545e-fb4e-ab2f-90a1c816ba59

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T050053+0100.btrfs
RECEIVED_UUID=b79669a1-0794-d244-9c15-f4ba4fc5c767
RECEIVED_PARENT_UUID=ededb2e8-0e3a-3a47-8a84-c7caf25e88d1

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T060042+0100.btrfs
RECEIVED_UUID=b60dea8f-a3cd-104a-aef5-760feb02cfe5
RECEIVED_PARENT_UUID=b79669a1-0794-d244-9c15-f4ba4fc5c767

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T070053+0100.btrfs
RECEIVED_UUID=7f4f689f-a3a7-ac40-9e0a-6e94357bccfa
RECEIVED_PARENT_UUID=b60dea8f-a3cd-104a-aef5-760feb02cfe5

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T080053+0100.btrfs
RECEIVED_UUID=f96a05ae-8391-d843-89c5-bf66d0190242
RECEIVED_PARENT_UUID=7f4f689f-a3a7-ac40-9e0a-6e94357bccfa

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T090053+0100.btrfs
RECEIVED_UUID=ba7d366e-b4a0-3448-b732-86470625fdb4
RECEIVED_PARENT_UUID=f96a05ae-8391-d843-89c5-bf66d0190242

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T100053+0100.btrfs
RECEIVED_UUID=f1fcda53-1c32-5247-ba85-2ec31cc46829
RECEIVED_PARENT_UUID=ba7d366e-b4a0-3448-b732-86470625fdb4

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T110053+0100.btrfs
RECEIVED_UUID=cf48f6e0-6853-5644-ad46-a1023577e6c2
RECEIVED_PARENT_UUID=f1fcda53-1c32-5247-ba85-2ec31cc46829

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T120053+0100.btrfs
RECEIVED_UUID=407374e6-8aee-9742-8293-180906457a1b
RECEIVED_PARENT_UUID=cf48f6e0-6853-5644-ad46-a1023577e6c2

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T130053+0100.btrfs
RECEIVED_UUID=ed034d3d-4e1e-1a48-8816-d0e5f12427e6
RECEIVED_PARENT_UUID=407374e6-8aee-9742-8293-180906457a1b

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T140053+0100.btrfs
RECEIVED_UUID=db7e794c-d861-e947-accf-945f1e8745af
RECEIVED_PARENT_UUID=ed034d3d-4e1e-1a48-8816-d0e5f12427e6

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T150053+0100.btrfs
RECEIVED_UUID=c6bdf337-d156-8d43-bf4a-9dbb0d7e7e39
RECEIVED_PARENT_UUID=db7e794c-d861-e947-accf-945f1e8745af

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T160053+0100.btrfs
RECEIVED_UUID=6ae9415c-a125-0a40-908a-e079ebad6606
RECEIVED_PARENT_UUID=c6bdf337-d156-8d43-bf4a-9dbb0d7e7e39

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T170000+0100.btrfs
RECEIVED_UUID=9016b5b9-6640-be4c-bf70-a8ddca4b50e9
RECEIVED_PARENT_UUID=6ae9415c-a125-0a40-908a-e079ebad6606

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T180053+0100.btrfs
RECEIVED_UUID=963c4aa1-304b-1547-8e16-fda51f79f379
RECEIVED_PARENT_UUID=9016b5b9-6640-be4c-bf70-a8ddca4b50e9

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T190053+0100.btrfs
RECEIVED_UUID=7650d14c-a23a-054e-bff0-75b330dc797c
RECEIVED_PARENT_UUID=963c4aa1-304b-1547-8e16-fda51f79f379

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T200053+0100.btrfs
RECEIVED_UUID=7f4373d7-29c2-7142-a037-3947c88c8895
RECEIVED_PARENT_UUID=7650d14c-a23a-054e-bff0-75b330dc797c

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T210053+0100.btrfs
RECEIVED_UUID=280fa7f5-7182-a343-881c-c34719687d3c
RECEIVED_PARENT_UUID=7f4373d7-29c2-7142-a037-3947c88c8895

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T220053+0100.btrfs
RECEIVED_UUID=29eff396-c12c-b949-8023-4454f404a80f
RECEIVED_PARENT_UUID=280fa7f5-7182-a343-881c-c34719687d3c

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221112T230053+0100.btrfs
RECEIVED_UUID=78fd9a4f-8559-b648-bb8f-cae04c14c8f5
RECEIVED_PARENT_UUID=29eff396-c12c-b949-8023-4454f404a80f

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T000031+0100.btrfs
RECEIVED_UUID=de6455ef-da07-c74a-ae9f-ff7c536b848b

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T010021+0100.btrfs
RECEIVED_UUID=b0030025-e85a-b94e-b39d-3978d85df276
RECEIVED_PARENT_UUID=de6455ef-da07-c74a-ae9f-ff7c536b848b

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T020053+0100.btrfs
RECEIVED_UUID=80673bd6-c0e3-be45-a141-37d396f5146e
RECEIVED_PARENT_UUID=b0030025-e85a-b94e-b39d-3978d85df276

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T030053+0100.btrfs
RECEIVED_UUID=19c16656-0648-9c4e-ba25-778982381bbd
RECEIVED_PARENT_UUID=80673bd6-c0e3-be45-a141-37d396f5146e

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T040053+0100.btrfs
RECEIVED_UUID=2ef7189c-1959-2e47-9d13-dc5781a011b6

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T050053+0100.btrfs
RECEIVED_UUID=9954ed5f-44ae-6241-ba18-64a0a8f6c859
RECEIVED_PARENT_UUID=2ef7189c-1959-2e47-9d13-dc5781a011b6

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T060053+0100.btrfs
RECEIVED_UUID=dcab28f0-589f-e045-9d92-6c27c7509b0a
RECEIVED_PARENT_UUID=9954ed5f-44ae-6241-ba18-64a0a8f6c859

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T070043+0100.btrfs
RECEIVED_UUID=4f256570-d3be-3a42-b17f-330b1a1126a0
RECEIVED_PARENT_UUID=dcab28f0-589f-e045-9d92-6c27c7509b0a

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T080031+0100.btrfs
RECEIVED_UUID=acd579bf-42fa-e646-a5e9-e2d3086065ae

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T090053+0100.btrfs
RECEIVED_UUID=d6e28d44-856c-b848-b974-5b332d934bfb
RECEIVED_PARENT_UUID=acd579bf-42fa-e646-a5e9-e2d3086065ae

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T100031+0100.btrfs
RECEIVED_UUID=aeeca072-6dee-2640-a4c4-6f68a18c72d5
RECEIVED_PARENT_UUID=d6e28d44-856c-b848-b974-5b332d934bfb

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T110053+0100.btrfs
RECEIVED_UUID=a76ad315-85ce-ad4a-9a99-de9d4191bb1f
RECEIVED_PARENT_UUID=aeeca072-6dee-2640-a4c4-6f68a18c72d5

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T120053+0100.btrfs
RECEIVED_UUID=cf23a2ac-80dc-b542-b352-74cd549620d1

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T130012+0100.btrfs
RECEIVED_UUID=ed6e5dea-a97d-5f4d-8c26-88cbcd762f3f
RECEIVED_PARENT_UUID=cf23a2ac-80dc-b542-b352-74cd549620d1

-e -n #btrbk-v0.27.1
# Do not edit this file
TYPE=raw
FILE=data.20221113T140053+0100.btrfs
RECEIVED_UUID=f55d99f0-7065-9440-8259-4d528ebf4092
RECEIVED_PARENT_UUID=ed6e5dea-a97d-5f4d-8c26-88cbcd762f3f
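To check the chains mechanically rather than by eye, one can follow RECEIVED_PARENT_UUID from any dump back to its full send. The following is only a sketch: the helper names `parse_info` and `chain_of` are made up for illustration and are not part of btrbk, and the three sample sidecars are copied from the listing above.

```python
from typing import Dict, List


def parse_info(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines of one .info sidecar."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Comments and header lines like "-e -n #btrbk-v0.27.1" carry no "=" field.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key] = value
    return fields


def chain_of(file_name: str, infos: List[Dict[str, str]]) -> List[str]:
    """Return the file names from the full send down to `file_name`."""
    by_uuid = {info["RECEIVED_UUID"]: info for info in infos}
    by_file = {info["FILE"]: info for info in infos}
    chain: List[str] = []
    seen = set()
    node = by_file.get(file_name)
    while node is not None and node["RECEIVED_UUID"] not in seen:
        seen.add(node["RECEIVED_UUID"])  # guard against accidental loops
        chain.append(node["FILE"])
        parent = node.get("RECEIVED_PARENT_UUID")
        node = by_uuid.get(parent) if parent else None
    return list(reversed(chain))


# Three sidecars copied from the listing above: one full send, two incrementals.
SAMPLES = [
    "-e -n #btrbk-v0.27.1\n# Do not edit this file\nTYPE=raw\n"
    "FILE=data.20221113T000031+0100.btrfs\n"
    "RECEIVED_UUID=de6455ef-da07-c74a-ae9f-ff7c536b848b\n",
    "-e -n #btrbk-v0.27.1\n# Do not edit this file\nTYPE=raw\n"
    "FILE=data.20221113T010021+0100.btrfs\n"
    "RECEIVED_UUID=b0030025-e85a-b94e-b39d-3978d85df276\n"
    "RECEIVED_PARENT_UUID=de6455ef-da07-c74a-ae9f-ff7c536b848b\n",
    "-e -n #btrbk-v0.27.1\n# Do not edit this file\nTYPE=raw\n"
    "FILE=data.20221113T020053+0100.btrfs\n"
    "RECEIVED_UUID=80673bd6-c0e3-be45-a141-37d396f5146e\n"
    "RECEIVED_PARENT_UUID=b0030025-e85a-b94e-b39d-3978d85df276\n",
]

infos = [parse_info(sample) for sample in SAMPLES]
print(chain_of("data.20221113T020053+0100.btrfs", infos))
```

A dump whose sidecar has no RECEIVED_PARENT_UUID ends the walk, so every chain returned this way starts at a full send.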

calestyo avatar Nov 13 '22 15:11 calestyo

@digint Once the above would work... what are your general plans on how this should be ultimately handled in btrbk?

  1. Will it always be that people have to set up 1 cronjob/systemd-timer for the non-incremental sends and 1 cronjob/systemd-timer for the incremental sends, by themselves?
  2. Will some functionality be provided within btrbk.conf that allows specifying when which type should be made?

If (1) then I think I could help with improving the systemd units to more easily allow this. We could either:

  • Have two pairs of .service and .timer and tell the users how to use/refine them
  • or have only 1 btrbk.service for both types and one btrbk.timer out of the box, and the users only have to copy the .timer to another btrbk-incremental.timer. With:
Unit=btrbk.service

we could make sure that even the btrbk-foo.timer uses the btrbk.service. Then we'd only need to make sure that in that case it uses --override incremental=no. I haven't tried yet, but I hope that in the btrbk-incremental.timer unit we could set e.g. an envar that the common .service could read out.

calestyo avatar Nov 14 '22 00:11 calestyo

I tried the same with version 0.32.5, but even with changes to incremental_prefs ... it never seems to delete any chains that are completely obsolete (i.e. all of the backups are already outside the retention policy).

I guess if this could be made working, we'd have a fully functional raw mode with rotated incremental backups, where every so often a full one is made.
And it would always be the duty of the user to make the incremental/full backups (via different sets of cron jobs or systemd timers). And I think that would fit the btrbk mode quite well, where the config and btrbk itself only ever control the retention of snapshots/backups, but it's always the cronjob/timer that decides when these are made.

calestyo avatar Nov 16 '22 19:11 calestyo

I have implemented incremental raw deletion in the delete-incremental-raw branch: the scheduler will make sure that dependent parents are never deleted, and they are printed in the scheduler results as "preserve forced: parent of preserved raw target":

# ./btrbk  -c /tmp/btrbk_test/btrbk.conf run -n -S
[...]
BACKUP SCHEDULE
---------------
ACTION  SUBVOLUME                                                  SCHEME  REASON
delete  /tmp/btrbk_test/raw_target/svol.20221118T1902.btrfs    -       -
delete  /tmp/btrbk_test/raw_target/svol.20221118T1902_1.btrfs  -       -
delete  /tmp/btrbk_test/raw_target/svol.20221118T1903.btrfs    -       -
delete  /tmp/btrbk_test/raw_target/svol.20221118T1912.btrfs    -       -
-       /tmp/btrbk_test/raw_target/svol.20221118T1944.btrfs    -       preserve forced: parent of preserved raw target
-       /tmp/btrbk_test/raw_target/svol.20221119T1332.btrfs    -       preserve forced: parent of preserved raw target
-       /tmp/btrbk_test/raw_target/svol.20221119T1551.btrfs    -       preserve min: latest
[...]
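The rule behind "preserve forced" can be sketched in a few lines of Python. This is not the btrbk implementation, only an illustration of the idea: every ancestor of a preserved raw target is kept as well. The function name `forced_parents` is hypothetical, and the parent links below are assumed for the example, since the output above does not show them.

```python
from typing import Dict, Optional, Set


def forced_parents(preserved: Set[str], parent_of: Dict[str, Optional[str]]) -> Set[str]:
    """Return backups that are kept only because a preserved backup depends on them."""
    forced: Set[str] = set()
    for name in preserved:
        parent = parent_of.get(name)
        # Walk up until a full send (None) or an ancestor that is already kept.
        while parent is not None and parent not in preserved and parent not in forced:
            forced.add(parent)
            parent = parent_of.get(parent)
    return forced


# Assumed parent links (None marks a full send); the thread does not show them.
parent_of = {
    "svol.20221118T1902": None,
    "svol.20221118T1902_1": "svol.20221118T1902",
    "svol.20221118T1903": "svol.20221118T1902_1",
    "svol.20221118T1912": "svol.20221118T1903",
    "svol.20221118T1944": None,
    "svol.20221119T1332": "svol.20221118T1944",
    "svol.20221119T1551": "svol.20221119T1332",
}
preserved = {"svol.20221119T1551"}  # corresponds to "preserve min: latest"
forced = forced_parents(preserved, parent_of)
deletable = set(parent_of) - preserved - forced

print(sorted(forced))
print(sorted(deletable))
```

With these assumed links, the two ancestors of the latest backup end up in `forced` and the four older dumps end up in `deletable`, which matches the shape of the schedule above.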

> And it would always be the duty of the user to make the incremental/full backups (via different sets of cron jobs or systemd timers). And I think that would fit the btrbk mode quite well, where the config and btrbk itself only ever control the retention of snapshots/backups, but it's always the cronjob/timer that decides when these are made.

This is the obvious way rotation can be achieved. I'm aware that while it "fits the btrbk mode", this is a bit cumbersome to set up.

Because of this, implementing any configurable trigger for forcing non-incremental backups (e.g. monthly, or after N incremental) is hard to implement and somehow counter-intuitive, as it does not really fit into this pattern. Will need to put some more thoughts into this...

digint avatar Nov 19 '22 15:11 digint

😍 Will test it tonight.

> Because of this, implementing any configurable trigger for forcing non-incremental backups (e.g. monthly, or after N incremental) is hard to implement and somehow counter-intuitive, as it does not really fit into this pattern. Will need to put some more thoughts into this...

As said above, I'd solve that by providing two systemd timers and a short explanation of how to override the times there... actually I found that really simple.
And it would place the information on when things run visibly where it belongs: systemd.

calestyo avatar Nov 19 '22 15:11 calestyo

> Will test it tonight.

Apart from the findings I've reported in some other issues (which, however, were likely not directly related to this anyway), it seems to run fine so far.

Already expired backups are kept back to the last full dump, as long as any of the incremental ones are still required by the policy.

I'll let it continue to run for a while and report back should any issues arise... but looks like this could be merged :-)

calestyo avatar Nov 20 '22 23:11 calestyo

Still looks good.

But there may be some more complications: with incremental dumps in mind, things actually get quite tricky.
Let's assume an example where we just have retention for <n>d, where we do just one daily backup (no manual ones), and where these fall exactly at the time (e.g. 00:00) at which btrbk would consider them the daily one.

If we now have 4d retention and do a full one every 8d, then after the first 4 days, we'd need to keep (incremental) dumps for another 4 days until we get a new full one, but even then, we cannot rotate the old ones away, because we need to keep 4d, so we have another 4 days until we can finally throw away the first cycle.
It's quite late already here, but I guess the formula would be: at most we keep <full-backup-period> + <retention period>, thus one <full-backup-period> extra.

If we now have 8d retention and do a full one every 4d, then after the first 2x 4 days, we'd have two cycles of full+incremental backups, but it takes us another 4d till we can throw away the 1st (i.e. the oldest) one of the two previous.
So again, as above we always keep one <full-backup-period> extra.
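A quick simulation makes the estimate concrete. It is a sketch under simplifying assumptions that are mine, not btrbk's: exactly one dump per day, a full dump on every day divisible by the full-backup period, every other dump incremental on the previous day's, and a retention of N days meaning "today plus the N-1 days before". The function name `peak_kept` is hypothetical.

```python
def peak_kept(retention_days: int, full_period_days: int, horizon_days: int = 365) -> int:
    """Largest number of dumps that must exist at once over the simulated horizon."""
    peak = 0
    for today in range(horizon_days):
        # Oldest dump that the retention policy itself still wants.
        oldest_wanted = max(0, today - retention_days + 1)
        # That dump needs every dump back to the full one that starts its cycle.
        cycle_start = (oldest_wanted // full_period_days) * full_period_days
        peak = max(peak, today - cycle_start + 1)
    return peak


print(peak_kept(4, 8))  # 4d retention, full dump every 8d
print(peak_kept(8, 4))  # 8d retention, full dump every 4d
print(peak_kept(4, 1))  # full dumps only
```

Under this model the peak is retention + full period - 1 dumps in both examples (11), compared with just the retention count (4) when every dump is full. Whether the true number is one higher depends on how the retention boundary is counted, so treat the exact figure as approximate.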

Not sure what happens if we'd have just e.g. weekly retention, and the full dumps don't fall on the week starts. Probably as above + some extra shift time.

What if one has a more complex retention policy, e.g. something like 4d 2w 2m?

What would that mean? AFAIU, on 2022-07-31 (which was a Sunday) at 01:00 one would have more or less backups from these dates:

  • 4 days back: 2022-07-31 00:00, 2022-07-30 00:00, 2022-07-29 00:00, 2022-07-28 00:00
  • 2 weeks back: 2022-07-25 00:00, 2022-07-18 00:00
  • 2 months back: 2022-07-01 00:00, 2022-06-01 00:00

Which child parent relationships would one want? I guess something like:

  • 4 days back: 2022-07-31 00:00 child of 2022-07-30 00:00; 2022-07-30 00:00 child of 2022-07-29 00:00; 2022-07-29 00:00 child of 2022-07-28 00:00; 2022-07-28 00:00 child of <day before, which needs to be kept around extra, and so on till 2022-07-25 00:00>
  • 2 weeks back: 2022-07-25 00:00 full; 2022-07-18 00:00 full
  • 2 months back: 2022-07-01 00:00 full; 2022-06-01 00:00 full

or maybe rather?:

  • 4 days back: 2022-07-31 00:00 child of 2022-07-25 00:00; 2022-07-30 00:00 child of 2022-07-25 00:00; 2022-07-29 00:00 child of 2022-07-25 00:00; 2022-07-28 00:00 child of 2022-07-25 00:00
  • 2 weeks back: 2022-07-25 00:00 full; 2022-07-18 00:00 full
  • 2 months back: 2022-07-01 00:00 full; 2022-06-01 00:00 full

This would right now of course not be possible, because it always picks the closest one as parent.

Or if one would want more backups in the weekly and/or monthly range, one might want to use incremental dumps there as well... e.g.

  • 4 days back: 2022-07-31 00:00 child of 2022-07-25 00:00; 2022-07-30 00:00 child of 2022-07-25 00:00; 2022-07-29 00:00 child of 2022-07-25 00:00; 2022-07-28 00:00 child of 2022-07-25 00:00
  • 4 weeks back: 2022-07-25 00:00 child of 2022-07-18 00:00; 2022-07-18 00:00 full; 2022-07-11 00:00 child of 2022-07-04 00:00; 2022-07-04 00:00 full
  • 6 months back: 2022-07-01 00:00 child of 2022-06-01; 2022-06-01 00:00 full; 2022-05-01 00:00 child of 2022-04-01; 2022-04-01 00:00 full; 2022-03-01 00:00 child of 2022-02-01; 2022-02-01 00:00 full

So quite some care would need to be taken, that one doesn't need to keep too many dumps, just because of dependencies.

calestyo avatar Nov 22 '22 05:11 calestyo

Just one idea for the records...

Imagine one does a weekly full backup, e.g. on Mondays, and wants a retention policy like 4d 3w 2m (also taking Monday as the start of the week).

If I'm not wrong, one will, because of incremental backups and their parent dependencies, have up to 7 + 4 days of dumps (just for the daily backups), because only after the 2nd week's full dump could older ones be deleted, but then it takes 4 further days till the 4d is fulfilled.

If there are not many writes, then I'd say keeping these "extra" dailies (11 vs. the desired 4) shouldn't be a big problem. If there are, however, a lot of writes, then it may become more expensive to keep the extra 7 days of incremental changes than to have just 4 days of full backups (and not do incremental at all).

I guess there's not really that much btrbk could do about it... unless perhaps doing some bookkeeping, in order to tell people what would be more efficient for them.

The idea here would be, that it keeps cache information about how much storage the incremental dumps cost, vs. how much full ones would cost (for the time of the retention periods).

So it could tell the user - for the current IO behaviour - that it would be cheaper for him to do full dumps only instead of incremental ones (with the penalty of having to keep much more).
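The comparison such bookkeeping would make boils down to simple arithmetic. Here is a rough sketch with made-up sizes; the function name `cheaper_scheme` and all numbers are hypothetical, and it assumes every full dump has the same size and every incremental dump has the same size, which real data will not do.

```python
from typing import Tuple


def cheaper_scheme(
    full_size: float,
    delta_size: float,
    kept_fulls: int,
    kept_incrementals: int,
    wanted: int,
) -> Tuple[str, float, float]:
    """Compare worst-case storage of an incremental scheme against full dumps only."""
    incremental_cost = kept_fulls * full_size + kept_incrementals * delta_size
    full_only_cost = wanted * full_size
    winner = "incremental" if incremental_cost < full_only_cost else "full-only"
    return winner, incremental_cost, full_only_cost


# Weekly full dumps with 4d retention: up to 11 dailies kept, 2 of them full, 9 incremental.
# Sizes are in GiB and purely illustrative.
print(cheaper_scheme(full_size=100, delta_size=5, kept_fulls=2, kept_incrementals=9, wanted=4))
print(cheaper_scheme(full_size=100, delta_size=30, kept_fulls=2, kept_incrementals=9, wanted=4))
```

With small daily deltas (5 GiB) the incremental scheme needs 245 GiB against 400 GiB for four full dumps; with large deltas (30 GiB) it needs 470 GiB and the full-only scheme wins, which is exactly the case the bookkeeping would flag.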

calestyo avatar Nov 27 '22 16:11 calestyo

A 2nd idea that builds right on the previous example:

Assume we'd conclude, that the incremental is in fact cheaper (in terms of space).

Then we'd now have:

  • 4 dailies + 7 extra dailies
  • but also the 3w 2m needs to be fulfilled

If the times for the full dumps are wisely chosen, then we already should have full dumps at all these weekly and monthly backups, so nothing extra we'd need to keep for them (to have a complete chain).

But this also means, that we'd have to keep 3+2 full dumps, which again is "efficient" if there were maaaany writes - but less efficient, if there were only few (which is typical for stable hosts like servers, at least for their system disks).

One way (which I already mentioned before) would be to make incremental dumps with different parents. E.g. the one for 2 months ago would be a full one, and the one for 1 month ago would be an incremental one, based on the full one from 2 months ago.
The problem obviously is: a great many extra dumps that we need to keep until things can get rotated away.

So an idea here would be - though I guess it would be pretty complex to automatise this - to use "reverse" incremental dumps:
E.g. for the monthlies: if we create a new one that is a monthly one, then we'd have 2 previous monthlies. The oldest one from 2 months ago can now in principle be deleted, but the one from 1 month ago (which would be a full dump) would need to be kept for the retention policy.
To save space, create a new "2nd" backup for the one from 1 month ago, but this time as a "reverse" incremental based on the new monthly just created. Once that has finished, we can delete the much bigger full dump for last month.

Obviously, it again depends on the IO scenario, whether this is (space wise) more efficient or not.

And it becomes really complex, if one wants to do it for many months back:

  • either one could make the reverse incremental dumps form another chain,...
  • or one could make each a child of the "current full one".

In any case, as soon as the "current full one", on which they're all based, would need to be rotated away, one would need to rewrite all. But since the current full one is likely always newer, that should in practise never happen, and the reverse incremental ones should go, without any need for rewrites.

calestyo avatar Nov 27 '22 16:11 calestyo

> Because of this, implementing any configurable trigger for forcing non-incremental backups (e.g. monthly, or after N incremental) is hard to implement and somehow counter-intuitive, as it does not really fit into this pattern. Will need to put some more thoughts into this...

@digint what do you think of my proposal from https://github.com/digint/btrbk/issues/474#issue-1226691072? I think we just need to provide some option(s) to tell btrbk how to choose the parent of a given backup. If we use scheduler logic for that, we can end up with a nice "tree" of backups which minimizes duplicated storage. It doesn't seem too counter-intuitive to me.
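To make the idea a bit more tangible, here is one possible reading of "scheduled parents" as a Python sketch. Everything in it is an assumption for illustration: the level ranking, the rule "parent = most recent earlier backup of a strictly higher level", and the names `level_of` and `pick_parent`. It is not an existing btrbk option.

```python
from datetime import datetime
from typing import List, Optional


def level_of(ts: datetime) -> int:
    """Assumed ranking: 0 = hourly, 1 = daily, 2 = weekly, 3 = monthly."""
    if ts.hour != 0:
        return 0
    if ts.day == 1:
        return 3
    if ts.weekday() == 0:  # Monday is taken as the start of the week
        return 2
    return 1


def pick_parent(ts: datetime, existing: List[datetime]) -> Optional[datetime]:
    """Most recent earlier backup of a strictly higher level, or None for a full send."""
    level = level_of(ts)
    candidates = [e for e in existing if e < ts and level_of(e) > level]
    return max(candidates) if candidates else None


existing = [
    datetime(2022, 6, 1), datetime(2022, 7, 1),    # monthlies
    datetime(2022, 7, 18), datetime(2022, 7, 25),  # weeklies (Mondays)
    datetime(2022, 7, 28), datetime(2022, 7, 29), datetime(2022, 7, 30),  # dailies
]

print(pick_parent(datetime(2022, 7, 31), existing))  # daily, hangs off the latest weekly
print(pick_parent(datetime(2022, 7, 25), existing))  # weekly, hangs off the latest monthly
print(pick_parent(datetime(2022, 7, 1), existing))   # monthly, no higher level: full send
```

With this rule the tree is at most as deep as the number of schedule levels, so preserving a backup never forces more than a handful of ancestors to stay around.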

sbrudenell avatar Jan 04 '23 05:01 sbrudenell

@calestyo I appreciate your involvement, but this issue was about a specific proposal to solve this problem. Now 90% of the text is your tangential thoughts and walls of paste. Could you please consider creating new issue(s) and/or learning about collapsible markdown sections?

sbrudenell avatar Jan 04 '23 05:01 sbrudenell

I'm afraid I did not yet have much time to wrap my head around this. I'm hesitating a bit to implement too much special logic for incremental raw targets:

  • It's probably not straightforward to implement (the core functionality is to pick meaningful parents for "normal" non-raw backups, and everything else probably needs quite some rewrite)
  • Possibly many different use-cases to be considered (tbh I did not yet thoroughly read all comments here, but it seems that the requirements of @sbrudenell and @calestyo (and mine) differ quite a bit, and while the proposal in OP makes sense to me, it might not be enough for all cases)

I definitely see the need of having some sort of special parent-selection for raw targets in order to make this feature useful, and @sbrudenell 's proposal of "scheduled parents" makes sense to me. I'll have a look into it after I finished the "action-cp" stuff I'm working on (sadly I don't have much time to work on btrbk these days).

At least the deletion of incremental raw targets seems to work fine; this will go into the next major btrbk release (along with action-cp).

digint avatar Jan 09 '23 00:01 digint

Merged delete-incremental-raw branch into master: 61691abbfc8fc1351908ce8a97d5e2908dc8d9b0

Leaving issue open until included in release.

digint avatar Jun 07 '23 21:06 digint