adlibre-backup
Enable weekly, monthly, quarterly and yearly retention schema
- new config properties EXPIRY_WEEK, EXPIRY_MONTH, EXPIRY_QUARTER and EXPIRY_YEAR
- longest period wins (when first day of year is also first day of week then EXPIRY_YEAR is selected)
- support infinite retention with value 0
- explicitly use /bin/bash instead of /bin/sh to avoid script errors when bash is not default shell
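The "longest period wins" rule above could be sketched roughly as follows. This is only an illustration, not the code from the PR: the `EXPIRY_DAY` name, the sample values, the choice of Monday as first day of the week, and the `select_expiry` helper are all assumptions, and the day-of-week lookup relies on GNU `date -d`.

```bash
#!/usr/bin/env bash
# Hypothetical settings in days: 0 means "keep forever", empty means "rule disabled".
EXPIRY_DAY=8                 # assumed name for the plain daily retention
EXPIRY_WEEK=$((7 * 4))
EXPIRY_MONTH=$((31 * 6))
EXPIRY_QUARTER=""
EXPIRY_YEAR=0

# Print the retention (in days) for a backup taken on date $1 (YYYY-MM-DD).
select_expiry() {
    local day=$1 y m d dow dom month expiry
    IFS=- read -r y m d <<< "$day"
    dom=$((10#$d))                   # force base 10 so "08" is not read as octal
    month=$((10#$m))
    dow=$(date -d "$day" +%u)        # GNU date; 1 = Monday (assumed week start)
    expiry=$EXPIRY_DAY
    # Test the shortest period first so the longest matching, non-empty rule wins.
    if [[ $dow -eq 1 && -n $EXPIRY_WEEK ]]; then expiry=$EXPIRY_WEEK; fi
    if [[ $dom -eq 1 && -n $EXPIRY_MONTH ]]; then expiry=$EXPIRY_MONTH; fi
    if [[ $dom -eq 1 && $(( (month - 1) % 3 )) -eq 0 && -n $EXPIRY_QUARTER ]]; then
        expiry=$EXPIRY_QUARTER
    fi
    if [[ $dom -eq 1 && $month -eq 1 && -n $EXPIRY_YEAR ]]; then expiry=$EXPIRY_YEAR; fi
    echo "$expiry"
}
```

With these sample values, `select_expiry 2018-01-01` (a Monday and the first day of the year) prints `0`, i.e. the yearly rule with infinite retention is selected over the weekly one.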
Nice bit of code. I previously decided not to implement complex expiry logic because it offers little benefit with CoW filesystem snapshots: the overhead is so low that keeping many years of daily backups isn't a big deal. (But you've written it, and it seems fairly sane, so I'm prepared to merge it.)
Corner cases are harder to deal with when you have a complex expiry method. What happens if the backup isn't run on a particular day, or fails? Do we end up with holes in the backup rotation? Sophisticated expiry schemes can handle these cases.
Can you please provide a bit of documentation regarding the config options? E.g. explain whether the expiry options are additive, i.e. does your default config keep 8 days of dailies plus 25 weeklies?
Also, I think if we're going to ditch Bourne shell compatibility we should change the hash bang to the more portable `#!/usr/bin/env bash`. Can you please add this to your PR?
Thanks
Hi Andrew,
I agree that under error conditions this very simple implementation can't guarantee that you will not have holes in the retention.
As nobody is forced to execute the prune script, I think the proposed solution allows everyone to pick a strategy according to their personal level of paranoia ;-)
Unfortunately I can't say yet whether btrfs's COW is sufficient to ensure that I will never run into free disk space problems. I've just set up my first backup box with btrfs -- so still in learning mode.
Another reason why I expect that getting rid of (hopefully) unnecessary snapshots might become useful is that it makes finding distinct versions of a given file faster. Assuming I do multiple backups a day and keep them forever, the time it takes find to go through all snapshot trees to find different versions of a file might become quite long. For me, the ability to quickly find and identify unwanted/erroneous modifications/deletions of files somewhere in the backup history is an important use case. I'm still hoping to find some btrfs feature that would allow me to identify distinct file versions across many snapshots more efficiently than "find" (btrfs should know when things have changed), but so far I haven't.
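To make the use case concrete, here is a minimal sketch of the brute-force approach whose cost grows with the number of snapshots. It assumes a hypothetical layout of `<snapshot_root>/<snapshot_name>/<relative_path>` with snapshot names that sort chronologically; the `distinct_versions` helper is invented for illustration and uses no btrfs-specific change tracking, so it still reads the file in every snapshot.

```bash
#!/usr/bin/env bash
# Print each snapshot in which the content of a file differs from the
# preceding snapshot, together with its checksum (or "absent").
# Usage: distinct_versions <snapshot_root> <relative_path>
distinct_versions() {
    local root=$1 rel=$2 snap sum prev=""
    for snap in "$root"/*/; do            # glob results are sorted by name
        [[ -d $snap ]] || continue        # skip the literal pattern if nothing matched
        if [[ -f ${snap}${rel} ]]; then
            sum=$(sha256sum < "${snap}${rel}" | cut -d' ' -f1)
        else
            sum="absent"                  # file missing (e.g. deleted) in this snapshot
        fi
        if [[ $sum != "$prev" ]]; then
            printf '%s %s\n' "$(basename "$snap")" "$sum"
            prev=$sum
        fi
    done
}
```

Every additional snapshot adds one more checksum computation per file, which is why pruning unneeded snapshots shortens this kind of search.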
I've given the documentation a try by adding a section to the README.md -- although native speaker review might be needed.
You might have misinterpreted my default expiry settings. I've just used some $(eval..) expressions to show the number of days as multiples for week (7 * w) or year (365 * y) ranges. But the values are not additive: among all non-empty configuration options that match a given day, the one for the longest time period gets applied.
I've changed /bin/bash to /usr/bin/env bash
Let me know what you think of my changes.
Best, Andreas
I've given this some fresh thought, and I think the way I'd like to see this implemented is via arbitrary tags. The tags would be added to the backup when it is taken, and would then be used to filter the action of the backup pruner.
To implement this the logic would go into the backup runner. It would be responsible for placing tags on backups.
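As a rough sketch of how such tagging might look (this is not an existing adlibre-backup interface): the tag names, the retention values, and the helpers `tags_for_date`, `keep_days_for` and `can_prune` are all hypothetical, and the weekday lookup assumes GNU `date -d`.

```bash
#!/usr/bin/env bash
# Runner side: choose tags for a backup taken on date $1 (YYYY-MM-DD).
tags_for_date() {
    local day=$1 y m d tags="daily"
    IFS=- read -r y m d <<< "$day"
    if [[ $(date -d "$day" +%u) -eq 1 ]]; then tags+=" weekly"; fi   # Monday
    if [[ $((10#$d)) -eq 1 ]]; then tags+=" monthly"; fi             # 1st of month
    echo "$tags"
}

# Hypothetical retention per tag, in days; unknown tags get 0.
keep_days_for() {
    case $1 in
        daily)   echo 8 ;;
        weekly)  echo 28 ;;
        monthly) echo 186 ;;
        *)       echo 0 ;;
    esac
}

# Pruner side: succeed if a backup with tags $1 and age $2 (in days) may be
# removed, i.e. it has outlived the retention of every tag it carries.
can_prune() {
    local tags=$1 age=$2 tag
    for tag in $tags; do
        if (( age <= $(keep_days_for "$tag") )); then return 1; fi
    done
    return 0
}
```

In this sketch the runner only decides which tags apply, while the pruner needs nothing but the tag list and the backup's age, so a backup taken on an unusual day simply carries fewer tags.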
See #21 for a simpler approach to flexible retention.