Iceberg papercut
When we do not enter info and press ok :)
can argparse do that?
Actually no, but we could use the syntax --keep n# number if we instruct argparse to consume two arguments. Maybe add_argument('--keep', nargs=2, type=str, help...) could work. Then we do the parsing as for --keep-within.
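A minimal sketch of that idea, using only the standard `argparse` module. The program name, the `metavar` names, and the use of `action="append"` (so the option can be given more than once) are my assumptions, not existing borg code:

```python
import argparse

# Hypothetical stand-alone parser; borg's real parser is set up differently.
parser = argparse.ArgumentParser(prog="prune-sketch")
parser.add_argument(
    "--keep",
    nargs=2,                         # consume exactly two values per occurrence
    action="append",                 # allow several --keep options
    default=[],
    metavar=("INTERVAL", "COUNT"),
    help="keep the latest archive of each INTERVAL, up to COUNT archives",
)

args = parser.parse_args(["--keep", "1w", "4", "--keep", "1m", "6"])
print(args.keep)  # [['1w', '4'], ['1m', '6']]
```

Both values arrive as strings, so the interval and the count still need to be validated afterwards.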
Yeah, one new option, 2 arguments:
- interval specification (must be able to represent all intervals we use, 1s .. <N>y)
- how many "last backups" to keep in the given intervals
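To make the two arguments concrete, here is a rough sketch of how they could be validated. `KeepRule`, `parse_keep` and the accepted unit letters are hypothetical names and choices for illustration only; the final set of units would have to match whatever intervals borg actually supports:

```python
import re
from typing import NamedTuple


class KeepRule(NamedTuple):
    size: int    # how many units make up one interval, e.g. 5 in "5d"
    unit: str    # single-letter unit, e.g. "d"
    count: int   # how many archives to keep


# Assumed unit letters; adjust to the real list of supported intervals.
UNITS = {"h", "d", "w", "m", "y"}
_SPEC = re.compile(r"(\d+)([a-z])")


def parse_keep(interval: str, count: str) -> KeepRule:
    """Turn the two raw strings from the command line into a KeepRule."""
    match = _SPEC.fullmatch(interval)
    if match is None or match.group(2) not in UNITS:
        raise ValueError(f"invalid interval specification: {interval!r}")
    size = int(match.group(1))
    n = int(count)  # raises ValueError for non-numeric input
    if size < 1 or n < 1:
        raise ValueError("interval size and count must be positive")
    return KeepRule(size, match.group(2), n)


print(parse_keep("5d", "4"))  # KeepRule(size=5, unit='d', count=4)
```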
That would generalize an idea that currently takes a lot of separate option names.
Would you like to do a PR against master for that?
I think that people new to borg don't realize how space-efficient the historical backups are, and so try to overly optimize the pruning policy. In practice, the current options seem to work quite well, so I would be wary about adding further complexity to a topic that is already hard to explain.
@jdchristensen, I think you're absolutely right about how little space the backups take. Nevertheless, with a better explanation, the syntax I proposed could simplify both the code and the comprehension, in addition to providing great flexibility.
I propose here a comment for the option --keep #s N:
Borg splits the past into chunks as specified by #s (e.g. 1w means 1 chunk = 1 week) and keeps the latest backup of each chunk until the count of N is reached. Here, # is a number and s is a character in the set {h,d,m,w,y}.
Maybe my explanation is awful; in that case, I think the one that @ThomasWaldmann gave is also simple and straight to the point.
The flexibility in the pruning policy could help keep the archive list clear. Take for example the situation discussed above: to get the same results I would have to keep 12 backups per year instead of 4. If you multiply this by the number of different backups, it ends up creating many more archives.
In the borg context, we use "chunk" with a completely different meaning, so maybe rather say "interval". A usual shorthand for a time interval is dt (delta t).
I have thought about --keep #s N for some time. --keep 1m 6 would clearly be identical to --keep-monthly 6, but what is the meaning of, let's say, --keep 5d 4?
- "Starting today, keep every fifth daily backup (TODAY, TODAY-5, ... TODAY-15)"?
- Or does it mean "TODAY-(5,10,15,20)"?
- Or does it mean "Keep the backup of every day whose day number is divisible by 5"?
I probably wasn't as clear as I had hoped. I'll try to visualize the concept with your example: --keep 5d 4.
This is what I meant by time intervals (thanks @ThomasWaldmann):
| Today -1 -2 -3 -4 | -5 -6 -7 -8 -9 | -10 -11 -12 -13 -14 | -15 -16 -17 -18 -19 | ...
Let's say you have the following backups:
| Today -2 -3 | -6 -7 -9 | -10 -11 -13 -14 | -16 -17 -18 -19 | ...
Then you should keep the latest for each dt, so: Today, -6, -10, -16
Provided that you have all the daily backups it would keep Today, Today-5, ..., Today-15.
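The selection rule from the example can be sketched in a few lines. This is only an illustration of the proposal as described in this thread, not borg code: archives are reduced to their age in whole days, and `keep_latest_per_interval` is a hypothetical helper name.

```python
def keep_latest_per_interval(ages_in_days, interval_days, count):
    """Keep the newest backup of each interval until `count` are kept.

    ages_in_days: backup ages in whole days (0 means today).
    Intervals are counted back from today: ages 0..interval_days-1
    form the first interval, and so on.
    """
    if count <= 0:
        return []
    kept = []
    seen_intervals = set()
    for age in sorted(ages_in_days):          # newest first
        interval = age // interval_days
        if interval in seen_intervals:
            continue                          # interval already has its backup
        seen_intervals.add(interval)
        kept.append(age)
        if len(kept) == count:
            break
    return kept


# The backups from the example above, with --keep 5d 4:
backups = [0, 2, 3, 6, 7, 9, 10, 11, 13, 14, 16, 17, 18, 19]
print(keep_latest_per_interval(backups, 5, 4))    # [0, 6, 10, 16]

# With a backup on every day:
print(keep_latest_per_interval(range(20), 5, 4))  # [0, 5, 10, 15]
```

Note that in this sketch an interval without any backup is simply skipped and does not count towards N, which follows the "until the count of N is reached" wording.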
Maybe the visual example could also be included in the command description.
There's also some subtlety here that might not be obvious. For example, --keep-weekly uses Monday-Sunday weeks, --keep-monthly uses calendar months and --keep-yearly uses calendar years. So the new --keep 7d 1 will probably be different from --keep 1w 1. I think it makes sense that these options don't count back from the current day, since it makes the pruning decisions stable. On the other hand, --keep-within 1m counts back 31 days from the current day.
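A small sketch of why the two could differ. The dates are made up, ISO weeks from the standard library are used as a stand-in for Monday-Sunday weeks, and "counting back from today" is assumed to be the meaning of 7d; neither function is taken from borg:

```python
from datetime import date


def rolling_interval(today, backup, interval_days):
    """Index of the interval when counting back from today (0 = newest)."""
    return (today - backup).days // interval_days


def calendar_week(backup):
    """(ISO year, ISO week) of a date; ISO weeks run Monday to Sunday."""
    iso = backup.isocalendar()
    return (iso[0], iso[1])


today = date(2024, 1, 10)    # a Wednesday
sunday = date(2024, 1, 7)    # the Sunday before

# Counting back in 7-day steps, both dates fall into the same interval ...
print(rolling_interval(today, sunday, 7), rolling_interval(today, today, 7))  # 0 0
# ... but they belong to different calendar weeks.
print(calendar_week(sunday), calendar_week(today))  # (2024, 1) (2024, 2)
```

The rolling intervals also shift by one day every day, while the calendar week of a given backup never changes, which is what keeps the pruning decisions stable.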
Fixed in 1.4-maint. ^^