borg2: consider dropping --exclude and --exclude-from
In borg 1.4 there are two distinct sets of options that can be used to define excludes:
--exclude and --exclude-from
AND
--pattern or --patterns-from
To cite borg documentation:
A more general and easier to use way to define filename matching patterns exists with the --pattern and --patterns-from options. Using these, one may specify the backup roots, default pattern styles and patterns for inclusion and exclusion.
From a user's point of view: --exclude style seems to provide only a subset of --pattern style. Both (--exclude and --pattern) make use of patterns - but have different defaults - which can be confusing. It is not clear why --exclude style exists.
I just wondered if borg2 would provide a good opportunity to drop --exclude and --exclude-from and to only use --pattern and --patterns-from in the future. But maybe I am missing important differences between the two sets of options.
@goebbe Yeah, good idea, needs to be considered.
Some additional points:
- users (and borg UIs/wrappers, like borgmatic, pika backup and vorta) likely use both. so dropping the --exclude is possible at a breaking release, but causes some work for these users/developers.
- --exclude[-from] has simpler argument / file content, as it is already clear that the given patterns are excludes ONLY and also of some specific matching style, while for --pattern[s-from] this has to be specified per pattern or once in the input file.
- quite some users are not too happy with the excludes/patterns and their docs. it's quite complex. so the question is also whether it should be replaced completely.
With respect to the docs, for me, the main confusion came from the mixing of --exclude[-from] and --pattern[s-from], the different defaults, and (for beginners) the overwhelming number of options and possibilities. Once I settled on only using --pattern[s-from], most of the confusion was gone. Well-commented examples are very helpful.
I could offer some help rewriting this part of the docs.
Some thoughts on the confusing parts in the current docs/ implementation:
- the syntax with respect to the path (relative to the root?) can be confusing - I started to always put the full path and this works well (for me).
- users want to add an include (in an existing --exclude-from)
- the first "/" in a path should be omitted - this seems to come from the internals - but may be uncommon/strange for a user? There are all these "A leading path separator is always removed." comments that may be hard for a newcomer to understand.
- path that includes all files and subfolders vs. path that includes only subfolders?
- defining roots in the --pattern[s-from] when the root is already defined in the borg create line can lead to errors/confusion
- I wonder if there are heavy users of the path full-match style (pf:). I understand the intention, but wonder if the gains are worth the implementation/maintenance/mental overhead (compared to the use of ! )
- Are there use-cases for fm: that are not covered by sh: ?
Excludes / includes / patterns seem to be an important concept for a backup specification (create?). I would rather dedicate a section of its own to it instead of putting it into Miscellaneous Help.
Here are some side-by-side examples with equivalent matches using --exclude[-from] and --pattern[s-from] using the fm: matcher:
In borg 1.4 the following should be equivalent:
borg create --exclude home/bobby/trash --exclude 'home/bobby/*junk' --exclude 're:^(dev|proc|run|sys|tmp)' repo::arch /home/bobby
and
borg create --pattern=-fm:home/bobby/trash --pattern='-fm:home/bobby/*junk' --pattern='-re:^(dev|proc|run|sys|tmp)' repo::arch /home/bobby
Note: Depending on the required match, with --pattern the default sh: matching might often be sufficient - in this case the fm: can be omitted from the syntax.
Equivalents, using pattern-files:
All examples are without R roots (source):
With --exclude-from:
home/bobby/trash
fm:home/bobby/*junk
re:^(dev|proc|run|sys|tmp)
Note: An explicit fm: per line is not necessary here, since fm: is the default matcher with --exclude-from (the example with an explicit fm: per line has been included because this style is used by e.g. Vorta).
Equivalent, using --patterns-from and setting fm as the default at the start of the file:
P fm
- home/bobby/trash
- home/bobby/*junk
- re:^(dev|proc|run|sys|tmp)
Or equivalently (again using patterns-from, but setting fm: per line):
- fm:home/bobby/trash
- fm:home/bobby/*junk
- re:^(dev|proc|run|sys|tmp)
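Since the mapping shown above is mechanical, an existing --exclude-from file could probably be converted into a --patterns-from file with a few lines of shell. This is only a minimal sketch: it assumes the exclude file contains nothing but patterns (without leading whitespace), blank lines and # comment lines, and the function name excludes_to_patterns is made up for this example and is not part of borg.

```shell
# Hypothetical helper, not part of borg: convert an --exclude-from file
# into an equivalent --patterns-from file (written to stdout).
excludes_to_patterns() {
    # fm: is the default style for --exclude-from, so declare it as the default here too
    echo 'P fm'
    # keep blank lines and "#" comment lines unchanged, prefix every other line with "- "
    sed -e '/^[[:space:]]*$/b' -e '/^[[:space:]]*#/b' -e 's/^/- /' "$1"
}
```

Usage would be something like `excludes_to_patterns excludes.lst > patterns.lst`, followed by `borg create --patterns-from patterns.lst repo::arch /home/bobby` (the file names are placeholders). Lines that already carry an explicit style such as fm: or re: keep it, which matches the per-line variant above.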
Another difference between --pattern and --exclude is that --exclude has a shortcut -e. Currently, there is no shortcut for --pattern.
With --exclude, instead of the long version:
borg create --exclude home/bobby/trash --exclude 'home/bobby/*junk' --exclude 're:^(dev|proc|run|sys|tmp)' repo::archive /home/bobby
One can use the shorter version:
borg create -e home/bobby/trash -e 'home/bobby/*junk' -e 're:^(dev|proc|run|sys|tmp)' repo::archive /home/bobby
As currently formulated, --exclude is broken. It needs to be either reworked or removed. I vote for reworking it, since it should provide a simple and easy to understand interface for beginners.
The biggest problem is that relative path handling is broken.
- If the root path is absolute then relative patterns are treated as absolute. This is dubious when the root path isn't / but matches the documentation. It could reasonably be expected for relative patterns to be relative to the root path.
- If the root path is relative then absolute patterns are treated as relative to the current directory. This isn't sane under any circumstances and also contradicts the documentation when the root path starts with ../. The documentation states that leading ../ are stripped before matching, which doesn't happen.
Suggestions:
- Change the default pattern to sh:
  - I posted a separate request for dropping fm: support #8659
- Add a matching --include option.
- Replace the '??:' prefix with --sh, --re, and --pp options that set the default for following command line patterns.
  - This allows --exclude-from to safely use untrusted file lists that should only contain plain paths.
- Make relative patterns relative to the root path. If more than one root path is specified then they should be relative to each root independently. (If we exclude 'foo' and our roots are '/home/a' and '/home/b' then we should exclude '/home/a/foo' and '/home/b/foo'.)
- Make absolute patterns actually absolute even when the root path is relative.
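Until something like this exists, the per-root behaviour suggested above could be approximated by expanding each relative pattern once per root before handing the result to borg. A rough sketch, assuming the patterns are plain relative paths without a style prefix and the roots contain no spaces; expand_excludes is a hypothetical helper, not a borg feature.

```shell
# Hypothetical helper, not part of borg: print one exclude per (root, pattern) pair.
# Usage: expand_excludes "root1 root2" pattern...
# The roots are passed as one space-separated argument, so they must not contain spaces.
expand_excludes() {
    roots=$1
    shift
    for root in $roots; do
        for pat in "$@"; do
            # drop a trailing "/" from the root, then join root and pattern
            printf '%s\n' "${root%/}/${pat}"
        done
    done
}
```

For example, `expand_excludes "/home/a /home/b" foo` prints /home/a/foo and /home/b/foo, one per line. The output could be written to a file and passed via --exclude-from; how borg then treats the leading "/" depends on the version and on the path handling discussed above.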