Borg patterns documentation is very confusing
Just to bring this to your attention: as a new user, I find the patterns documentation still confusing in many places, and it might need a rewrite.
The path/filenames used as input for the pattern matching start from the currently active recursion root.
"Currently active" as in the one that was specified on the command line? Or currently active within a patterns file?
You usually give the recursion root(s) when invoking borg
"usually" implying it is possible to give them some other way? How, then? If I put a root into a patterns file but do not include it on the command line, no files from that root are ever included. So it has to be included on the command line? Then why does it say "usually"?
can be either relative or absolute
Relative to what? To current directory? To the path given on the command line? To the "root" path written in a patterns file?
A more general and easier to use way to define filename matching patterns exists with the experimental --pattern and --patterns-from options. Using these, you may specify the backup roots (starting points)
"roots", so does it mean there can be several roots inside the patterns file? What does it mean if there are several roots? Are excludes processed for the last "root" path in that exclude file? Then why are they not relative to that root path? Also, what is the point of specifying a root path ("R") in a patterns file at all? What is its function when you have to specify each include/exclude as an absolute path anyway? The documentation says it can be "relative", but it certainly doesn't look like it is relative to the root path in that patterns file; it might be relative to the current directory? Then why write any root paths at all? Are they simply limiting the scope somehow? In what situation?
Using these, you may specify the backup roots (starting points)
"you may", so one does not have to include them in a patterns file? Is at least one required? If they are not required, and adding them (without adding the same root on the command line) does not include any of the files from that root, then what is the point of them at all??
The only example in the docs uses a root path of "R /". That surely looks like a poor choice of root to demonstrate any abilities (are there any? Is there any reason for these root paths in patterns files at all?), because every path in the entire system is going to be inside this root path...
"currently active" refers to a borg create run, processing one recursion root after the other...
IIRC, they can come from commandline args as well as from R lines in the patterns file. There should be no need to give them twice (cli + patterns), that might be a bug. Please check and file a bug if you can reproduce.
relative means relative to cwd.
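To make "relative to cwd" concrete, here is a minimal Python sketch of which directory a root points at, depending on where the command is started. It only models ordinary POSIX path resolution; `resolve_root` is a hypothetical helper and not part of borg, and the example paths are made up.

```python
import posixpath


def resolve_root(root: str, cwd: str) -> str:
    """Return the directory on disk that `root` refers to when started in `cwd`.

    Hypothetical helper: it models plain POSIX path resolution, not borg's code.
    """
    # posixpath.join discards `cwd` when `root` is already absolute.
    return posixpath.normpath(posixpath.join(cwd, root))


print(resolve_root("projects/site", "/home/alice"))  # relative: resolved against cwd
print(resolve_root("/etc", "/home/alice"))           # absolute: cwd is ignored
```

So the same relative root can point at different directories depending on where the command is run, while an absolute root always points at the same place.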
You can give multiple roots in a patterns file and it means the same as giving multiple roots on the cli (and I guess you can even do both, although that would be a bit messy). The point of having them there is that you can define all stuff relating to "include that, exclude this" there at one place.
It only does what the docs say (at least I hope so), not what you read between the lines or what you wish it did. :-)
That clarifies it a lot, thank you.
There should be no need to give them twice (cli + patterns), that might be a bug
In my experience there is no need to add them twice, but they don't do anything if they are in the patterns file only. If I add a root in the patterns file, it is simply not processed at all... I'll see if I can reproduce it.
An additional idea that would be useful: have the include/exclude rules be relative to the last root specified in the patterns file. That way the patterns could be greatly simplified, because they would not all need to be absolute or relative to cwd; they could be patterns inside that root.
I think I figured it out: just keep everything absolute and then it works.
It still seems that the section could benefit from being written in a more user-friendly way. I just wanted to report this so that you guys at least know that it might be an issue.
The path/filenames used as input for the pattern matching start from the currently active recursion root. You usually give the recursion root(s) when invoking borg and these can be either relative or absolute paths.
"input for the pattern matching"? The input to the program, like the lines inside the patterns file? Or does this refer to the internal flow inside Borg, like how the software is built? In that case it is probably misleading for the end user, who should not concern themselves with how it is built but with how it works.
Then it goes on to explain the difference between absolute and relative paths in so many words, when in reality they have just the same meaning as anywhere else in the OS? Why explain it at all? The confusion arises because, with such a big section explaining this, one assumes there is some new information in it.
This is meant when we talk about "full path" below.
What is meant by "full path", and what does the word "This" refer to here? The absolute path? Why not just call it "absolute"? Or does "full path" mean either the relative or the absolute path, depending on how the root itself looks?
Do we need a space between R/-/+ and the path? Can it be many spaces? It seems it should be specified.
Also, as it is now, does every single pattern apply to every single backup root? Or did I misunderstand the docs? So there is no way to have patterns that apply to only one of several backup roots? That would seem beneficial, but if this is the case, it is not a documentation issue; I was just unsure about it after reading the docs.
Not sure if this should be its own issue.
Path full-match, selector pf: [...] pf:root/file.ext matches root/file.txt only.
https://github.com/borgbackup/borg/blob/29aec480c0eff0629e2384641e44a07b300679ab/docs/man/borg-patterns.1#L95
Is this written correctly? It looks like both of those should say root/file.ext, but in fact they are different: one of them reads root/file.txt.
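For what it's worth, here is a rough Python model of what "full-match" presumably means: the pattern matches exactly one path and nothing else. `pf_match` is a made-up helper that ignores any path normalization borg may do, so treat it as an illustration of the intended wording only.

```python
def pf_match(pattern: str, path: str) -> bool:
    """Simplified model of a `pf:` pattern: plain string equality.

    Hypothetical helper, not borg's matcher; no normalization is applied.
    """
    prefix = "pf:"
    if not pattern.startswith(prefix):
        raise ValueError("expected a pf: pattern")
    return path == pattern[len(prefix):]


print(pf_match("pf:root/file.ext", "root/file.ext"))  # True
print(pf_match("pf:root/file.ext", "root/file.txt"))  # False
```

Under that reading, the sentence in the docs would be expected to name root/file.ext on both sides.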
Would be cool if there was some kind of pattern tester.
@cloudrac3r good catch!
this is the place to fix it - do you want to make a pull request (against 1.1-maint and master)?
https://github.com/borgbackup/borg/blob/1.1.10/src/borg/archiver.py#L2231
(the docs, man pages, etc. will be automatically generated from that, no need to change them manually)
@achikin that is what borg create --list --dry-run ... is intended for.
@ThomasWaldmann:
that is what borg create --list --dry-run ... is intended for.
This info would be a nice addition to https://borgbackup.readthedocs.io/en/stable/usage/help.html.
@miguno Good idea - do you want to make a pull request against master and 1.1-maint?
PR up at https://github.com/borgbackup/borg/pull/4882
An additional idea that would be useful: have the include/exclude rules be relative to the last root specified in the patterns file
Nobody explicitly stated whether the order of roots in a patterns file matters. The answer appears to be "no", however, based on this explanation from @ThomasWaldmann:
You can give multiple roots in a patterns file and it means the same as giving multiple roots on the cli
It is a bit misleading to me that I can put roots between patterns, so that structurally it looks as if patterns are associated with the root above them. Does this mean there's no way to specify patterns specific to only certain roots? For example, I may want to ignore *.txt files in /my/path/one but not /my/path/two.
IIRC, roots and patterns have no relationship (in particular, the patterns are not relative to the root).
As patterns use full paths you can still do anything you like.
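A minimal Python sketch of that idea, using the /my/path/one and /my/path/two example from above. It is a simplified model under stated assumptions: `fnmatchcase` from the Python standard library stands in for borg's pattern styles (whose exact rules may differ), the file list is invented, and `kept_under` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Roots and exclude patterns are kept as two independent lists, mirroring the
# statement that they have no relationship to each other.
roots = ["/my/path/one", "/my/path/two"]
excludes = ["/my/path/one/*.txt"]  # full path, so it can only hit files under "one"

# Invented file list standing in for what a recursion over the roots would find.
found = [
    "/my/path/one/notes.txt",
    "/my/path/one/photo.jpg",
    "/my/path/two/notes.txt",
]


def is_excluded(path: str) -> bool:
    """Return True if any exclude pattern matches the full path."""
    return any(fnmatchcase(path, pattern) for pattern in excludes)


def kept_under(root: str) -> list:
    """Return the found files below `root` that no exclude pattern matches."""
    return [p for p in found if p.startswith(root + "/") and not is_excluded(p)]


for root in roots:
    print(root, kept_under(root))
```

In this model the *.txt file under /my/path/one is dropped while the one under /my/path/two is kept, even though the pattern is not attached to either root; the restriction comes purely from the path prefix written into the pattern.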