Multiple relative paths for `borg create`
Docs at https://borgbackup.readthedocs.io/en/stable/usage/create.html say:
Paths are added to the archive as they are given, that means if relative paths are desired, the command has to be run from the correct directory.
And further down, there's an example:
# Backing up relative paths by moving into the correct directory first
$ cd /home/user/Documents
# The root directory of the archive will be "projectA"
$ borg create /path/to/repo::daily-projectA-{now:%Y-%m-%d} projectA
Now, if there is a second folder `projectB` in the same directory, that wouldn't be a problem. But is it possible to add two relative paths `projectA` and `projectB` if they're in different directories like `/home/userA/projectA` and `/home/userB/projectB`?
- I can't be in both `/home/userA` and `/home/userB` at the same time, so I can't just add both `projectA` and `projectB`.
- Also, I don't think it's a possibility to first cd to `projectA`, run `borg create`, cd to `projectB` and `borg create` there into the same archive again, right?
- I might be able to use symlinks? I will have to look further into that.
No, that is not possible (well, maybe you could use bind-mounts or something to artificially create the structure you want). Symlinks won't help, as they are archived as symlinks (not followed).
But this might not be desirable anyway, as it is confusing.
What you could do is go up the fs tree structure until you reach a common parent dir, or just go directly to `/` and work relative to there (and add excludes for what you do not want).
But this might not be desirable anyway, as it is confusing.
Why? :thinking: Am I overlooking something? Why would it be necessary for the backup to reflect the original file system directory structure if it's only empty parent folders leading up to `projectB`?
maybe you could use bind-mounts
That worked perfectly. Thanks! :+1:
Well, it is helpful if the backup archive has helpful paths. If you'd just pick a lot of stuff from misc locations and put it into an archive (without some leading path), you might get into trouble at restore time, trying to remember what was where.
I think it would help in order to make long-term backup archives more system-independent and more robust to changes.
With TAR archives, we have the -C option to change directories right before backing up a folder, so you can do:
tar czvf /tmp/test.tgz -C /home/userA projectA -C /home/userB projectB
And then restore using:
tar xzvf /tmp/test.tgz -C /home/userA projectA -C /home/userB projectB
If I later change the locations of these projects, I can simply update my backup/restore script and still restore old archives, because they didn't include any leading paths to begin with. Path structure is usually extremely important for projects, but not the leading path where that project was located at the time.
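To illustrate the `-C` idea outside of the tar CLI, here is a minimal sketch using Python's standard `tarfile` module. It is not how Borg works; the helper names and the temporary directory layout (standing in for `/home/userA` and `/home/userB`) are made up for this example.

```python
import os
import tarfile
import tempfile


def archive_without_leading_paths(tar_path, roots):
    """Add each directory under its base name only, similar to `tar -C parent name`."""
    with tarfile.open(tar_path, "w:gz") as tar:
        for root in roots:
            # arcname replaces the on-disk path with the name stored in the archive
            tar.add(root, arcname=os.path.basename(os.path.normpath(root)))


def member_names(tar_path):
    """Return the sorted member names stored in the archive."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(tar.getnames())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        roots = []
        # hypothetical layout: <tmp>/userA/projectA and <tmp>/userB/projectB
        for user, project in (("userA", "projectA"), ("userB", "projectB")):
            root = os.path.join(tmp, user, project)
            os.makedirs(root)
            with open(os.path.join(root, "notes.txt"), "w") as f:
                f.write(project)
            roots.append(root)
        tar_path = os.path.join(tmp, "test.tgz")
        archive_without_leading_paths(tar_path, roots)
        return member_names(tar_path)
```

The archive then only contains `projectA/...` and `projectB/...`, with no trace of where the projects lived at backup time.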
Other workarounds might include:
- Using TAR without compression for backups and piping it into Borg, probably making restores more cumbersome (bad idea?)
- Using multiple repositories (one for each project)
- Using a single repository but backing up into separate archives using a prefix (also interesting for pruning using --prefix)
- Using a single repository and single archive, but include a locations.txt file in the backup which contains the entry-points of backed up project folders and then strip-out the leading paths at restore time
I would love an option comparable to TAR but, of course, the workarounds do work. I think Borg is the holy grail of Linux backups and nearly as perfect as it can be. Thanks for giving us such a wonderful tool!
That sounds like a reasonable suggestion, so I am reopening this so we can check whether this can be implemented.
It looks like we can't implement tar syntax and behaviour here (like `-C /home/userA projectA -C /home/userB projectB`).
Reasons:
- blocker: argparse `parse_args` does not support intermixed options and positional args. There is `parse_intermixed_args`, but it does not support other argparse stuff we use.
- because of this, it could only be like `borg create --cwd /home/userA --cwd /home/userB repo::archive projectA projectB`, which is rather ugly and limited.
- trivial: we already use `-C` for compression (that would not hold us back from just using `--cwd` or so, though).
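A rough sketch of what the "options first" variant would look like with plain argparse. This is not Borg's actual parser: `--cwd` is only the hypothetical option name from the list above, and pairing the n-th `--cwd` with the n-th path is an assumption made for this example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="create-sketch")
    # hypothetical option; may be given several times, values are collected in order
    parser.add_argument("--cwd", action="append", default=[])
    parser.add_argument("archive")
    parser.add_argument("paths", nargs="+")
    return parser


def pair_cwds_with_paths(argv):
    """Pair the n-th --cwd with the n-th positional path (assumed semantics)."""
    args = build_parser().parse_args(argv)
    if len(args.cwd) != len(args.paths):
        raise ValueError("need exactly one --cwd per path")
    return list(zip(args.cwd, args.paths))
```

Since the pairing is purely positional, the link between a directory and its path is easy to get wrong on the command line, which shows why this form is limited.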
Of course one could do misc. special hacks, like first using `parse_args` with `argparse.REMAINDER` to catch `-C /home/userA projectA -C /home/userB projectB` and then using another `parse_intermixed_args` within `do_create`, but that would collide with our nicely automated help, manpage and docs generation.
A hack to implement the same idea could be to use some special separator within the PATHs, like:
borg create REPO::ARCHIVE /home/userA::projectA /home/userB::projectB
Presence of `::` inside a path would split it into CWD and PATH and then act accordingly.
Problematic if you have an actual recursion root path that contains `::`.
Also, users might get confused due to the different meanings of this separator when used for REPO/ARCHIVE and for CWD/PATH.
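The splitting itself would be simple. A minimal sketch, assuming the first occurrence of the separator wins; the function name and the `sep` parameter are made up for illustration and are not part of Borg:

```python
def split_cwd_and_path(arg, sep="::"):
    """Split 'CWD<sep>PATH' into (cwd, path); return (None, arg) if sep is absent."""
    cwd, found, path = arg.partition(sep)
    if not found:
        # no separator: keep the current behaviour, path is used as given
        return None, arg
    return cwd, path
```

The ambiguity mentioned above shows up immediately: a real directory such as `/data/odd::name/file` would be split into `/data/odd` and `name/file`, even if the user meant the literal path.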
Yeah, I've always wondered why some tools allow appending options despite the syntax saying options before arguments.
I find the idea of a separator intriguing. Maybe let the user specify the separator using a special option?
borg create --path-prefix-separator % REPO::ARCHIVE /home/userA%projectA /home/userB%projectB
So you can shift responsibility for edge cases to the user, similar to HEREDOC. And only users who want to use the feature need to be aware of the implications.
In that case, it may be worthwhile to ditch the CD aspect. Tar makes it easy to understand because it's the same as manually typing CD between commands. But that also allows you to do weird relative/contextual stuff like -C /home/userA projectA -C ../userB projectB. (confusing, breaks when reordering operations).
Keeping a fixed current directory and then stripping path components dynamically (using string operations) would probably solve it just as well and also make it more akin to (but more flexible than) borg extract --strip-components NUM. And the help wouldn't need to cover both the separator and CDing between paths (two concepts vs. the one in Tar).
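As a sketch of that "strip by string operations" idea (not an existing Borg feature; the function name is hypothetical), the leading part could be removed component-wise with `pathlib`:

```python
from pathlib import PurePosixPath


def strip_leading(path, prefix):
    """Return path relative to prefix; leave it unchanged if prefix does not match."""
    p = PurePosixPath(path)
    try:
        # relative_to compares whole path components, not raw characters
        return str(p.relative_to(prefix))
    except ValueError:
        return str(p)
```

Because the comparison works on whole components, `/home/userAB/x` is not affected by the prefix `/home/userA`, and no directory change is involved at any point.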
I would also appreciate this for "prettier" archive structures. Here's another possible implementation, taken from nothing less than `man rsync`:
It is also possible to limit the amount of path information that is sent as implied directories for each path you specify. With a modern rsync on the sending side (beginning with 2.6.7), you can insert a dot and a slash into the source path, like this:
rsync -avR /foo/./bar/baz.c remote:/tmp/
That would create /tmp/bar/baz.c on the remote machine. (Note that the dot must be followed by a slash, so "/foo/." would not be abbreviated.)
That "modern" rsync 2.6.7 was released in 2006 by the way, here's the original commit: https://github.com/WayneD/rsync/commit/d2ea5980ba7986ddd583b4f55737eb56a0ed66a6
@rovo89 Oh, that's an interesting, slightly dirty hack. :-)
It's somehow similar to my idea back then: https://github.com/borgbackup/borg/issues/4685#issuecomment-1007716496
But, as it uses the unusual, but "NOP" `./` as the separator, it usually does not have the problem that this accidentally appears in a path (although there could be weird circumstances where it does, e.g. because the user gave it that way without wanting to trigger this functionality / without realising that it triggers this functionality).
@ThomasWaldmann Oh wow, I didn't realize that you already implemented this. Thanks! 🙂 I added a comment (https://github.com/ThomasWaldmann/borg/commit/5b96d5acc30fec766a076ec367a154497b5d52e4#r138269214) regarding the edge case with multiple `/./` occurrences.
Also, are there any plans to port this to 2.0 as well? Sorry if it's a dumb question, I briefly looked for a statement about intended feature parity but couldn't find any.
@rovo89 yes, i tagged #8060 with port/master. just want to first collect all feedback, see also the 1.4 thread on github discussions.
Cool, thanks! I didn't notice either, so thanks for the pointers.
OK, I'll summarise the ideas here (I always used `/./` as separator here):
- SP "strip prefix": `/strip/prefix/./keep/postfix` (implemented by #8060)
- SM "strip in the middle": `/keep/prefix/./strip/this/./keep/postfix`
- RP "replace prefix": `/find/this/prefix/./replace/that/prefix/./keep/postfix`
SP
The nice thing here is that the given path is precisely the source path we want to back up, except that it has that NOP `/./` inside (which just vanishes after the normalisation we do anyway).
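A minimal sketch of SP, not the code from #8060. It assumes the first `/./` is the one that counts and that a path without the marker is archived as given; the function name is made up.

```python
import posixpath

MARKER = "/./"


def strip_prefix(given):
    """Return (source_path, archived_path) for a path that may contain '/./'."""
    # normalisation makes the NOP '/./' vanish, giving the real source path
    source = posixpath.normpath(given)
    _prefix, found, rest = given.partition(MARKER)
    if not found:
        return source, source
    # everything after the first marker is what ends up in the archive
    return source, posixpath.normpath(rest)
```

For `/strip/prefix/./keep/postfix` this yields the source path `/strip/prefix/keep/postfix` and the archived path `keep/postfix`.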
SM
This is a bit more powerful than SP, because one may keep a leading part of the path.
The given path is precisely the source path, except that it has 2 NOPs inside, which vanish after normalisation.
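SM could be sketched like this, purely as an illustration of the idea (nothing like it is confirmed to exist in Borg). It assumes exactly two markers are required and that any other input is left untouched; the function name is hypothetical.

```python
import posixpath


def strip_middle(given, marker="/./"):
    """Return (source_path, archived_path), dropping the part between two markers."""
    source = posixpath.normpath(given)
    parts = given.split(marker)
    if len(parts) != 3:
        # not exactly two markers: treat as a plain path in this sketch
        return source, source
    keep_prefix, _stripped, keep_postfix = parts
    archived = posixpath.normpath(posixpath.join(keep_prefix, keep_postfix))
    return source, archived
```

For `/keep/prefix/./strip/this/./keep/postfix` the source path is `/keep/prefix/strip/this/keep/postfix` and the archived path is `/keep/prefix/keep/postfix`.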
RP
This is the most powerful idea, because it can translate the fs path prefix to a different archive path prefix. But we need to do more processing there, not just normalisation.
Instead of the syntax given above we could also use the more friendly, but also more risky syntax `/replace/that/prefix/:/find/this/prefix/./keep/postfix`. When splitting at the `:` separator, the rhs would be precisely a source path (including the to be found prefix), the lhs would be the replacement prefix.
It's more risky because `:` is a valid character in UNIX filenames. We could also use `::` to reduce bad detection of it.
RP is a superset of SP if we consider the replacement part (and separator) optional, defaulting to the empty string as replacement.
RP is also a superset of SM, because we can find a longer path prefix and replace it with a reduced version of it.
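The superset claims can be checked with a small sketch of RP using the `/./` syntax from the summary. This is an illustration only: the function name is made up, and treating a single marker as "empty replacement" is the assumption that makes SP fall out as a special case.

```python
import posixpath


def replace_prefix(given, marker="/./"):
    """Return (source_path, archived_path) for 'FIND/./REPLACE/./KEEP'."""
    parts = given.split(marker)
    if len(parts) == 3:
        find, replace, keep = parts
    elif len(parts) == 2:
        # replacement omitted: defaults to the empty string, which is SP
        find, keep = parts
        replace = ""
    else:
        plain = posixpath.normpath(given)
        return plain, plain
    source = posixpath.normpath(posixpath.join(find, keep))
    archived = posixpath.normpath(posixpath.join(replace, keep))
    return source, archived
```

SM can then be written as `/keep/prefix/strip/this/./keep/prefix/./keep/postfix`: the longer prefix is found and replaced with a reduced version of it. Note that in this sketch the replacement part is relative, because the marker consumes the slash in front of it.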