Option to follow symlinks _in path arguments_
Output of restic version
restic 0.9.6 compiled with go1.13.4 on linux/amd64
What should restic do differently? Which functionality do you think we should add?
I would very much like for restic to have an option to follow symlinks solely within the paths passed to restic.
(This resembles https://github.com/restic/restic/issues/2211, but it is nearly unrelated. It also overlaps with https://github.com/restic/restic/issues/542, but this request is focused in a way that issue is not; if I should have posted this there, please let me know.)
Right now, if I do...
$ restic backup /path/to/a/symlink_to_a_dir
or
$ restic backup /path/to/a/symlink_to_a_file
...I get a snapshot containing a single symlink, which makes sense.
I would greatly value being able to do something along these lines...
$ restic --follow-arg-symlinks backup /path/to/a/symlink_to_a_dir
...and get a snapshot named with the path "/path/to/a/symlink_to_a_dir" but with the contents of /path/to/the/real_symlinked_dir
The concept is very much like this find option:
-H Do not follow symbolic links, except while processing the command line arguments.
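For anyone unfamiliar with that find behaviour, here is a minimal sketch of it. The sandbox directory and the names real_dir and file1 are made up for illustration; only the -H option itself comes from find. The point is that with -H the symlink argument is followed, but the reported paths still begin with the symlink's name, which is the naming I'm hoping restic could offer.

```shell
#!/bin/sh
# Throwaway sandbox: a real directory with one file, plus a symlink to it.
# All names here are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/real_dir"
touch "$demo/real_dir/file1"
ln -s real_dir "$demo/symlink_to_a_dir"

# Default behaviour: the argument is treated as a symlink and not descended into.
default_out=$(cd "$demo" && find symlink_to_a_dir | sort)

# -H: the symlink given as an argument is followed, but the paths that are
# reported still start with the symlink's name, not the target's.
followed_out=$(cd "$demo" && find -H symlink_to_a_dir | sort)

printf 'default:\n%s\n' "$default_out"
printf 'with -H:\n%s\n' "$followed_out"

rm -rf "$demo"
```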
What are you trying to do?
To be clear, this isn't at all about following symlinks within a backup fileset (I grasp why that's a big can of worms); this is solely about optionally following symlinks to find the root of the backup fileset.
One of my use cases is that I use urbackup to archive some local clients to the server I'm backing up with restic, and it manages its backup "snapshots" over time as new folders containing both changed files and hardlinks to unchanged files, and always keeps a current symlink pointing to the latest backup.
Since I'm only interested in backing up that current backup, it would be valuable to be able to simply point restic to it with a --follow-arg-symlinks (or similar) option.
Right now, I'm scripting my way around that, but the downside is that each new restic backup is a snapshot named with a different path; what I want is simply a series of snapshots named with the /path/to/urbackup/client/symlink_to_current path, but with the contents of the directory referenced by the symlink.
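A rough sketch of that kind of workaround, to show where the changing path comes from. This is not my actual script: the sandbox layout and directory name are stand-ins, and the restic command is only printed rather than run, so the sketch works without a repository.

```shell
#!/bin/sh
# Sandbox standing in for the urbackup layout: one backup directory and a
# "current" symlink pointing at it (all names are hypothetical).
client=$(mktemp -d)
mkdir "$client/0129e9ff-0830-495f"
ln -s 0129e9ff-0830-495f "$client/symlink_to_current"

# Resolve the symlink to the real directory it points at.
target=$(cd -P "$client/symlink_to_current" && pwd -P)

# Print the command instead of running it. The path handed to restic is the
# resolved one, so it changes whenever urbackup repoints the symlink.
echo "restic backup $target"
backup_name=$(basename "$target")

rm -rf "$client"
```

Because the resolved directory name is what restic sees, every run records a different path in the snapshot list.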
Did restic help you today? Did it make you happy in any way?
restic helps me nearly every day. There are a few rough edges yet to file down and things I pray it will someday be able to do, but I depend on it regularly. I'm pleased and very thankful.
Does this do what you're after?
find /path/to/urbackup/client/symlink_to_current/ -maxdepth 1 -print0 | xargs -0 restic backup
Thanks for that, but I'm currently doing the equivalent in my automation script.
What I don't like is that it dereferences/follows the symlink, and every time I run a backup, I end up with another snapshot of a new and forever-changing path.
That is, I end up with this:
# restic snapshots
ID Time Host Paths
52914656 2020-01-26 09:37:37 Server /path/to/urbackup/client/0129e9ff-0830-495f
4a0b9a9d 2020-01-27 08:48:01 Server /path/to/urbackup/client/42f49bbc-9f9d-4383
a8d8ce4a 2020-01-28 09:32:13 Server /path/to/urbackup/client/303af54c-f58d-4436
...but I'd like this:
# restic snapshots
ID Time Host Paths
52914656 2020-01-26 09:37:37 Server /path/to/urbackup/client/symlink_to_current
4a0b9a9d 2020-01-27 08:48:01 Server /path/to/urbackup/client/symlink_to_current
a8d8ce4a 2020-01-28 09:32:13 Server /path/to/urbackup/client/symlink_to_current
Of course, there's always a chance I'm over-valuing that consistency in some way I don't yet grasp, but for the purposes of scripting restores, it seems valuable to me at this point that the script readily knows which snapshots are (conceptually) from the same source.
Put another way, urbackup's intent in making that symlink is to provide an unchanging path as an abstraction, and I'd like to hold on to that.
"it dereferences/follows the symlink"
find shouldn't do that with the trailing /. Could you try the command?
Sorry...I think I missed the gist of your original suggestion.
Since that results in find feeding a series of args, I get something to this effect:
ID Time Host Paths
a8d8ce4a 2020-01-28 09:32:13 Server /path/to/urbackup/client/symlink_to_current
/path/to/urbackup/client/symlink_to_current/file1
/path/to/urbackup/client/symlink_to_current/file2
/path/to/urbackup/client/symlink_to_current/file3
/path/to/urbackup/client/symlink_to_current/file4
/path/to/urbackup/client/symlink_to_current/file5
...
And so on. Which is interesting, so thanks for your crafty suggestion!
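For reference, a small sandbox sketch of what find hands to xargs in that command. The directory and file names are hypothetical. The trailing slash lets find look inside the symlinked directory, yet every entry is printed under the symlink's name, which is why the snapshot lists the stable path plus one path per top-level entry.

```shell
#!/bin/sh
# Sandbox: a backup directory with two files and a symlink to it
# (hypothetical names).
client=$(mktemp -d)
mkdir "$client/0129e9ff-0830-495f"
touch "$client/0129e9ff-0830-495f/file1" "$client/0129e9ff-0830-495f/file2"
ln -s 0129e9ff-0830-495f "$client/symlink_to_current"

# The trailing slash makes path resolution go through the symlink, while
# find keeps reporting entries under the symlink's name.
listing=$(cd "$client" && find symlink_to_current/ -maxdepth 1)
printf '%s\n' "$listing"

# One line for the directory itself plus one per top-level entry.
entries=$(printf '%s\n' "$listing" | wc -l | tr -d ' ')

rm -rf "$client"
```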
And as for the desirability/acceptability of that result...hmm...I think I can't quite answer that just yet because I can't quite wrap my brain around what it means, in the end.
And by that I mean that I don't quite grasp how having a single snapshot of /path/to/a/dir differs (or doesn't) from having a single snapshot of /path/to/a/dir, /path/to/a/dir/file1, /path/to/a/dir/file2, /path/to/a/dir/file3, etc.
I suppose, at the very least, I'm not too excited about a quarter-million lines of output from restic snapshots. Or might that not be the result?
You can use restic snapshots -c to get a single line per snapshot.
I really appreciate the assistance! Thanks very much.
However, I freely admit at this point that I'm unclear whether this approach has no downside, or has a major gotcha waiting for me down the road.
It feels like feeding restic a quarter-million arguments instead of one could be a negative in some way, but I also accept that find might simply be doing a bit of restic's work for it, and this may therefore simply be a slick solution.
Time will tell :)
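One possible gotcha worth checking, sketched under assumptions: xargs runs the command more than once when the argument list is too long for a single command line, and each restic backup run would produce its own snapshot. The sketch below forces small batches with -n 2 purely to make the splitting visible, uses five made-up file names, and prints the commands instead of running restic.

```shell
#!/bin/sh
# Stand-in for a long path list: five hypothetical entries.
# -n 2 forces small batches here; with real data the same splitting happens
# on its own once the argument list exceeds the system limit.
invocations=$(printf '%s\n' file1 file2 file3 file4 file5 \
  | xargs -n 2 echo restic backup)

printf '%s\n' "$invocations"

# Number of separate commands xargs would have run.
count=$(printf '%s\n' "$invocations" | wc -l | tr -d ' ')
```

Whether the real argument list is long enough to trigger this depends on how many top-level entries the directory has and on the system's limits.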
Wouldn't #2092 be more useful for this?
To follow symlinks, we'd have two options. Either we mount the symlink targets in the snapshot at the location of the symlink, which can probably lead to a lot of duplicated file scanning and reading. Or we include the files at their original location, but then restoring is complicated because the data in the snapshot is not available at the path one might expect.