Add UNIX domain socket transport
/kind enhancement
When creating full system backups one must usually run Borg as `root` to avoid permission issues. However, it might be undesirable to store these backups as root, as people often prefer storing backups in known places (like `/var/backups`) owned by dedicated users (like the default `backup` user in Debian-based distributions).
Right now there are only two possible solutions to accomplish this:
- One might `chown` the repo after `borg create` et al. exits
- One might abuse Borg's remote repository feature to connect to `localhost` using SSH as a different user (as suggested by @ThomasWaldmann, see references below)
IMO this is a reasonable use case and Borg should support it without resorting to such rather desperate workarounds.
Thus I'd like to suggest adding a `socket://` transport (like `socket:///run/borg/borg.sock`) to Borg. The socket should be created by `borg serve`, which would accept an additional `--socket` option with a path (e.g. `borg serve --socket /run/user/1000/borg/borg.sock`, umask 0117). How `borg serve` is being invoked is up to the user. `borg create` et al. can now connect to this socket using the `socket://` transport (e.g. `borg create socket:///run/borg/borg.sock /path/to/backup`).
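To make the umask part of the proposal concrete, here is a minimal sketch (plain Python, not Borg's code) of creating a listening UNIX domain socket under umask 0117. The helper name `bind_unix_socket` and the temporary directory are assumptions for illustration only; the resulting 0660 mode relies on Linux honoring the umask when the socket file is created.

```python
import os
import socket
import stat
import tempfile


def bind_unix_socket(path, umask=0o117):
    """Bind a listening UNIX domain socket; file mode becomes 0o777 & ~umask."""
    old_umask = os.umask(umask)  # temporarily tighten the umask for bind()
    try:
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)  # this call creates the socket file
    finally:
        os.umask(old_umask)  # always restore the previous umask
    server.listen(1)  # one client at a time, as discussed in this issue
    return server


# Hypothetical location; a real setup would use something like /run/borg.
runtime_dir = tempfile.mkdtemp()
sock_path = os.path.join(runtime_dir, "borg.sock")
server = bind_unix_socket(sock_path)
mode = stat.S_IMODE(os.stat(sock_path).st_mode)
```

With umask 0117 the owner and group get read/write access and everyone else gets nothing, which is what lets a different user in the same group connect.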
Even though this is not the goal, it also makes https://github.com/borgbackup/borg/blob/master/docs/deployment/pull-backup.rst#socat easier by eliminating `socat`. It might also enable some more use cases I just can't think of right now.
As I don't know Borg's sources I can't really judge the complexity, but since Borg already supports remote transports with `borg serve`, this could be as easy as replacing `sys.stdin`/`sys.stdout` of `borg serve` with a UNIX domain socket and letting `borg create` et al. communicate with this socket instead of the `ssh` subprocess.
References
For non-ssh repos and running borg as root, but not having the repo owned by root, you can use the ssh://user@localhost/myrepo trick.
- https://github.com/borgbackup/borg/issues/4082#issuecomment-423679676
You can work around that problem by using borg with `user@localhost:/path/to/repo` as repo (in that scenario, borg client can run as root and borg serve would run as user).
- https://github.com/borgbackup/borg/issues/3587#issuecomment-443694439
Sounds good, thanks for the detailed suggestion!
https://docs.python.org/3/library/socket.html
The code in `borg.remote` uses FDs (of stdin, stdout, stderr) and `os.select/read/write/set_blocking`.
It seems like `select()` and `setblocking()` can also be used with sockets, so would it be just `send/recv` instead of `write/read`? How about error / exception handling?
The remote code is rather complex and not very nice to work with, I'd like to avoid additional complexity or risks there.
Sockets can be turned into file objects for feature parity using the makefile api.
https://docs.python.org/3/library/socket.html#socket.socket.makefile
If those readable / writable objects can be used, it may be possible without pain.
The only caveat would be that the socket server would have a one client at a time limit for lock parity.
But `socket.makefile` returns a python `file` object, not an OS-level FD (`int`).
in the case of socket fd integers - on unix read/write are equivalent to send/recv without flags
@ThomasWaldmann You can call `fileno()` on the `file` object to get the fd from a `socket.makefile()` (`Popen.stdin` and `Popen.stdout` are `file` objects, too). As pointed out by @RonnyPfannschmidt, there should be no differences on Unix-like systems. Windows is a whole different story though, but since Windows isn't supported anyway, this should be no major issue. However, I must admit that I personally never used `socket.makefile()`, but only the high-level socket methods...
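A quick way to check the fd claims above on a Unix-like system is a small standalone sketch (not Borg's remote code). It uses `socket.socketpair()` as a stand-in for a connected client/server pair and then applies the same fd-level calls mentioned in this thread.

```python
import os
import select
import socket

# A connected pair of UNIX domain sockets stands in for client and server.
client_sock, server_sock = socket.socketpair()
client_fd = client_sock.fileno()
server_fd = server_sock.fileno()

# Plain os.write()/os.read() work on socket fds, just like on pipe fds.
os.write(client_fd, b"ping")
readable, _, _ = select.select([server_fd], [], [], 1.0)
request = os.read(server_fd, 4096) if server_fd in readable else b""

# makefile() wraps the same OS-level fd; fileno() hands it back.
reader = server_sock.makefile("rb")
same_fd = reader.fileno() == server_fd

# Blocking mode can be toggled on the raw fd as well.
os.set_blocking(server_fd, False)
nonblocking = not os.get_blocking(server_fd)
```

This only shows that the primitives behave alike; error and exception handling for a dropped connection would still need separate attention.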
Thanks to #6270 giving me a nice starting point I just started looking into Borg's remote code and I agree that the current implementation is indeed not so easy to work with. The code very much confused me, especially whether `borg serve` relies on stderr. If `borg serve` relies on stderr, this is going to be a big issue - because with a Unix socket, you just don't have the separation of stdout and stderr, but only a single bidirectional channel. There are solutions, but it isn't as easy as initially thought then...
However, looking at the code, `borg serve` doesn't seem to use stderr. Even though it defines a `stderr_fd`, it only uses it for a single error, and even then it is set to stdout rather than stderr and won't be sent over the wire anyway. So, unless I'm missing something, `borg serve` doesn't use stderr.
https://github.com/borgbackup/borg/blob/1e1c92229807f7522a2f7a6aa4bba3bfe46ddbaf/src/borg/remote.py#L198-L201
What confuses me about this is that the "client" part, i.e. the `Popen` call, is differentiating between stdout and stderr:
https://github.com/borgbackup/borg/blob/1e1c92229807f7522a2f7a6aa4bba3bfe46ddbaf/src/borg/remote.py#L589-L592
Popen's stderr is then sent (here) to `handle_remote_line()`, which is doing quite sophisticated things to send this stuff to logging (and logging only, which is good). So, looking at this (and not understanding much more about Borg's code :exclamation:), I presume that some older versions of `borg serve` did in fact use stderr, but no longer. If this is true, `handle_remote_line()` only exists for backwards compatibility reasons and can be removed in Borg 2.0, because Borg 2.0 isn't supporting old clients anyway. All server errors are sent via stdout and the only thing for which stderr is still needed, is to catch error output of `ssh`, e.g. reporting about failed auth.
If all of this is true, it's basically really just replacing the fds on both ends by the fds of a `socket`. The fds behave the same. Also see #6270 and my comment there.
The only caveat would be that the socket server would have a one client at a time limit for lock parity.
On an important side note: My intention with this suggestion is not making `borg serve` a "real" server accepting multiple clients simultaneously. For now it's totally fine to still require one `borg serve` process (and unix socket) per client, just as before, and therefore only accept a single client at a time. However, just to make this clear, this can be expanded to support multiple clients simultaneously at some later point. It naturally requires threading then, which AFAIK is a bigger topic for Borg, therefore it's intentionally out of the scope of this suggestion. But it's laying some ground work...
However, out of curiosity @RonnyPfannschmidt, why is `socket.makefile()` limiting it to a single client? As I said earlier, I never used `socket.makefile()`, but unless I understand the API docs wrong, we can (and actually must) call this per client and therefore get separate fds per client?
Makefile isn't; the need for locking imposes the limit, depending on usage patterns
@PhrozenByte btw, windows support is coming, @RayyanAnsari works on it (no remote repos yet though).
About `Location` specifier for `--repo`:
socket:///abs/path/to/borg.sock::/abs/path/to/borg.repo
Is that a good, usual syntax or are there better ideas?
We need to transport socket path and repo path via that URL.
Update:
hmm, can we always assume that both must be local paths? So we could create the socket file inside the repo path? If so, we could use:
socket:///abs/path/to/borg.repo
and just create the socket inside the repo dir as "socket" or ".socket".
:heart: for #7519 :blush:
About `Location` specifier for `--repo`:
socket:///abs/path/to/borg.sock::/abs/path/to/borg.repo
Is that a good, usual syntax or are there better ideas?
It mimics Borg's `ssh://` transport and follows URL-style syntax, so it is common for sure. The number of slashes might be a bit confusing. Strictly speaking, URLs only require `<scheme>:<address>`, therefore `socket:<absolute path>` (e.g. `socket:/abs/path/to/repo`) would be fine, too. Other standards commonly use the double slashes to indicate a server address (e.g. `//borgbackup.org` in `https://borgbackup.org`), but this rather is a later interpretation than actual standard. The internet is an inconsistent mess: Think of the `file:` scheme, it was meant to address local files only, but for some reason one can specify a hostname nobody ever needs, resulting in `file:///abs/path/to/…`. The double slashes have a big advantage though: A user unaware of `socket:` might actually mean `./socket:/some/path` with `borg create socket:/some/path` (`"socket:"` is a totally valid directory name...). This is less likely with `borg create socket:///some/path`...
So, in the end, we can choose what inconsistency we want: We can use `socket://<absolute path>` and confuse people with the three slashes, or use `socket:<absolute path>` and confuse people why `ssh` is requiring the two slashes but `socket` isn't. To me that's a matter of personal preference. I wouldn't introduce yet another syntax though.
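For reference, this is how Python's generic URL splitter treats the variants discussed here. It's only a sketch using `urllib.parse.urlsplit`, not Borg's own `Location` parser; the helper `describe` and the port `1234` are made up for illustration.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return (scheme, netloc, path) as seen by the generic URL splitter."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# Three slashes: empty host part, absolute path.
three_slashes = describe("socket:///run/borg/borg.sock")

# No double slash: still a valid URL with the same scheme and path.
no_slashes = describe("socket:/run/borg/borg.sock")

# Host and port after the double slash, like a possible future TCP variant.
tcp = describe("socket://server:1234/path/to/borg.repo")
```

Both UNIX socket spellings split into the same scheme and path, so the choice between them really is about what is less confusing for users, not about what can be parsed.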
hmm, can we always assume that both must be local paths?
No.
- Even if we assume that `borg create` and `borg serve` run on the same machine (i.e. both paths are indeed local), we can't assume that the user running `borg create` has access to the repo. Just think of my original use case: My repo is stored in `/var/backups/borg` and `/var/backups` is owned by `backup` with 700 perms. Sure, my full system backup runs with `root` and therefore has access to the repo, but I also want to run `borg create` with the unprivileged `daniel` user. `daniel` can't even see `/var/backups/borg`. But depending on the socket's perms he can use the socket.
- Another use case for the socket transport is implementing an improved pull mechanism. See https://github.com/borgbackup/borg/pull/6270#issuecomment-1426341921
- Yet another use case for the socket transport is proxying. Think of a storage server to which the client has no direct network access, but needs a proxy. The new socket transport would allow one to use a rendezvous server to forward the connection.
So we could create the socket file inside the repo path? If so, we could use:
socket:///abs/path/to/borg.repo
and just create the socket inside the repo dir as "socket" or ".socket".
Am I right that your thinking about requiring local paths originates in the impracticability of `socket:///abs/path/to/borg.sock::/abs/path/to/borg.repo` as repo path? First, you're right, that's kinda weird. But I don't like the solution of requiring the socket to be in the repo path for multiple reasons (in no particular order):
- See above, requiring this prevents the use cases mentioned before.
- People might confuse `borg serve --socket` with access restriction. We have `borg serve --restrict-to-*` for that.
- Until now `borg serve` was independent of single repos, with this it gets dependent on a single repo.
- We limit possible future development of this feature. Right now we have a "single client, single repo" policy, but in the future we might want to extend this to a full "Borg server", allowing an arbitrary number of clients to access an arbitrary number of different repos. Not now for sure, but we should keep the possible future in mind.
- Sockets kinda belong to `/run`.
However, I still see the impracticability of `socket:///abs/path/to/borg.sock::/abs/path/to/borg.repo`. So I was thinking what might be a viable alternative and `--rsh` came to my mind. Why don't we simply assume the socket to be in the repo, but allow users to specify it otherwise using an option, e.g. `--socket`?
So, `borg create socket:///abs/path/to/repo` assumes the socket to be in `/abs/path/to/repo/socket`, but one can specify otherwise using `borg create --socket /run/my-borg-socket socket:///abs/path/to/repo`, just like `--rsh` allows one to override the `ssh` command used.
`borg serve` still receives an arbitrary path for `--socket`; it can be `/abs/path/to/repo/socket`, but can also be `/run/my-borg-socket`, or `/some/other/path`, because we might not even be on the same machine but rather let `ssh` forward the socket for us. Additionally checking whether the dirname of `borg serve --socket` is a Borg repo and implying `--restrict-to-repository` then is a good idea nevertheless.
Thanks for the feedback (I had similar ideas/concerns, but just wanted something simple to get going and then refine after feedback).
The reason why I chose double-slash + socket_abs_path is that this could grow support for TCP sockets (by using a server name or IP and port after the double-slash instead of the UDS path). The `::` separator is unusual for a URL though, but guess we can't use just a `/`.
Your idea with a separate --socket option would also work, but is a bit inconsistent compared to ssh://server:port/path borg URLs where the server and port are part of the repo url and not contained in a separate (e.g. --ssh) option.
So, how could we get a socket file path at place of SOCKET without parsing ambiguities?
borg create --repo=ssh://user@server:port/path/to/borg.repo # for consistency comparison
borg create --repo=socket://server:port/path/to/borg.repo # possible future with TCP socket
borg create --repo=socket://SOCKET/path/to/borg.repo # UNIX domain socket
Hmm, how about just mandating that SOCKET:
- must have `.sock` suffix
- must be an absolute path (start with a slash) ?
That would result in these usage patterns:
borg create --repo=socket:///path/to/borg.repo # SOCKET empty -> UDS in repo dir
borg create --repo=socket:///run/borg.sock/path/to/borg.repo # SOCKET w/ slash and .sock -> path, use that!
borg create --repo=socket://server:port/path/to/borg.repo # SOCKET w/o leading slash -> TCP
A more radical approach could be splitting that url into `--remote=SERVER_URL --repo=local_repo_path`.
Yeah, good point. I agree, adding a `--socket` option isn't perfect and adds some inconsistency, but if combined with your idea to specify a default location for the socket, a user will easily understand how it works.
Thinking about it, adding the socket's path to `--repo` just feels a bit weird to me: with `--repo` we're just asking for the repo's location. Locations that are not local paths, must match `<transport>://<host>/<path to repo>`, because the host is undoubtedly part of the location. But a socket's path is no host - it's another path. Adding a `--socket` option with a default value is very easy to explain, but "magic separators" (like `::`, or requiring an absolute path and a `.sock` suffix) are hard to understand. Wasn't this confusion the reason (among the regex getting unmanageable) why repo path and archive names were split up in Borg 2?
Adding a `--socket` option just feels like the best option, especially when combined with a default value. Deriving the default value from the repo's path sounds reasonable. So, we end up with this:
borg create --repo=ssh://user@server:port/path/to/borg.repo
borg create --rsh='ssh -i /path/to/id_rsa' --repo=ssh://user@server:port/path/to/borg.repo
borg create --repo=socket://server:port/path/to/borg.repo # possible future with TCP socket
borg create --repo=socket:///path/to/borg.repo # UNIX domain socket in default location
borg create --socket=/other/path/to/borg.sock --repo=socket:///path/to/borg.repo # UNIX domain socket in custom location
In the future we might even not just look in the repo's path, but in other default locations as well. I just had a whole bunch of ideas, e.g. `$XDG_RUNTIME_DIR/borg.sock` if we ever implement a "full Borg server". But that doesn't really matter right now, by keeping the syntax (`<transport>://<host>/<path to repo>`) simple and rather using a separate option, we stay flexible.
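The lookup rule suggested in this comment could be sketched roughly like this. It is an illustration of the proposal only, not Borg code: `resolve_socket_path` is a hypothetical helper, and the default file name `socket` inside the repo dir is taken from the earlier example in this thread.

```python
import os
from urllib.parse import urlsplit


def resolve_socket_path(repo_url, socket_option=None):
    """Pick the UNIX socket path for a socket:// repo URL (hypothetical helper)."""
    parts = urlsplit(repo_url)
    if parts.scheme != "socket":
        raise ValueError("not a socket:// location: %r" % repo_url)
    if parts.netloc:
        # socket://server:port/... is the "possible future" TCP variant.
        raise NotImplementedError("TCP sockets are out of scope here")
    if socket_option is not None:
        # An explicit --socket always wins, just like --rsh does for ssh.
        return socket_option
    # Default proposed in this comment: a file named "socket" in the repo dir.
    return os.path.join(parts.path, "socket")


default_path = resolve_socket_path("socket:///path/to/borg.repo")
custom_path = resolve_socket_path(
    "socket:///path/to/borg.repo", socket_option="/other/path/to/borg.sock"
)
```

The repo URL stays a plain `<transport>://<host>/<path to repo>` value either way; only the default would change if other default locations were added later.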
Hmm, how about just mandating that SOCKET:
- must have `.sock` suffix
- must be an absolute path (start with a slash) ?
In practice, this means that `.sock` becomes a "magic separator", just as `::` would be. IMHO we should avoid this. It's hard to explain and might conflict with otherwise valid paths.
A more radical approach could be splitting that url into `--remote=SERVER_URL --repo=local_repo_path`.
You mean like `--remote=ssh://server:port --repo=/path/to/borg.repo`? Yeah, pretty radical, but reasonable. `--remote` and `--repo` would then strongly depend on each other. If a user forgets adding `--remote` (e.g. in a wrapper script appending options), Borg might happily do things wrong without even noticing. Plus, we'd still need the `--rsh` option, right? IMO it feels like this would make things harder.
@PhrozenByte about stderr: yeah, the borg serve function does not really use it, but the borg repository code emits logging messages on stderr (which usually go either directly to e.g. the console (for a local repo) or get sent via the ssh connection to the clientside code in RemoteRepository).
So, guess we have that problem that there is no easy solution with just 1 bidi UDS.
`logging.handlers.SocketHandler` maybe? Hmm, maybe rather not - that would need the borg client being the "log server" side (maybe not a big issue for UDS, but for TCP socket).
So, guess all we have left is to multiplex stdout and stderr over one channel (and de-mux on the other side).
Why is there no fd012-mux-socket-demux-fd012 tool? :-)
That's very, very unfortunate :unamused:
Since Borg uses Python's built-in logging facility, I'd indeed recommend creating a `logging.handlers.RemoteHandler` that is responsible for sending not just log messages, but also exceptions and errors of `borg serve` (including Borg's repo code) to the client. This handler can and should be used for SSH-based connections, too. To do so we could hook into `setup_logging()` (replace `is_serve=False` (and `stream=None`) by `handler=StreamHandler`, so that we can pass `handler=RemoteHandler` for the socket transport) and more or less move the existing code for exceptions to this new `logging.handlers.RemoteHandler`. Additional work is required to handle the log messages on the client side, which definitely requires diving deep into the remote code... I can totally understand that this isn't desirable.
To avoid this we could multiplex stdout and stderr, by e.g. prefixing all messages with a channel indicator (a single byte would do, e.g. `1` for stdout and `2` for stderr) and more or less send the current logging messages unchanged (i.e. no message packing required; you can then keep the current exception handling unchanged). This won't save us much work on the server side (i.e. `borg serve`) because we must still hook into the logging facility (i.e. add a `logging.handlers.RemoteHandler`) to change where the log messages are sent to. But it saves a lot of work on the client side, because checking for the channel indicator can be done very early in processing (possibly right after `os.read()`). However, please note that this will break any attempt to support older clients (but that's fine I guess, we break support anyway). For future clients it won't be much of a problem apart from having a possibly unused channel indicator at the beginning of every message.
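To illustrate the channel-indicator idea, here is a minimal framing sketch. It is not Borg's wire format: the single channel byte comes from the suggestion above, while the 4-byte length field is my own assumption, added because a byte stream needs some way to tell where one message ends.

```python
import struct

STDOUT = 1  # channel indicators as suggested above
STDERR = 2

# Assumed frame header: 1 byte channel + 4 byte big-endian payload length.
HEADER = struct.Struct("!BI")


def mux(channel, payload):
    """Prefix a payload with its channel indicator and length."""
    return HEADER.pack(channel, len(payload)) + payload


def demux(buffer):
    """Split a received buffer into (channel, payload) frames.

    Returns the complete frames plus any trailing bytes of an incomplete
    frame, which the caller keeps until more data arrives.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        channel, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((channel, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


wire = mux(STDOUT, b"rpc-reply") + mux(STDERR, b"log line")
frames, rest = demux(wire)
```

On the client side the channel check happens right after reading from the fd, so everything tagged `STDERR` could be routed to the existing log handling unchanged.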
Yet another solution is to quite literally open a second channel for stderr (i.e. calling `socket.connect()` resp. `socket.accept()` twice and using the second connection for stderr). It's by far the easiest solution, doesn't require much work on both sides, however, it has the major disadvantage of being incompatible with a possible future TCP-based transport (connecting twice is fine for a unix socket, but not for TCP) and creates future BC issues in general (abolishing the second connection would be a BC-breaking change).
I'd vote for the second approach, it's a good compromise. But indeed, this idea is no longer as easy as I initially thought it would be :disappointed:
`logging.handlers.SocketHandler` maybe? Hmm, maybe rather not - that would need the borg client being the "log server" side (maybe not a big issue for UDS, but for TCP socket).
I'm not 100% sure what you mean. Are you referring to some log messages actually belonging to the server, not the client? I agree, this is a delicate topic, but there's no solution other than checking every single log message whether it's meant for the client, or the server. However, due to our "one client, one socket" policy, we don't really have to think about this right now: we just send everything to the client, the same way we already do for SSH. How we do this (see above), doesn't really matter IMHO. We won't really need this differentiation until we implement a "full Borg server".
about socket file in the repo directory:
that would limit operations to existing repos, we can't have the socket file in a repo dir that does not yet exist. so guess we rather forget about that.
Great work Thomas! :clap: :heart:
I agree. However, the question of what path to use for `--socket` by default remains an issue though...
The "right place" for the socket is `$XDG_RUNTIME_DIR` for unprivileged users, e.g. `$XDG_RUNTIME_DIR/borg/borg.sock` (let's reserve a directory, who knows for what we might wanna use it in the future...), and `/run/borg/borg.sock` for root.
However, there are two open questions with `$XDG_RUNTIME_DIR`:
- What to do when `$XDG_RUNTIME_DIR` is not set
- What to do when `$BORG_BASE_DIR` is set
First, what's the right place for the socket if `$XDG_RUNTIME_DIR` is not set? The XDG basedir spec doesn't really specify a default location for `$XDG_RUNTIME_DIR`, but Systemd basically made `/run/user/$UID` the default. We might do the same. However, if `/run/user/$UID` doesn't exist yet, we probably won't be able to create it due to insufficient permissions; thus I'd vote against `/run/user/$UID` as "implicit" default if `$XDG_RUNTIME_DIR` is not set and rather use a "safe" fallback. Again, the XDG basedir spec doesn't really tell us what to do, just to use "a replacement directory with similar capabilities". Some recommend `/tmp/run`, but IMHO this is a rather weird alternative... I'd rather recommend falling back to `$BORG_CACHE_DIR/run`.
Second, what shall we do when `$BORG_BASE_DIR` is set? AFAIK this variable was created before Borg followed the XDG basedir specs. Since `$BORG_BASE_DIR` plays pretty nice with the XDG basedir specs for configs and caches, there was no reason not to keep it. But the runtime dir is different. If we assume `$BORG_BASE_DIR` to be the definitive "I want all my stuff there" answer of the user, I'd say that if `$BORG_BASE_DIR` is set, we always use `$BORG_CACHE_DIR/run`, no matter whether `$XDG_RUNTIME_DIR` is set or not, for both unprivileged users and root.
Last, we should allow the user to give us a "definitive answer" where one wants his stuff... So a `$BORG_RUNTIME_DIR` variable should be added, too.
Just for record, I was also thinking about a `$BORG_SOCKET_DIR` variable, but would recommend to be less specific (i.e. better just use `$BORG_RUNTIME_DIR`): If we - some day, maybe never - add a "full Borg server", we will need a location for other runtime data, e.g. the daemon's PID. Adding yet another env variable for a single file is just overkill... That's also the reason why I recommend using `$BORG_CACHE_DIR/run`, not e.g. `$BORG_CACHE_DIR/sockets`.
So, let's wrap this up: We always use `$BORG_RUNTIME_DIR/borg.sock`. If `$BORG_RUNTIME_DIR` is set, use it. If it is not set, but `$BORG_BASE_DIR` is set, set `$BORG_RUNTIME_DIR` to `$BORG_CACHE_DIR/run` and use this location. If neither variable is set, we differentiate between root and unprivileged users: For root, set `$BORG_RUNTIME_DIR` to `/run/borg` and use this location. For unprivileged users, set `$BORG_RUNTIME_DIR` to `$XDG_RUNTIME_DIR/borg` if `$XDG_RUNTIME_DIR` is set, otherwise set it to `$BORG_CACHE_DIR/run` and use this location.
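The decision order from the wrap-up above, written out as a small sketch. This is just the proposal as stated in this comment, not what Borg implements; `resolve_runtime_dir` is a hypothetical name, and the already-resolved cache dir is passed in as a parameter to keep the example self-contained.

```python
import os


def resolve_runtime_dir(env, cache_dir, is_root):
    """Return the directory that would hold borg.sock (hypothetical helper).

    env is a mapping of environment variables, cache_dir is the resolved
    $BORG_CACHE_DIR, is_root tells whether we run as root.
    """
    if env.get("BORG_RUNTIME_DIR"):
        return env["BORG_RUNTIME_DIR"]  # explicit user choice always wins
    if env.get("BORG_BASE_DIR"):
        return os.path.join(cache_dir, "run")  # "all my stuff there"
    if is_root:
        return "/run/borg"
    if env.get("XDG_RUNTIME_DIR"):
        return os.path.join(env["XDG_RUNTIME_DIR"], "borg")
    return os.path.join(cache_dir, "run")  # safe fallback


cache = "/home/daniel/.cache/borg"  # example value only
xdg_case = resolve_runtime_dir({"XDG_RUNTIME_DIR": "/run/user/1000"}, cache, False)
root_case = resolve_runtime_dir({}, "/root/.cache/borg", True)
fallback_case = resolve_runtime_dir({}, cache, False)
socket_path = os.path.join(xdg_case, "borg.sock")
```

Note that root ignores `$XDG_RUNTIME_DIR` in this ordering, exactly as written in the wrap-up.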
For most of the logic, we want to use the `platformdirs` package, they have a `user_runtime_dir()` function:
https://github.com/platformdirs/platformdirs/blob/main/src/platformdirs/unix.py#L155
Misc. fs helpers already exist there:
https://github.com/borgbackup/borg/blob/master/src/borg/helpers/fs.py
Note: `platformdirs` does not special case `root`, nor do they have a `site` runtime dir.
socket file permissions:
due to `constants.UMASK_DEFAULT == 0o0077`, the socket file has `0o700` permissions (user only, no permissions for group).
Would that be a problem? Usually we have these scenarios:
- run borg client and server as same user: no problem
- run borg client as root and server as non-root: no problem either
Hmm, guess if one had a scenario where multiple different users would want to access the same `borg serve` socket, then we would need group permissions and all the users being members of that group.
socket file location for different user scenario:
obviously, the default socket location would not work, so guess the location of the socket created by `borg serve` has to be given to the borg client via `--socket=/that/path`.
Yes, group r+w is required for scenarios with different unprivileged users on the same machine; this includes my original scenario in which the `backup` user (or maybe a distinct `borg` user) runs `borg serve`. Using `umask 007` for the socket must be documented though; users must be aware that by using sockets, all members of the group `borg serve` is run with gain access to repos (and one might run `borg serve` with a group different from the primary group). For these scenarios one must always pass `--socket` (note: not for all scenarios though, e.g. for a proxied socket one won't necessarily need `--socket`). However, I don't think that this is an issue, because such setups always require manual setup anyway (that's what I meant with "How `borg serve` is being invoked is up to the user")...
About using `user_runtime_dir()` from the `platformdirs` package: It looks like this lib works fine for unprivileged users, but yields `/run/user/0` for root. This is a rather unorthodox location. As common practice it should be just `/run` for root.
@PhrozenByte could you file a bug at `platformdirs`?
@PhrozenByte could you file a bug at `platformdirs`?
ref https://github.com/platformdirs/platformdirs/issues/188
@PhrozenByte merged stuff into master.
if you have time, guess it would be good to give it some practical testing.
maybe the long running `borg serve --socket` shows some yet unknown issues.