zrepl
ssh push job example
I am trying to set up zrepl over ssh with a push+sink configuration and can't quite make it work. I've configured the sender with ssh+stdinserver:
jobs:
- name: vms_to_central
  type: push
  connect:
    type: ssh+stdinserver
    host: 192.168.10.11
    user: root
    port: 22
    identity_file: /root/.ssh/id_rsa
  ...
and the receiver:
jobs:
- name: dus_vms_sink
  type: sink
  root_fs: "backups"
  serve:
    type: stdinserver
    client_identities:
    - "<sender_hostname>"
I guess there is some obvious mistake in the config, but I can't see it. It works with tcp, but I would prefer ssh. Sorry if there is an answer in the docs; I could not find it. Is it possible to set up a push+sink config via ssh?
Please provide the error messages and/or log output that you are encountering.
Regarding "Is it possible to set up a push+sink config via ssh?": yes, it's definitely possible.
Please familiarize yourself with https://zrepl.github.io/configuration/transports.html#ssh-stdinserver-transport
Also, a common problem is that the known_hosts entry must already exist.
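For reference, a rough sketch of the two SSH-side prerequisites. It assumes the zrepl daemon on the sender runs as root (as the identity_file path in the config above suggests) and that the receiver's authorized_keys uses the forced-command form described on the linked doc page. The helper names has_known_host and authorized_keys_line are made up for illustration. The client identity passed to zrepl stdinserver has to match an entry in the sink's client_identities.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths and users are assumptions.

# Return 0 if the known_hosts file already contains an entry for the host.
# ssh-keygen -F searches a known_hosts file for a host name or address.
has_known_host() {
    host="$1"
    known_hosts_file="${2:-$HOME/.ssh/known_hosts}"
    ssh-keygen -F "$host" -f "$known_hosts_file" >/dev/null 2>&1
}

# Print an authorized_keys line for the receiver that forces
# "zrepl stdinserver <client identity>" for the given public key.
# "restrict" needs a reasonably recent OpenSSH; older versions need the
# explicit no-* options listed in the zrepl docs instead.
authorized_keys_line() {
    client_identity="$1"
    public_key="$2"
    printf 'command="zrepl stdinserver %s",restrict %s\n' \
        "$client_identity" "$public_key"
}

# Example usage (not executed here), on the sender as root:
#   has_known_host 192.168.10.11 || echo "no known_hosts entry yet"
#   authorized_keys_line sender-hostname "$(cat /root/.ssh/id_rsa.pub)"
```

If has_known_host reports nothing, connecting once by hand as the same user and accepting the host key is usually enough to create the entry.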
Thank you for your response. I initially used the ssh-stdinserver-transport example from the docs as the starting point and adapted it from the source+pull scheme to push+sink. My guess was that I messed up the adaptation.
Here is the output of systemctl status zrepl:
Mar 15 11:03:28 sender-hostname zrepl[21145]: [vms_to_central][rpc.ctrl][kNoL$aK0q$aK0q.cAaS.4gSr]: cannot connect err="dial_timeout of 10s exceeded"
Mar 15 11:03:28 sender-hostname zrepl[21145]: [vms_to_central][repl][kNoL$aK0q$aK0q.cAaS.4gSr.TUvH./XUf]: error listing receiver filesystems err="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial_timeout of 10s exceeded"" errType="*status.statusError"
Mar 15 11:03:28 sender-hostname zrepl[21145]: [vms_to_central][repl][kNoL$aK0q$aK0q.cAaS.4gSr.TUvH]: most recent error in this attempt attempt_number="0" err="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial_timeout of 10s exceeded""
Mar 15 11:03:28 sender-hostname zrepl[21145]: [vms_to_central][repl][kNoL$aK0q$aK0q.cAaS.4gSr.TUvH]: temporary connectivity-related error identified, start waiting for reconnect deadline="2021-03-15 11:13:28.696991772 +0100 CET m=+1265.390770607" attempt_number="0"
Mar 15 11:03:38 sender-hostname zrepl[21145]: [vms_to_central][rpc.data][kNoL$aK0q$OpTB$OpTB.32PA]: ping failed err="dial_timeout of 10s exceeded"
Mar 15 11:03:38 sender-hostname zrepl[21145]: [vms_to_central][rpc.ctrl][kNoL$aK0q$OpTB$OpTB.32PA]: ping failed err="rpc error: code = Canceled desc = context canceled"
Mar 15 11:03:38 sender-hostname zrepl[21145]: [vms_to_central][repl][kNoL$aK0q$aK0q.cAaS.4gSr.TUvH]: reconnecting failed, aborting run err="receiver is not reachable: control and data rpc failed to respond to ping rpcs" attempt_number="0"
Mar 15 11:03:41 sender-hostname zrepl[21145]: [vms_to_central][rpc.ctrl][kNoL$aK0q$aK0q.cAaS.4gSr]: cannot connect err="dial_timeout of 10s exceeded"
Meanwhile, connecting to the receiver manually with ssh works fine with key auth.
A couple of questions:
- Which zrepl version are you using?
- Which OS & OpenSSH version are you using?
  - Check specifically whether your OpenSSH version supports the restrict keyword in the forced-command entry, if you used it (see the tip highlight in the doc page that I linked to).
- What about the sink? Does it show any logs?
- The logs indicate a timeout error. That's atypical for config-level errors. Could you check whether the connecting side actually makes a connection attempt, using tcpdump?
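In case it helps, here is roughly how that information could be collected on the sender. This is a sketch under assumptions: the host and port are taken from the connect section of the push job, -i any assumes Linux, and ssh_capture_filter is a made-up helper that only builds the tcpdump filter expression.

```shell
#!/bin/sh
# Sketch only: ssh_capture_filter is a hypothetical helper.

# Print a tcpdump filter expression matching TCP traffic to or from the
# receiver's SSH port (port defaults to 22).
ssh_capture_filter() {
    host="$1"
    port="${2:-22}"
    printf 'host %s and tcp port %s\n' "$host" "$port"
}

# Example usage (not executed here), on the sender as root:
#   zrepl version    # zrepl client and daemon versions
#   ssh -V           # OpenSSH version
#   tcpdump -n -i any "$(ssh_capture_filter 192.168.10.11 22)"
# Then wait for the next replication attempt and watch for outgoing SYN packets.
```

If no packets show up while the job reports dial_timeout, the ssh client is probably not being started or is blocking before it connects; if packets do show up, the sink side is the next place to look.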