[2.6 beta1 w/ dco] SIGUSR1 causing crash. persist-remote-ip but with static remote ip specified(?)
Describe the bug with persist-tun:
2022-12-07 16:13:31 UDPv4 link local: (not bound)
2022-12-07 16:13:31 UDPv4 link remote: [AF_INET]127.0.0.1:12345
2022-12-07 16:13:31 dco_do_write: netlink reports error (-1): Unspecific failure
2022-12-07 16:13:31 dco_do_write: failed to send netlink message: No route to host (-113)
2022-12-07 16:13:31 write UDPv4 []: Success (fd=4,code=0)
2022-12-07 16:13:33 dco_do_write: netlink reports error (-1): Unspecific failure
2022-12-07 16:13:33 dco_do_write: failed to send netlink message: No route to host (-113)
2022-12-07 16:13:33 write UDPv4 []: Success (fd=4,code=0)
Then it gets stuck without crashing; a SIGHUP can "fix" it.
without persist-tun:
2022-12-07 16:13:31 UDPv4 link local: (not bound)
2022-12-07 16:13:31 UDPv4 link remote: [AF_INET]127.0.0.1:12345
crashed
To Reproduce Ubuntu 22. dco on.
client config:
remote 127.0.0.1:12345
persist-local-ip
persist-remote-ip
persist-tun
persist-key
Expected behavior 2.5.8 works fine. With persist-remote-ip removed, 2.6 beta1 works as well as 2.5.8.
Version information (please complete the following information):
- OS: Ubuntu 22.04
- OpenVPN version: 2.6 beta1
That client config looks incomplete - please show the full client config (without the actual key material, of course).
Is this part of a real setup? Talking to 127.0.0.1 would not work normally for a production setup.
yes, 127.0.0.1:12345 is another tunnel, actual remote route manually bypassed
"yes" is not an answer to "please show full client config" - if I'm to fix this issue, I need to understand what you are trying to achieve.
#persist-tun
proto udp
tun-mtu 1428
remote 127.0.0.1 12345
persist-local-ip
explicit-exit-notify 2
connect-retry 1 3
client
nobind
allow-compression no
data-ciphers AES-128-GCM
auth-nocache
script-security 2
verb 3
route-up /etc/openvpn/route-up.sh
route-pre-down /etc/openvpn/route-pre-down.sh
remote-cert-tls server
tls-crypt /etc/openvpn/tlscrypt.key
ca /etc/openvpn/ca.crt
cert /etc/openvpn/c.crt
key /etc/openvpn/k.key
persist-key
ping 0
ping-restart 3600
replay-window 5000 3
mute 8
mlock
fast-io
A very generic config
I'll move on without persist-remote-ip, because the remote is not a domain name anyway, but there is a bug somewhere.
there is lots of stuff in your config that makes little sense, like fast-io or ping-restart 3600 together with ping 0. Also persist-local-ip only makes sense in a config where you have bind and local $domainname... and auth-nocache makes no sense if you don't actually have any cacheable authentication enabled.
That said, it's still likely that the combination of --persist-remote-ip and DCO is broken, so we'll have a very close look.
I don't want a static ping interval (to avoid some detection), and ping-restart 3600 could just as well be ping-restart 360000 or whatever, because it will almost never be triggered. auth-nocache is left over from the password auth I used before cert + TLS. I have no idea what fast-io actually does; OpenVPN doesn't use sendmmsg(?) for some reason?
Both ping and ping-restart default to 0 = off.
sendmmsg() is not used because nobody has coded it yet, but with DCO this point has become moot: the data path can now go into the kernel, and userland packet handling can stay as it is (a single packet at a time). Initial experiments with sendmmsg() did not result in such a great improvement, but at the same time made the code more complex.
Another question, tho - what is listening on 127.0.0.1:12345? Is this a local OpenVPN instance?
Yes, with DCO this doesn't matter anymore; DCO is even multithreaded(?). 12345 is an obfuscation tunnel. The actual remote is 100ms away, and this has worked pretty well for 5+ years. I'm not sure whether persist-local-ip will prevent the local endpoint from switching ports, because a port switch will cause the 12345 tunnel to be renegotiated.
the localport can be specified with --lport and it will remain static.
as long as there is nobind, nobody cares about lport...
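A minimal shell sketch of that point, assuming (as stated in the comment above) that lport has no effect while nobind is present in the same config. The function name lport_is_effective and the port number in the usage note are made up for illustration; the check only looks at a single flat config file and does not follow included files or command-line overrides.

```shell
#!/bin/sh
# lport_is_effective CONFIG
# Returns 0 when CONFIG sets lport and does not also set nobind,
# non-zero otherwise. Hypothetical helper, not part of OpenVPN.
lport_is_effective() {
    conf=$1
    # Directive must start the line (leading whitespace allowed), so
    # lines commented out with '#' or ';' do not match.
    grep -Eq '^[[:space:]]*lport[[:space:]]+[0-9]+' "$conf" || return 1
    # nobind anywhere in the file means no local bind, hence no fixed port.
    if grep -Eq '^[[:space:]]*nobind([[:space:]]|$)' "$conf"; then
        return 1
    fi
    return 0
}
```

For example, a config containing both "nobind" and "lport 40000" makes the function return non-zero, while the same config with the nobind line removed makes it return 0.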
LOL, I checked the log and it did change, but it's so fast that I never noticed. I'll change it after the next maintenance XD.
running my client this way, but can't generate any crash:
openvpn --dev tun \
  --client \
  --ca ../../test-pki/pki/ca.crt \
  --cert ../../test-pki/pki/issued/client1.crt \
  --key ../../test-pki/pki/private/client1.key \
  --verb 3 \
  --persist-local-ip \
  --persist-remote-ip \
  --persist-tun \
  --persist-key \
  --fast-io \
  --connect-retry 1 3 \
  --nobind \
  --allow-compression no \
  --auth-nocache \
  --script-security 2 \
  --ping 0 \
  --ping-restart 3600 \
  --replay-window 5000 3 \
  --mlock \
  --tun-mtu 1428 \
  --remote 127.0.0.1 12345
maybe it is also related to what is listening on the other side?
I get
2022-12-07 11:43:34 UDPv4 link local: (not bound)
2022-12-07 11:43:34 UDPv4 link remote: [AF_INET]127.0.0.1:12345
2022-12-07 11:43:34 read UDPv4 [ECONNREFUSED]: Connection refused (fd=3,code=111)
but even if I add some dumb listener on that port, I get nothing
2022-12-07 11:49:38 UDPv4 link local: (not bound)
2022-12-07 11:49:38 UDPv4 link remote: [AF_INET]127.0.0.1:12345
Did you first get an established tunnel and then send a SIGUSR1?
oh ok, I missed that. thanks
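The order described above (tunnel established first, then the signal) could be scripted roughly like this. It is only a sketch: send_restart is a made-up helper name, and the commented usage assumes a single openvpn process on the machine and that pidof is available.

```shell
#!/bin/sh
# send_restart PID
# Deliver SIGUSR1 to the given process; for OpenVPN this requests a
# restart of the running instance. Hypothetical helper for illustration.
send_restart() {
    kill -USR1 "$1"
}

# Usage (not executed here). Wait until the client has an established
# tunnel before sending the signal:
#   send_restart "$(pidof openvpn)"
```

Sending the signal before the tunnel is up exercises a different code path, which may be why the first attempt above showed nothing.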
Huh, it turns out it's a bad idea to use a static local port. Because the server does not get an explicit-exit-notify when the client gets a SIGUSR1, it will wait until the TLS timeout (around 15s in my config) if the port is the same. Expected behavior or bug?
Running killall openvpn on the server gives this:
2022-12-07 11:26:09 event_wait : Interrupted system call (fd=-1,code=4)
2022-12-07 11:26:09 SENT CONTROL [Client]: 'RESTART' (status=1)
2022-12-07 11:26:11 Closing DCO interface
but client receives nothing...?
please do not mix unrelated problems in one GH issue. Also, you talk about "client getting SIGUSR1" and show a log file from the server getting a SIGINT. Please open a new issue about the --explicit-exit-notify problem, and include client and server logs, with --verb 3 each.
I'd like to do so. Later.
Can we close this ticket? I don't think there has been any other report about this issue, and it was reported against what is by now a very old version.