
Client tunnel with DCO aborts if tun iface was created with --mktun

Open alexvelkov1 opened this issue 1 year ago • 10 comments

Hello everybody,

I have an issue with an OpenVPN tunnel on the client that does not come UP, because it cannot create a DCO interface - Error: sitnl_send: rtnl: generic error (-17): File exists

The OpenVPN client runs the latest v2.6.6, compiled with DCO, on an ARM device with the DCO kernel module loaded. It connects to an OpenVPN server running v2.4.7 on Ubuntu 20.04. An important notice to mention is that I create the TUN device with openvpn --mktun --dev tunG --dev-type tun in advance before initializing the tunnel.

The tunnel is based on certificates and the authentication passes without any problems, so I think that the server logs are irrelevant for this issue.

Thanks for the help.

Client infos and logs:

openvpn --version
OpenVPN 2.6.6 arm-linux-musleabi [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] [DCO]
library versions: OpenSSL 1.1.1u 30 May 2023, LZO 2.10
DCO version: 2.0.0
Originally developed by James Yonan
Copyright (C) 2002-2023 OpenVPN Inc <[email protected]>
Compile time defines: enable_async_push=no enable_comp_stub=no enable_crypto_ofb_cfb=yes enable_dco=yes enable_dco_arg=yes enable_debug=no enable_dependency_tracking=no enable_dlopen=unknown enable_dlopen_self=unknown enable_dlopen_self_static=unknown enable_doc=no enable_docs=no enable_documentation=no enable_fast_install=yes enable_fragment=yes enable_gtk_doc=no enable_gtk_doc_html=no enable_iproute2=no enable_ipv6=yes enable_libtool_lock=yes enable_lz4=yes enable_lzo=yes enable_management=no enable_multihome=no enable_nls=no enable_pam_dlopen=no enable_pedantic=no enable_pkcs11=no enable_plugin_auth_pam=no enable_plugin_down_root=no enable_plugins=no enable_port_share=yes enable_selinux=no enable_shared=yes enable_shared_with_static_runtimes=no enable_small=no enable_static=no enable_strict=no enable_strict_options=no enable_systemd=no enable_unit_tests=no enable_werror=no enable_win32_dll=yes enable_wolfssl_options_h=yes enable_x509_alt_username=no with_aix_soname=aix with_crypto_library=openssl with_fop=no with_gnu_ld=yes with_mem_check=no with_openssl_engine=no with_sysroot=no with_xmlto=no

# cat tunG.conf
float
resolv-retry 60
remote-random
remote 172.16.0.9 1194 udp
ifconfig-noexec
lport 1194
dev tunG
dev-type tun
cipher AES-128-CBC
auth SHA256
keepalive 10 120
tls-client
ca /var/etc/openvpn/tunG/ca.pem
cert /var/etc/openvpn/tunG/my_cert.pem
key /var/etc/openvpn/tunG/my_key.pem
tls-verify "..."
pull
replay-window 64
script-security 2
up-delay
up-restart
block-ipv6
verb 7
#
# Logs:

Aug 30 14:31:38 notic tunG    [ 3224]: OpenVPN 2.6.6 arm-linux-musleabi [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] [DCO]
Aug 30 14:31:38 notic tunG    [ 3224]: library versions: OpenSSL 1.1.1u  30 May 2023, LZO 2.10
Aug 30 14:31:38 notic tunG    [ 3224]: DCO version: 2.0.0
Aug 30 14:31:38 warn  tunG    [ 3226]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Aug 30 14:31:38 notic tunG    [ 3226]: MTU: adding 426 buffer tailroom for compression for 1768 bytes of payload
Aug 30 14:31:38 notic tunG    [ 3226]: Control Channel MTU parms [ mss_fix:0 max_frag:0 tun_mtu:1250 tun_max_mtu:0 headroom:126 payload:1600 tailroom:126 ET:0 ]
Aug 30 14:31:38 notic tunG    [ 3226]: Data Channel MTU parms [ mss_fix:0 max_frag:0 tun_mtu:1500 tun_max_mtu:1600 headroom:136 payload:1768 tailroom:562 ET:0 ]
Aug 30 14:31:38 notic tunG    [ 3226]: Local Options String (VER=V4): 'V4,dev-type tun,link-mtu 1569,tun-mtu 1500,proto UDPv4,auth SHA256,keysize 128,key-method 2,tls-client'
Aug 30 14:31:38 notic tunG    [ 3226]: Expected Remote Options String (VER=V4): 'V4,dev-type tun,link-mtu 1569,tun-mtu 1500,proto UDPv4,auth SHA256,keysize 128,key-method 2,tls-server'
Aug 30 14:31:38 notic tunG    [ 3226]: TCP/UDP: Preserving recently used remote address: [AF_INET]172.16.0.9:1194
Aug 30 14:31:38 notic tunG    [ 3226]: Socket Buffers: R=[180224->180224] S=[180224->180224]
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 link local (bound): [AF_INET][undef]:1194
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 link remote: [AF_INET]172.16.0.9:1194
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [14] to [AF_INET]172.16.0.9:1194: P_CONTROL_HARD_RESET_CLIENT_V2 kid=0 [ ] pid=0 DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [26] from [AF_INET]172.16.0.9:1194: P_CONTROL_HARD_RESET_SERVER_V2 kid=0 [ 0 ] pid=0 DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: TLS: Initial packet from [AF_INET]172.16.0.9:1194, sid=91eeeb7a 9bb7d9e2
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [299] to [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ 0 ] pid=1 DATA len=273
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [22] from [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 1 ] DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [1188] from [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ ] pid=1 DATA len=1174
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [26] to [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 1 0 ] DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [1188] from [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ ] pid=2 DATA len=1174
Aug 30 14:31:38 notic tunG    [ 3226]: VERIFY SCRIPT OK: ...
Aug 30 14:31:38 notic tunG    [ 3226]: VERIFY OK: ...
Aug 30 14:31:38 notic tunG    [ 3226]: VERIFY SCRIPT OK: depth=0, ...
Aug 30 14:31:38 notic tunG    [ 3226]: VERIFY OK: depth=0, ...
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [30] to [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 2 1 0 ] DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [207] from [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ ] pid=3 DATA len=193
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [1222] to [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ 3 2 1 0 ] pid=2 DATA len=1184
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [1115] to [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ 3 2 1 0 ] pid=3 DATA len=1077
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [22] from [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 2 ] DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [184] from [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ 3 ] pid=4 DATA len=158
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [34] to [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 4 3 2 1 ] DATA len=0
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 READ [235] from [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ ] pid=5 DATA len=221
Aug 30 14:31:38 notic tunG    [ 3226]: Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, peer certificate: 2048 bit RSA, signature: RSA-SHA1
Aug 30 14:31:38 notic tunG    [ 3226]: [testdevice2] Peer Connection Initiated with [AF_INET]172.16.0.9:1194
Aug 30 14:31:38 notic tunG    [ 3226]: TLS: move_session: dest=TM_ACTIVE src=TM_INITIAL reinit_src=1
Aug 30 14:31:38 notic tunG    [ 3226]: TLS: tls_multi_process: initial untrusted session promoted to trusted
Aug 30 14:31:38 notic tunG    [ 3226]: UDPv4 WRITE [34] to [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 5 4 3 2 ] DATA len=0
Aug 30 14:31:39 notic tunG    [ 3226]: SENT CONTROL [testdevice2]: 'PUSH_REQUEST' (status=1)
Aug 30 14:31:39 notic tunG    [ 3226]: UDPv4 WRITE [73] to [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ 5 4 3 2 ] pid=4 DATA len=35
Aug 30 14:31:39 notic tunG    [ 3226]: UDPv4 READ [22] from [AF_INET]172.16.0.9:1194: P_ACK_V1 kid=0 [ 4 ] DATA len=0
Aug 30 14:31:39 notic tunG    [ 3226]: UDPv4 READ [170] from [AF_INET]172.16.0.9:1194: P_CONTROL_V1 kid=0 [ ] pid=6 DATA len=156
Aug 30 14:31:39 notic tunG    [ 3226]: PUSH: Received control message: 'PUSH_REPLY,route-gateway 10.5.0.1,topology subnet,ping 5,ping-restart 30,ifconfig 10.5.0.2 255.255.255.0,peer-id 0,cipher AES-256-GCM'
Aug 30 14:31:39 notic tunG    [ 3226]: OPTIONS IMPORT: timers and/or timeouts modified
Aug 30 14:31:39 notic tunG    [ 3226]: OPTIONS IMPORT: --ifconfig/up options modified
Aug 30 14:31:39 notic tunG    [ 3226]: OPTIONS IMPORT: route-related options modified
Aug 30 14:31:39 notic tunG    [ 3226]: OPTIONS IMPORT: peer-id set
Aug 30 14:31:39 notic tunG    [ 3226]: OPTIONS IMPORT: data channel crypto options modified
Aug 30 14:31:39 notic tunG    [ 3226]: open_tun_dco: tunG
Aug 30 14:31:39 notic tunG    [ 3226]: net_iface_new: add tunG type ovpn-dco
Aug 30 14:31:39 notic tunG    [ 3226]: sitnl_send: checking for received messages
Aug 30 14:31:39 notic tunG    [ 3226]: sitnl_send: rtnl: received 96 bytes
Aug 30 14:31:39 warn  tunG    [ 3226]: sitnl_send: rtnl: generic error (-17): File exists
Aug 30 14:31:39 notic tunG    [ 3226]: Cannot create DCO interface tunG: -17
Aug 30 14:31:39 notic tunG    [ 3226]: DCO device tunG already exists, won't be destroyed at shutdown
Aug 30 14:31:39 notic tunG    [ 3226]: /sbin/openvpn-up-down.sh tunG tunG 1500 0 10.5.0.2 255.255.255.0 init
Aug 30 14:31:39 notic tunG    [ 3226]: dco_new_peer: peer-id 0, fd 3, remote addr: [AF_INET]172.16.0.9:1194
Aug 30 14:31:39 err   tunG    [ 3226]: dco_new_peer: netlink reports device not found:
Aug 30 14:31:39 notic tunG    [ 3226]: Exiting due to fatal error
Aug 30 14:31:39 notic tunG    [ 3226]: Closing DCO interface
Aug 30 14:31:39 notic tunG    [ 3226]: close_tun_dco
Aug 30 14:31:39 notic tunG    [ 3226]: net_iface_del: delete tunG
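
The -17 in the sitnl_send line is a negated errno value, which is how netlink reports failures. As a small illustration (a sketch, assuming a Linux host where EEXIST is 17; describe_netlink_error is a hypothetical helper, not part of OpenVPN), the code can be mapped back to its symbolic name and message like this:

```python
import errno
import os


def describe_netlink_error(code: int) -> str:
    """Turn a negative netlink error code (e.g. -17) into 'NAME: message'."""
    err = -code  # netlink carries errors as negated errno values
    name = errno.errorcode.get(err, "UNKNOWN")
    return f"{name}: {os.strerror(err)}"


# The code from the log above; on Linux this names EEXIST.
print(describe_netlink_error(-17))
```

EEXIST here means the kernel refused to create a new ovpn-dco link named tunG because an interface with that name was already present.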

Server infos:

# openvpn --version
OpenVPN 2.4.7 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Sep 5 2019
library versions: OpenSSL 1.1.1f 31 Mar 2020, LZO 2.10
Originally developed by James Yonan
Copyright (C) 2002-2018 OpenVPN Inc <[email protected]>
Compile time defines: enable_async_push=no enable_comp_stub=no enable_crypto=yes enable_crypto_ofb_cfb=yes enable_debug=yes enable_def_auth=yes enable_dependency_tracking=no enable_dlopen=unknown enable_dlopen_self=unknown enable_dlopen_self_static=unknown enable_fast_install=needless enable_fragment=yes enable_iproute2=yes enable_libtool_lock=yes enable_lz4=yes enable_lzo=yes enable_maintainer_mode=no enable_management=yes enable_multihome=yes enable_pam_dlopen=no enable_pedantic=no enable_pf=yes enable_pkcs11=yes enable_plugin_auth_pam=yes enable_plugin_down_root=yes enable_plugins=yes enable_port_share=yes enable_selinux=no enable_server=yes enable_shared=yes enable_shared_with_static_runtimes=no enable_silent_rules=no enable_small=no enable_static=yes enable_strict=no enable_strict_options=no enable_systemd=yes enable_werror=no enable_win32_dll=yes enable_x509_alt_username=yes with_aix_soname=aix with_crypto_library=openssl with_gnu_ld=yes with_mem_check=no with_sysroot=no

alexvelkov1 avatar Aug 30 '23 16:08 alexvelkov1

An important notice to mention is that I create the TUN device with openvpn --mktun --dev tunG --dev-type tun in advance before initializing the tunnel.

When running the command above, did you get the following warning?

Note: --mktun does not support DCO. Creating TUN interface.

That message tells you that --mktun cannot create a DCO interface, so you get a classic tun device instead. On startup, OpenVPN recognizes that a non-DCO interface with the same name already exists and aborts.

This is expected and the solution is to simply not use any --mktun before running OpenVPN.

I hope it helps.

ordex avatar Aug 30 '23 22:08 ordex

Hi,

On Wed, Aug 30, 2023 at 03:33:37PM -0700, Antonio Quartulli wrote:

This is expected and the solution is to simply not use any --mktun before running OpenVPN.

Nah, not really. We had a long discussion on this, when merging the DCO patch set, and the outcome I remember was "if a tun device is already existing, OpenVPN detects that (the logs confirm that this part works) and uses it as non-DCO device".

This did work at some point, but it might have been broken due to some later change (shuffling the dco-disabling code around)...

gert

Gert Doering - Munich, Germany @.***

cron2 avatar Aug 31 '23 07:08 cron2

Hi,

On Wed, Aug 30, 2023 at 09:35:18AM -0700, alexvelkov1 wrote:

I have an issue with an OpenVPN tunnel on the client that does not come UP, because it cannot create a DCO interface - Error: sitnl_send: rtnl: generic error (-17): File exists

As a quick workaround: if you run a config with a pre-existing tun device (mktun), run OpenVPN with --disable-dco.

Out of curiosity: why exactly do you use --mktun? To get a tun device with a specific name, you can just use "--dev tunG", no --mktun required.
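
A minimal sketch of that workaround, assuming a launcher script that already knows whether the device was pre-created. The helper name build_openvpn_argv and its parameters are hypothetical; --config and --disable-dco are the only OpenVPN options used:

```python
from typing import List


def build_openvpn_argv(config_path: str, device_precreated: bool) -> List[str]:
    """Build an openvpn command line for the given config file."""
    argv = ["openvpn", "--config", config_path]
    if device_precreated:
        # Workaround from this thread: a device made with --mktun is a
        # classic tun interface, so DCO is switched off explicitly.
        argv.append("--disable-dco")
    return argv


print(build_openvpn_argv("tunG.conf", device_precreated=True))
```

The resulting list can be handed to subprocess.run() by whatever script currently starts the tunnel.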

gert

Gert Doering - Munich, Germany @.***

cron2 avatar Aug 31 '23 08:08 cron2

Hi Gert,

As a quick workaround: if you run a config with a pre-existing tun device (mktun), run OpenVPN with --disable-dco.

Thanks for the workaround. It is truly strange that the OpenVPN process aborted abruptly while bringing the tunnel up. I think it would have been better to continue with the non-DCO device (e.g. with an additional notice in the logs).

Out of curiosity: why exactly do you use --mktun?

Of course. We have embedded the OpenVPN tunnel in a more complex setup that, for example, sets additional iptables rules on the device. At the time we thought it was a good idea to set up the tunnel in two parts: first the device (which is useful, because the device already exists before the tunnel comes up) and then the actual tunnel. We expected this to bring performance and security benefits. Now DCO is a nice, major feature, so we will rethink our procedure. Out of curiosity :), why is --mktun problematic with DCO?

Alex

alexvelkov1 avatar Aug 31 '23 09:08 alexvelkov1

Reopening as we want to clarify what the actual behaviour should be.

ordex avatar Aug 31 '23 14:08 ordex

Hi,

On Thu, Aug 31, 2023 at 02:29:47AM -0700, alexvelkov1 wrote:

Out of curiosity :), why is --mktun problematic with DCO?

The code inside OpenVPN is very different for "tun" and "DCO" devices, and the default code path is "use DCO, if there are no options that conflict with it".

Now, "--dev tunG" is not "conflicting", so the code assumes "we want DCO", and later discovers "oh, the interface has already been created, and is not a DCO device, so what do I do now? aaah, confusion!".

I am fairly sure we had an answer to this in some iteration of the pre-2.6.0 code ("what do I do now? file a complaint, and use tun mode"), but this is a niche use, using a niche code path, and these tend to break when related stuff is rewritten.
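
The behaviour described here can be sketched as a small decision function. This is an illustrative Python model under stated assumptions, not OpenVPN code: the function and parameter names are hypothetical, and the fallback branch reflects the intended behaviour ("file a complaint, and use tun mode"), not what 2.6.6 does for tunG.

```python
def select_data_path(dco_supported: bool,
                     conflicting_options: bool,
                     iface_exists: bool,
                     iface_is_dco: bool) -> str:
    """Return 'dco' or 'tun' for the intended startup behaviour."""
    # Default path: use DCO unless it is unavailable or an option conflicts.
    if not dco_supported or conflicting_options:
        return "tun"
    # A pre-existing interface that is not a DCO device cannot be used
    # for DCO, so the intended reaction is to fall back to tun mode.
    if iface_exists and not iface_is_dco:
        return "tun"
    return "dco"


# Pre-created classic tun device, as in this report: fall back to tun.
print(select_data_path(True, False, iface_exists=True, iface_is_dco=False))
```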

We do test this stuff :-) - but there are too many combinations of options/remotes to test in limited time, so niche features get tested "when it looks like it might break", but not always in automated ways.

gert

Gert Doering - Munich, Germany @.***

cron2 avatar Aug 31 '23 16:08 cron2

what @cron2 says is what we have also written in the code:

    /* if the device name is fixed, we need to check if an interface with this
     * name already exists. IF it does, it must be a DCO interface, otherwise
     * DCO has to be disabled in order to continue.
     */

but this logic does not kick in: it runs only when tun_name_is_fixed() returns true, and that happens only when the dev name has a digit in it. Therefore tunG is not considered a static name, and the DCO code expects a number to be appended to it until the first free device is found.

OTOH the code defining the name does assume that tunG is a static iface name because it is != "tun". Therefore it assumes that the existing iface is a DCO device and moves on. However the new-peer call obviously fails as the iface is a simple tun device.

So these two pieces of code were meant to check the same thing, but they do it differently.
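
The mismatch can be reproduced with a toy model. This is a sketch in Python, not the OpenVPN source: both function names are hypothetical, and each one encodes only the rule as described above (a digit in the name versus a name different from "tun").

```python
def fixed_by_digit_rule(dev: str) -> bool:
    """First rule as described: the name is fixed if it contains a digit."""
    return any(ch.isdigit() for ch in dev)


def fixed_by_name_rule(dev: str, dev_type: str = "tun") -> bool:
    """Second rule as described: the name is fixed if it is not just 'tun'."""
    return dev != dev_type


def rules_disagree(dev: str) -> bool:
    """True when the two rules classify the same device name differently."""
    return fixed_by_digit_rule(dev) != fixed_by_name_rule(dev)


for name in ("tun", "tun3", "tunG", "tunnel1"):
    print(name, fixed_by_digit_rule(name), fixed_by_name_rule(name))
```

Of these four names, only tunG is classified differently by the two rules, which matches the failure reported in this issue.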

ordex avatar Aug 31 '23 20:08 ordex

As a quick workaround: if you run a config with a pre-existing tun device (mktun), run OpenVPN with --disable-dco.

I had a quick look for mktun in the OpenVPN 2.6 manual and found nothing about DCO or this incompatibility.

I was wondering if it wouldn't be easier to implement the --disable-dco functionality directly into the openvpn binary by default for v2.6 on --mktun. This would just implicitly disable DCO and would allow the tunnel not to break when using the option --- > ahh my mistake, sorry .. --disable-dco is an option for the OpenVPN tunnel, not a standalone option. So, ignore this paragraph.

As to the naming of the TUN/TAP device with respect to DCO .. I personally find it rather problematic to make assumptions about whether DCO is possible or not based on the naming of the device - just my humble opinion.

alexvelkov1 avatar Sep 01 '23 09:09 alexvelkov1

Hi,

On Fri, Sep 01, 2023 at 02:49:37AM -0700, alexvelkov1 wrote:

I was wondering if it wouldn't be easier to implement the --disable-dco functionality directly into the openvpn binary by default for v2.6 on --mktun. This would just implicitly disable DCO and would allow the tunnel not to break when using --mktun.

Well, --mktun is a separate call. The next time you call OpenVPN, it has no memory that you ran --mktun previously, so it can't "disable DCO on --mktun" (also, tun interfaces can be created by "ip" these days).
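
Since nothing is remembered between calls, the only option is to inspect the system state at startup. The following is a rough Python sketch of such a check from the outside, assuming Linux sysfs, where the tun driver exposes a tun_flags attribute for tun/tap devices. classify_interface is a hypothetical helper, and an "other" result does not prove that the device is an ovpn-dco interface.

```python
import os


def classify_interface(name: str, sysfs_net: str = "/sys/class/net") -> str:
    """Return 'absent', 'tun/tap' or 'other' for a network interface name."""
    iface_dir = os.path.join(sysfs_net, name)
    if not os.path.isdir(iface_dir):
        return "absent"
    # Assumption: classic tun/tap devices carry a tun_flags attribute.
    if os.path.exists(os.path.join(iface_dir, "tun_flags")):
        return "tun/tap"
    return "other"


print(classify_interface("tunG"))
```

A launcher script could run this before starting OpenVPN and decide whether to pass --disable-dco.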

As to the naming of the TUN/TAP device with respect to DCO .. I personally find it rather problematic to make assumptions about whether DCO is possible or not based on the naming of the device - just my humble opinion.

Naming of tun/tap devices is a complicated story. On Linux, such a device can have any name it wants (so you can have a "tun" device named "tap3"), and people are very creative in their ways... on other OSes, far fewer variants are possible.

From what Ordex found, it might be possible that your use case works just fine if doing "--mktun --dev tun3", and not "tunG". I bet we only tested the numeric variant.

gert

Gert Doering - Munich, Germany @.***

cron2 avatar Sep 02 '23 09:09 cron2

Hi,

On Linux, such a device can have any name it wants (so you can have a "tun" device named "tap3"), and people are very creative in their ways... on other OSes, far fewer variants are possible.

Yes, that is true. I think it should be possible to use as much functionality as each OS allows, rather than restricting everything to the lowest common denominator that fits all platforms.

From what Ordex found, it might be possible that your use case works just fine if doing "--mktun --dev tun3", and not "tunG".

I can confirm that: OpenVPN works with 'tun3' created with --mktun, without the "sitnl_send: rtnl: generic error (-17): File exists" error. Wouldn't it be easier to catch the error, keep the existing device, and try to work with it as a non-DCO device?

Even if the device was created with "ip tuntap add tunnel1 type tun", I see the message "Interface tunnel1 exists and is non-DCO. Disabling data channel offload" in the logs. Maybe it would be beneficial to give --mktun an additional option that creates the device as a DCO device (e.g. --type dco).

Alex

alexvelkov1 avatar Oct 16 '23 15:10 alexvelkov1