
os-wireguard 2.3 - CARP - interface remain active

Open FilipK-CZ opened this issue 2 years ago • 31 comments

Important notices Before you add a new report, we ask you kindly to acknowledge the following:

  • [x] I have read the contributing guidelines at https://github.com/opnsense/plugins/blob/master/CONTRIBUTING.md
  • [x] I have searched the existing issues, open and closed, and I'm convinced that mine is new.
  • [x] The title contains the plugin to which this issue belongs

Describe the bug After activating CARP on the WireGuard interface, the backup shows that the interface is deactivated, but it actually still works and responds to pings.

To Reproduce Steps to reproduce the behavior:

  1. Enable carp for wg interface
  2. Check VPN: WireGuard: Diagnostics -> interface is down
  3. Try ping to WG interface IP -> OK

Expected behavior

  1. Enable carp for wg interface
  2. Try ping to WG interface IP -> No route to host
  3. Try ping any peer -> No route to host

Screenshots

Relevant log files

Additional context In the case of dynamic routing, it is then not possible to access the backup from the master, because the backup still wants to use its interface

Environment

OPNsense 23.7.6-amd64 os-wireguard 2.3

FilipK-CZ avatar Oct 11 '23 22:10 FilipK-CZ

@AdSchellevis any updates? Do you need any further information?

FilipK-CZ avatar Nov 06 '23 15:11 FilipK-CZ

it's marked community support, if at some point it turns out to be a bug, we can re-label.

AdSchellevis avatar Nov 06 '23 15:11 AdSchellevis

It looks like a bug because the interface should be disabled, but it's not. This screenshot is from the backup host, and the WireGuard interface is still enabled. (screenshot: wireguard)

FilipK-CZ avatar Nov 14 '23 14:11 FilipK-CZ

WireGuard plugin version is 2.5 and the steps to reproduce are unclear. Please post the exact GUI options used and attach ifconfig output.

fichtner avatar Nov 14 '23 15:11 fichtner

@fichtner I updated to version 2.5 and it's still the same. I just created a WG instance and set CARP on it, but the WireGuard interface remains active on the backup, which breaks dynamic routing.

FilipK-CZ avatar Nov 28 '23 00:11 FilipK-CZ

You haven't provided any ifconfig output from master and backup here. It's very difficult to diagnose by guessing.

fichtner avatar Nov 28 '23 07:11 fichtner

Master:

wg1: flags=80c1<UP,RUNNING,NOARP,MULTICAST> metric 0 mtu 1420
	options=80000<LINKSTATE>
	inet 172.31.255.1 netmask 0xffffff00
	groups: wg wireguard
	nd6 options=109<PERFORMNUD,IFDISABLED,NO_DAD>
wg2: flags=80c1<UP,RUNNING,NOARP,MULTICAST> metric 0 mtu 1420
	options=80000<LINKSTATE>
	inet 172.16.6.1 netmask 0xfffffff0
	groups: wg wireguard
	nd6 options=109<PERFORMNUD,IFDISABLED,NO_DAD>

ping to peer on wg1:

root@gw1:~ # ping 172.31.255.20
PING 172.31.255.20 (172.31.255.20): 56 data bytes
64 bytes from 172.31.255.20: icmp_seq=0 ttl=64 time=3.258 ms
64 bytes from 172.31.255.20: icmp_seq=1 ttl=64 time=3.545 ms
64 bytes from 172.31.255.20: icmp_seq=2 ttl=64 time=3.751 ms
64 bytes from 172.31.255.20: icmp_seq=3 ttl=64 time=3.522 ms
64 bytes from 172.31.255.20: icmp_seq=4 ttl=64 time=3.817 ms

Backup: ifconfig

wg1: flags=8080<NOARP,MULTICAST> metric 0 mtu 1420
	options=80000<LINKSTATE>
	inet 172.31.255.1 netmask 0xffffff00
	groups: wg wireguard
	nd6 options=109<PERFORMNUD,IFDISABLED,NO_DAD>
wg2: flags=8080<NOARP,MULTICAST> metric 0 mtu 1420
	options=80000<LINKSTATE>
	inet 172.16.6.1 netmask 0xfffffff0
	groups: wg wireguard
	nd6 options=109<PERFORMNUD,IFDISABLED,NO_DAD>

ping to peer on wg1:

root@gw2:~ # ping 172.31.255.20
PING 172.31.255.20 (172.31.255.20): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host

Routes for WG peers remain (screenshot).

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

I'm not a WireGuard expert but it looks like it's working? wg1 and wg2 are down on the backup.

fichtner avatar Nov 28 '23 11:11 fichtner
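
For readers comparing the two ifconfig outputs above: the hex `flags=` value is a bitmask, and the difference between master (`80c1`) and backup (`8080`) is exactly the UP and RUNNING bits. A minimal Python sketch, assuming the standard FreeBSD `IFF_*` bit values from `net/if.h` and listing only the bits that appear in this thread; the `decode_flags` helper is hypothetical, not an OPNsense tool:

```python
# Subset of FreeBSD interface flag bits (net/if.h); only those seen in the thread.
IFF_BITS = {
    0x0001: "UP",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x8000: "MULTICAST",
}

def decode_flags(value: int) -> list[str]:
    """Return the names of the known bits set in an ifconfig flags value."""
    return [name for bit, name in IFF_BITS.items() if value & bit]

master = decode_flags(0x80C1)  # flags reported for wg1/wg2 on gw1
backup = decode_flags(0x8080)  # flags reported for wg1/wg2 on gw2
print("master:", master)
print("backup:", backup)
print("backup is administratively up:", "UP" in backup)
```

Note that the backup output still lists the `inet` address even though the UP bit is clear; the address stays configured on the interface.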

(screenshot: wireguard)

As you can see here, the WireGuard interface is still visible via the wg command (i.e. somehow active), and the routes for the WireGuard peers stay on the backup.

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

I fail to see the point to be honest :)

fichtner avatar Nov 28 '23 11:11 fichtner

CARP should disable the interface in WireGuard itself, not just the OS interface. Then the wg command will not show the interface, the routes will disappear, and dynamic routing will work fine.

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

It disabled the WireGuard instance fine. If you expect "wg" to show no info while the instance is down: we don't build or touch the "wg" command, so you need to find someone who works on "wg".

fichtner avatar Nov 28 '23 11:11 fichtner

You could also call this "hot standby" but what do I know ;)

fichtner avatar Nov 28 '23 11:11 fichtner

When I ping 172.31.255.1 (wg1 IP) from the device, both the master and backup respond from their wg1 interface. That is wrong; only the master should respond.

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

Ping from the backup? Yes? And if you ping from the master?

fichtner avatar Nov 28 '23 11:11 fichtner

When I disable the interface manually, it disappears from the wg command and from the routes. (three screenshots)

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

Yep, it's a hot standby and you haven't answered my question:

Ping from the backup? Yes? And if you ping from the master?

fichtner avatar Nov 28 '23 11:11 fichtner

Ping from the backup? Yes? And if you ping from the master?

I'm not sure what you mean. I get ping replies from both, even if I disable dynamic routing on the backup, which withdraws the route (to get rid of the master route).

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

Ok, last time. You said:

When I ping 172.31.255.1 (wg1 IP) from the device.

What is "the" "device"? The backup firewall itself?

fichtner avatar Nov 28 '23 11:11 fichtner

Yes backup and master itself.

FilipK-CZ avatar Nov 28 '23 11:11 FilipK-CZ

Now "the device" is two devices? Backup AND master?

fichtner avatar Nov 28 '23 11:11 fichtner

Yes, it was a poorly worded sentence:

  • gw1 (master): ping 172.31.255.1 -> got response from local wg interface
  • gw2 (backup): ping 172.31.255.1 -> got response from local wg interface

When I manually disabled the wg interface on gw2:

  • gw2 (backup): ping 172.31.255.1 -> response from gw1 via the route added by dynamic routing

FilipK-CZ avatar Nov 28 '23 12:11 FilipK-CZ
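
As context for the ping results: per the ifconfig output earlier in the thread, gw1 and gw2 both carry 172.31.255.1 on wg1, which would be consistent with each host answering for that address itself. A small Python sketch that converts the hex netmasks from that output into prefix notation and checks that the pinged peer sits in the wg1 subnet; the `to_network` helper is illustrative and not part of OPNsense:

```python
import ipaddress

def to_network(addr: str, hex_mask: str) -> ipaddress.IPv4Network:
    """Build the network for an address and an ifconfig-style hex netmask."""
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    # strict=False lets us pass a host address instead of the network address
    return ipaddress.IPv4Network(f"{addr}/{mask}", strict=False)

wg1 = to_network("172.31.255.1", "0xffffff00")
wg2 = to_network("172.16.6.1", "0xfffffff0")
print(wg1, wg2)
print(ipaddress.IPv4Address("172.31.255.20") in wg1)
```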

Ok, this appears to be relevant: https://www.linuxquestions.org/questions/linux-kernel-70/ping-is-successful-even-the-interface-is-down-on-linux-box-4175597480/

"The interface is down to the outside world, but the kernel is still aware of it by IP address or by device name, and it is still configured. The request comes from "inside" so it responds. You would not be able to ping it from the other side however."

fichtner avatar Nov 28 '23 12:11 fichtner

Yes, I know, but that's the problem. WireGuard doesn't have to be "hot standby" and there's no point in using it that way. The time it takes to go from off to active is negligible, and it's the more natural way.

FilipK-CZ avatar Nov 28 '23 12:11 FilipK-CZ

Ok, so what's the real-world downside here? I feel like we are tiptoeing around a use case indicated by:

In the case of dynamic routing, it is then not possible to access the backup from the master, because the backup still wants to use its interface

But that's part of the prerequisite in setup scope and omitted in the steps to reproduce. FRR running? How so? Where is the problem over there?

fichtner avatar Nov 28 '23 12:11 fichtner

FRR is running and working as it should. It doesn't use CARP; it doesn't make sense to use it because when BGP is active it finds the right path.

The whole problem is that you leave WireGuard activated and just disable the system interface, which creates these "weird" situations where you need to NAT from master to backup because the backup wants to respond from its local interface. Why can't you just disable WireGuard as it should be? The problem would immediately disappear.

FilipK-CZ avatar Nov 28 '23 12:11 FilipK-CZ

And FRR is not in the steps to reproduce because it's not about FRR. It could be a static route, or maybe someone has another scenario. The whole point of this problem is that WireGuard in fact remains active. Yes, "you can't access it", but it is still active and that doesn't make sense.

FilipK-CZ avatar Nov 28 '23 12:11 FilipK-CZ

@fichtner any update?

FilipK-CZ avatar Dec 26 '23 21:12 FilipK-CZ

not likely as this sounds like a (common) setup problem, if routing prevents using a path, source nat usually helps.

AdSchellevis avatar Dec 27 '23 07:12 AdSchellevis

At the risk of being flamed (I hope not) - I'll chime in and comment on how I use WG and FRR for several customers, with multi-WAN failover for some customers and just a single WAN for others.

Since WG changelog 2.4, WG has become great and works really, really well. I'm applying 2.6 tonight - that is OPNsense firmware 23.7.11 and the expected new benefit "consider missing CARP VHID as disabled" will also help in some situations too. I too have noticed that CARP disabled didn't stop WG and so it will be good to have that edge case closed too.

***** Highlights *****

FRR

  • I use BGP for routing, I just find it more flexible and robust than OSPF
  • "Enable CARP Failover" is NOT selected
  • I have BFD enabled

WireGuard

  • "Depend on (CARP)" is in use - I track the WAN interface (WAN1 or WAN2) as appropriate for multiWAN
  • If just a single WAN, then, I track the LAN CARP for WG and not the WAN at all.
  • "Disable routes" is selected

** My Comments ** Since the WG interfaces on the backup firewall remain down while it is the CARP backup, the backup firewall cannot find its BGP neighbor and thus the dynamic BGP routes do not get added to the backup firewall's routing table.

Since FRR is running, as soon as the primary firewall disappears, the backup firewall becomes the CARP master, BGP on the backup firewall can suddenly find its neighbors and voilà, routing starts.

I'm losing 1-2 pings during a transition from PRIMARY firewall to BACKUP firewall - the transition is amazingly fast - it really works!

I hope my comments help.

nzkiwi68 avatar Jan 10 '24 02:01 nzkiwi68