IPv6 outbound NAT doesn't update translation target when interface address changes
Important notices
Before you add a new report, we ask you kindly to acknowledge the following:
- [x] I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md
- [x] I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/core/issues?q=is%3Aissue
Describe the bug
When an IPv6 outbound NAT rule exists on a SLAAC WAN interface and the interface address changes, the translation target isn't updated. Instead, the outbound NAT rule keeps using the deprecated address. As a result, all NATed connections fail. Even a "disable - apply - enable - apply" cycle of the outbound NAT rule does not fix this; the rule keeps using the deprecated address.
To Reproduce
Steps to reproduce the behavior:
1. Go to 'Interfaces: [WAN]', set the IPv6 Configuration Type to SLAAC, save & apply
2. Go to 'Firewall: NAT: Outbound', add a rule (interface WAN, TCP/IP version IPv6), save & apply
3. Test the NAT, e.g. by performing a ping test with the source address set to the LAN interface's address
4. Wait for the WAN address to change (upstream router advertises a new prefix, WAN interface autoconfigures a new address and marks the old address as deprecated)
5. Repeat the test from step 3, see error: test fails
6. Perform a packet capture to verify that the source address of NATed outbound packets is indeed the old, deprecated address
Expected behavior
IPv6 outbound NAT rules should update the translation target when the address is deprecated and the interface has a new, valid address.
Describe alternatives you considered
Trigger a link down / up event on the WAN interface.
Relevant log files
root@router:~ # ifconfig hn5
hn5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: WAN_LTE (opt8)
options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
ether 00:12:34:56:78:9a
inet6 fe80::212:34ff:fe56:789a%hn5 prefixlen 64 scopeid 0xa
inet6 2001:db8:1:a:212:34ff:fe56:789a prefixlen 64 deprecated autoconf
inet6 2001:db8:1:b:212:34ff:fe56:789a prefixlen 64 autoconf
media: Ethernet autoselect (10Gbase-T <full-duplex>)
status: active
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
(Packet capture shows all NATed outbound packets have the deprecated source address 2001:db8:1:a:212:34ff:fe56:789a.)
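For comparison with the packet capture, the address one would expect as the translation target can be picked out of the `ifconfig` output. The following is only a minimal sketch: the function name `preferred_autoconf_addrs` is made up for illustration, and it assumes the FreeBSD `ifconfig` line format shown above, where `deprecated` and `autoconf` appear as flags on the `inet6` line. Other flags such as `tentative` are not handled.

```shell
#!/bin/sh
# Print global autoconf IPv6 addresses that are NOT flagged as deprecated.
# Reads `ifconfig <interface>` output on stdin, so saved output works as well.
# (Hypothetical helper, not part of OPNsense.)
preferred_autoconf_addrs() {
    awk '$1 == "inet6" && /autoconf/ && !/deprecated/ && $2 !~ /^fe80:/ { print $2 }'
}
```

Usage on the router would be `ifconfig hn5 | preferred_autoconf_addrs`. With the output above, the only line that passes the filter is the one for 2001:db8:1:b:212:34ff:fe56:789a.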
Environment
OPNsense 24.1.6 (amd64) Hyper-V Gen2
The rule in question is relevant as generated in /tmp/rules.debug, but I'm suspecting this is a ":0" syntax oddity. SLAAC addresses are stateless so we have no real way to track... ideally the deprecated addresses should be flushed rather timely instead of lingering in cases of a new autoconf being available. This might also be due to the ordering of the addresses in the kernel BTW.
For some reason I doubt the "deprecation" process because if it's deprecated it should still be usable?! I thought that was the whole point of having it.
Cheers, Franco
root@router:~ # cat /tmp/rules.debug | grep "nat on hn5"
nat on hn5 inet6 from !(hn5) to any -> (hn5:0) port 1024:65535 # IPv6 NAT for LTE WAN
You're right, this is probably a kernel / pf issue. According to pf.conf(5), "the rule is automatically updated whenever the interface changes its address". I'd say deprecating an address while simultaneously adding a new non-deprecated one qualifies as an address change, but apparently pf doesn't think so. Not sure whether this is intentional or pf just isn't aware of the deprecation status.
There also seems to be an issue with my upstream router. When its LTE modem reconnects, the router sends RAs with both the old (now invalid) and the new prefix. The old prefix is advertised with a zero preferred lifetime (which deprecates it), but both prefixes keep getting advertised with a one hour valid lifetime. This indeed indicates that the old prefix is still usable, which is not the case. I will raise this issue with the vendor of the upstream router.
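To make the lifetime semantics concrete, here is a rough sketch of how the two lifetimes from a router advertisement map to an address state, following the preferred / deprecated / invalid distinction in RFC 4862. The function name `addr_state` and the example values other than "zero preferred, one hour valid" are assumptions for illustration only.

```shell
#!/bin/sh
# Classify a SLAAC address from its remaining lifetimes in seconds.
# $1 = preferred lifetime, $2 = valid lifetime (hypothetical helper).
addr_state() {
    preferred=$1
    valid=$2
    if [ "$valid" -le 0 ]; then
        echo invalid        # valid lifetime expired: address must not be used
    elif [ "$preferred" -le 0 ]; then
        echo deprecated     # still valid, but not to be chosen for new connections
    else
        echo preferred
    fi
}

addr_state 0 3600      # old prefix as advertised by the upstream router
addr_state 1800 3600   # a new prefix with made-up lifetimes
```

Under these rules the old prefix stays in the deprecated state for as long as the router keeps advertising it with a non-zero valid lifetime, which matches the `deprecated autoconf` flags in the `ifconfig` output.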
@maurice-w Thanks for confirming. I'll take a look but can't make any promises.
While I have your attention: https://forum.opnsense.org/index.php?topic=37813.msg197098#msg197098
Would you mind leaving your opinion? Removing the code would be easy, but it should be for the right reason.
Cheers, Franco
Thanks @fichtner. I performed additional testing. It not only affects deprecated addresses, but invalid / removed addresses, too:
When the upstream router stops advertising the old prefix, the old autoconf address eventually expires and gets removed. The interface then only has the new, valid address. But pf keeps using the old, non-existing address as the translation target.
We might be able to work around this, but since it seems to be a pf bug, I think this is where it should get fixed. Before I raise this issue with the pf folks, do you have any thoughts?
I'll respond to the other topic on the forum.
Cheers Maurice
> When the upstream router stops advertising the old prefix, the old autoconf address eventually expires and gets removed. The interface then only has the new, valid address. But pf keeps using the old, non-existing address as the translation target.
This could mean two things:
1. Does a filter reload fix it?
2. If 1.) is a no, this could also be a sticky state issue.
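The two checks could be run from a shell roughly as follows. This is a sketch under assumptions: it treats loading /tmp/rules.debug with `pfctl -f` as a stand-in for an OPNsense filter reload, and the helper names `run`, `reload_filter` and `list_states` as well as the `DRY_RUN` switch are invented here. By default the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints the commands; set DRY_RUN=0 as root on the
# firewall to actually execute them. (Hypothetical wrapper.)
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1.) Reload the generated ruleset, then repeat the ping test.
reload_filter() {
    run pfctl -f /tmp/rules.debug
}

# 2.) If the test still fails, dump the state table so it can be
#     searched for entries that still carry the old address.
list_states() {
    run pfctl -s state
}
```

On the router, something like `DRY_RUN=0 list_states | grep -F '2001:db8:1:a:212:34ff:fe56:789a'` would show whether any states still reference the old address.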
> We might be able to work around this, but since it seems to be a pf bug, I think this is where it should get fixed. Before I raise this issue with the pf folks, do you have any thoughts?
Don't tell them you found the bug on OPNsense. The pf maintainer is notorious for blocking bug reports and even some bugfixes from getting into FreeBSD. Yes, we reached that low point a while ago already.
Cheers, Franco