HTTP/TCP packets originating from Opnsense towards LAN interfaces seem to disappear
Important notices
Before you add a new report, we ask you kindly to acknowledge the following:
- [x] I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md
- [x] I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/core/issues?q=is%3Aissue
Describe the bug
I posted the issue here but have yet to receive a response. Hence, this report. Please accept my apologies for potentially increasing the workload of opnsense's awesome team!
Upgraded from 23.x (latest release for business) to 24.4.1. Still on ISC DHCP with only IPv4; KEA not yet enabled. Intra- and inter-VLAN packet routing works like a charm. No configuration or rule changes have been made since the upgrade. However, outbound TCP (and sometimes UDP) packets in the FW-->(any)VLAN direction on standard (e.g. 80, 443) or non-standard (e.g. 8080) ports do not get processed. ICMP pings to the same LAN hosts are fine. All other intra-VLAN nodes, inter-VLAN nodes, and even external nodes (e.g. a web client on a cellular connection) can reach the address:ports in question; only packets originating from the FW fail. Examples of such packets are telegraf metrics, crowdsec-lapi requests, or even simple requests from ssh/console such as curl -vi --connect-timeout n <url>, which time out.
Reverting to previous environment restores normalcy.
Steps to demonstrate issue
- 3 hosts in the mix here to illustrate the issue. a) Opnsense FW b) homeassistant (192.168.0.58) c) portainer (192.168.100.6)
- Check reachability of hosts on VLAN 3 (opt7, main, IP range 192.168.0.0/23) and then VLAN 100 (opt6, IP range 192.168.100.0/24) from Opnsense FW.
ping 192.168.0.58
PING 192.168.0.58 (192.168.0.58): 56 data bytes
64 bytes from 192.168.0.58: icmp_seq=0 ttl=64 time=0.409 ms
...
ping 192.168.100.6
PING 192.168.100.6 (192.168.100.6): 56 data bytes
64 bytes from 192.168.100.6: icmp_seq=0 ttl=64 time=0.369 ms
..
- Check routing towards the same hosts from Opnsense FW.
traceroute 192.168.0.58
traceroute to 192.168.0.58 (192.168.0.58), 64 hops max, 40 byte packets
1 homeassistant (192.168.0.58) 0.464 ms 0.167 ms 0.175 ms
...
traceroute 192.168.100.6
traceroute to 192.168.100.6 (192.168.100.6), 64 hops max, 40 byte packets
1 portainer (192.168.100.6) 0.359 ms 0.133 ms 0.241 ms
- Check for port 443 exposure and attempt a simple curl request from one host to another, to test inter-VLAN reachability at application level. It works as intended.
@portainer:~$ nc -4nzvw 5 192.168.0.58 443
Connection to 192.168.0.58 443 port [tcp/*] succeeded!
@portainer:~$ nc -4nuzvw 5 192.168.0.58 443
(base) maumau@portainer:~$
<!-- Notice here that HTTP3 over UDP/QUIC is disabled on the web servers.
From the homeassistant host:
# netstat -tulnp | grep 443
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 10436/docker-proxy
tcp 0 0 :::443 :::* LISTEN 10444/docker-proxy
-->
@portainer:~$ curl -ki --connect-timeout 5 https://homeassistant.esco.ghaar
HTTP/2 200
server: nginx
date: Sat, 06 Jul 2024 18:48:24 GMT
content-type: text/html; charset=utf-8
content-length: 4148
referrer-policy: no-referrer
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000; includeSubDomains
... (truncated the full HTTP response)
- Now, repeat step 3 with the Opnsense firewall in place of one of the hosts
nc -4nzvw 5 192.168.0.58 443
nc: connect to 192.168.0.58 port 443 (tcp) failed: Operation timed out
nc -4nuzvw 5 192.168.0.58 443
Connection to 192.168.0.58 443 port [udp/*] succeeded!
...
<!--
Opnsense's interpretation of 443 being opened for UDP is wrong here. As in step 3, only TCP on 443 is active!
-->
...
# curl -kvi --connect-timeout 5 https://homeassistant.esco.ghaar
* Host homeassistant.esco.ghaar:443 was resolved.
* IPv6: (none)
* IPv4: 192.168.0.58
* Trying 192.168.0.58:443...
* ipv4 connect timeout after 4999ms, move on!
* Failed to connect to homeassistant.esco.ghaar port 443 after 5002 ms: Timeout was reached
* Closing connection
curl: (28) Failed to connect to homeassistant.esco.ghaar port 443 after 5002 ms: Timeout was reached
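The nc -z checks above amount to a bare TCP connect with a timeout. Below is a minimal Python sketch of the same probe, which might help when logging how often the connect succeeds over time. The function names tcp_probe and probe_many are hypothetical, and the sketch assumes a Python interpreter is available on the machine running the test.

```python
import socket
import time


def tcp_probe(host, port, timeout=5.0):
    """Return True if a TCP handshake to host:port completes within timeout.

    Roughly what `nc -4nzvw 5 host port` tests: connect, then close.
    """
    # AF_INET only, to mirror nc's -4 flag.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except OSError:
        # Covers timeouts, refusals and unreachable hosts alike.
        return False
    finally:
        sock.close()


def probe_many(host, port, attempts=3, interval=1.0, timeout=5.0):
    """Run tcp_probe several times and return one boolean per attempt."""
    results = []
    for attempt in range(attempts):
        results.append(tcp_probe(host, port, timeout))
        if attempt < attempts - 1:
            time.sleep(interval)
    return results
```

Calling probe_many("192.168.0.58", 443, attempts=10) from the firewall shell would return one boolean per attempt, which makes an intermittent success easy to spot in a longer run.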
Expected behavior
Opnsense should establish an HTTPS/TCP connection, or even a basic TCP connection, to hosts on various VLANs.
Describe alternatives you considered
Please refer to above. Reverting to Opnsense 23.x (last known business version) restores normalcy.
Additional context
A few additional considerations which I grappled with:
- Does this issue always happen? Mostly yes, but every so often (once in a few hours or so) a TCP request in FW-->Host direction slips through. Evidence:
#nc -4znvw 10 192.168.0.58 443
Connection to 192.168.0.58 443 port [tcp/*] succeeded!
<!-- immediately following which another series of requests fail -->
...
#nc -4znvw 10 192.168.0.58 443
nc: connect to 192.168.0.58 port 443 (tcp) failed: Operation timed out
# nc -4znvw 10 192.168.0.58 443
nc: connect to 192.168.0.58 port 443 (tcp) failed: Operation timed out
- What is going on at TCP/IPv4 level?
  a. Are the FW-originated HTTP(S)-over-TCP packets sent over the wire? Yes.
  b. If so, is the switch network eating them up (due to, say, a bad VLAN configuration)? No.
  c. Is the receiving host not responding at TCP level? **No**. The receiving host does issue TCP SYN-ACKs.
  d. Are the receiving host's packets blocked by FW rules? No.
  e. Are the receiving host's packets received at the FW interface? Yes.
A packet capture from opt6 (VLAN 100) illustrating this behavior can be seen below. The same behavior is also observed on the main/opt7 interface:
Servers
vlan0.100 2024-06-28
07:37:50.442037 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x8070 (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292126707 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:50.442400 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe967 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838080763 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:51.442697 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x7c87 (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292127708 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:51.443231 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe57e (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838081764 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:52.462713 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe182 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838082784 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:53.642675 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x73ef (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292129908 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:53.643161 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xdce6 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838083964 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:55.662758 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xd502 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838085984 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:57.842474 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x6387 (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292134108 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:57.842885 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xcc7e (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838088164 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:38:01.966765 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xbc62 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838092288 ecr 1292126707,nop,wscale 9], length 0
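The capture above shows the firewall retransmitting the same SYN (seq 445912424) while the host keeps answering with SYN-ACKs, and no final ACK from the firewall ever appears. A rough Python sketch for tallying this from tcpdump text output follows. The function names are hypothetical, and SAMPLE is an abridged copy of the first four segments above (TCP options trimmed); none of this is part of any OPNsense tooling.

```python
import re
from collections import Counter

# Matches the "src.port > dst.port: Flags [..]" part of a tcpdump line.
SEGMENT_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): "
    r"Flags \[(?P<flags>[^\]]+)\]"
)

# Abridged from the capture in this report.
SAMPLE = """\
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x8070 (correct), seq 445912424
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe967 (correct), seq 3873949677, ack 445912425
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x7c87 (correct), seq 445912424
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe57e (correct), seq 3873949677, ack 445912425
"""


def summarize_handshake(capture_text):
    """Count TCP segments per (source address, flags) pair."""
    counts = Counter()
    for match in SEGMENT_RE.finditer(capture_text):
        counts[(match["src"], match["flags"])] += 1
    return counts


def handshake_completed(counts, client):
    """True if the client ever sent a bare ACK ('.'), the third handshake step."""
    return counts[(client, ".")] > 0


counts = summarize_handshake(SAMPLE)
print(counts)
print("handshake completed:", handshake_completed(counts, "192.168.100.1"))
```

A tally with several `S` segments from the firewall, several `S.` segments from the host and no bare ACK is consistent with the SYN-ACKs reaching the interface but not being accepted by the firewall's TCP stack.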
- Is there an MTU mismatch issue here? Step /2/ shows FW with mtu:9000 (downsampled at ether layer) with host mtu:1500 (downsampled at ether layer). Changing host mtu to be same as FW results in same behavior. Please see attached tcpdump.zip which includes pcap and json.
- Could there be an intermittent issue w/ network drivers for the ice card involved? **Doesn't seem so.** Also, if that were the case, shouldn't it manifest across all VLANs? There are certain oddities in the ice messages, but these also seemed to be there in 23.x (if memory serves me right).
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.2-RELEASE-p11 stable/24.1-n255023-99a14409566 SMP amd64
FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
VT(vga): resolution 640x480
CPU: AMD EPYC 3251 8-Core Processor (2495.44-MHz K8-class CPU)
Origin="AuthenticAMD" Id=0x800f12 Family=0x17 Model=0x1 Stepping=2
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x1007<CLZERO,IRPerf,XSaveErPtr,IBPB>
SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
TSC: P-state invariant, performance statistics
real memory = 68717379584 (65534 MB)
avail memory = 66675605504 (63586 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <INSYDE WALLABY>
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 2 cache groups x 4 core(s) x 2 hardware threads
FreeBSD/SMP Online: 1 package(s) x 2 cache groups x 4 core(s)
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0: MADT APIC ID 128 != hw id 0
ioapic1: MADT APIC ID 129 != hw id 0
ioapic0 <Version 2.1> irqs 0-23
ioapic1 <Version 2.1> irqs 24-55
Launching APs: 7 5 1 3 2 6 4
random: entropy device external interface
wlan: mac acl policy registered
kbd0 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 14.0.
vtvga0: <VT VGA driver>
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
smbios0: <System Management BIOS> at iomem 0x7945e000-0x7945e01e
smbios0: Version: 3.0, BCD Revision: 3.0
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256>
acpi0: <INSYDE WALLABY>
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 350
Event timer "HPET2" frequency 14318180 Hz quality 350
atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
apei0: <ACPI Platform Error Interface> on acpi0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <base peripheral, IOMMU> at device 0.2 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> at device 1.3 on pci0
pci1: <ACPI PCI bus> on pcib1
nvme0: <Generic NVMe Device> mem 0x80900000-0x80903fff at device 0.0 on pci1
pcib2: <ACPI PCI-PCI bridge> at device 1.4 on pci0
pci2: <ACPI PCI bus> on pcib2
igb0: <Intel(R) I210 Flashless (Copper)> port 0x5000-0x501f mem 0x80800000-0x8081ffff,0x80820000-0x80823fff at device 0.0 on pci2
igb0: NVM V0.6 imgtype6
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 4 RX queues 4 TX queues
igb0: Using MSI-X interrupts with 5 vectors
igb0: Ethernet address: f4:90:ea:00:a2:06
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib3: <ACPI PCI-PCI bridge> at device 1.5 on pci0
pci3: <ACPI PCI bus> on pcib3
igb1: <Intel(R) I210 Flashless (Copper)> port 0x4000-0x401f mem 0x80700000-0x8071ffff,0x80720000-0x80723fff at device 0.0 on pci3
igb1: NVM V0.6 imgtype6
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 4 RX queues 4 TX queues
igb1: Using MSI-X interrupts with 5 vectors
igb1: Ethernet address: f4:90:ea:00:a2:07
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib4: <ACPI PCI-PCI bridge> at device 1.6 on pci0
pci4: <ACPI PCI bus> on pcib4
igb2: <Intel(R) I210 Flashless (Copper)> port 0x3000-0x301f mem 0x80600000-0x8061ffff,0x80620000-0x80623fff at device 0.0 on pci4
igb2: NVM V0.6 imgtype6
igb2: Using 1024 TX descriptors and 1024 RX descriptors
igb2: Using 4 RX queues 4 TX queues
igb2: Using MSI-X interrupts with 5 vectors
igb2: Ethernet address: f4:90:ea:00:a2:08
igb2: netmap queues/slots: TX 4/1024, RX 4/1024
pcib5: <ACPI PCI-PCI bridge> at device 1.7 on pci0
pci5: <ACPI PCI bus> on pcib5
igb3: <Intel(R) I210 Flashless (Copper)> port 0x2000-0x201f mem 0x80500000-0x8051ffff,0x80520000-0x80523fff at device 0.0 on pci5
igb3: NVM V0.6 imgtype6
igb3: Using 1024 TX descriptors and 1024 RX descriptors
igb3: Using 4 RX queues 4 TX queues
igb3: Using MSI-X interrupts with 5 vectors
igb3: Ethernet address: f4:90:ea:00:a2:09
igb3: netmap queues/slots: TX 4/1024, RX 4/1024
pcib6: <ACPI PCI-PCI bridge> at device 3.1 on pci0
pci6: <ACPI PCI bus> on pcib6
ice0: <Intel(R) Ethernet Network Adapter E810-XXV-2 - 1.37.11-k> mem 0x7fcfc000000-0x7fcfdffffff,0x7fcfe010000-0x7fcfe01ffff at device 0.0 on pci6
ice0: Loading the iflib ice driver
ice0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.30.0, track id 0xc0000001.
ice0: fw 6.2.9 api 1.7 nvm 3.20 etid 8000d853 netlist 3.20.5000-1.e.0.495c77bc oem 1.3146.0
ice0: Using 8 Tx and Rx queues
ice0: Reserving 8 MSI-X interrupts for iRDMA
ice0: Using MSI-X interrupts with 17 vectors
ice0: Using 1024 TX descriptors and 1024 RX descriptors
ice0: Ethernet address: f4:90:ea:00:9f:72
ice0: PCI Express Bus: Speed 8.0GT/s Width x8
ice0: Firmware LLDP agent disabled
ice0: link state changed to UP
ice0: Link is up, 25 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: FC-FEC/BASE-R, Autoneg: False, Flow Control: None
ice0: netmap queues/slots: TX 8/1024, RX 8/1024
ice1: <Intel(R) Ethernet Network Adapter E810-XXV-2 - 1.37.11-k> mem 0x7fcfa000000-0x7fcfbffffff,0x7fcfe000000-0x7fcfe00ffff at device 0.1 on pci6
ice1: Loading the iflib ice driver
ice1: DDP package already present on device: ICE OS Default Package version 1.3.30.0, track id 0xc0000001.
ice1: fw 6.2.9 api 1.7 nvm 3.20 etid 8000d853 netlist 3.20.5000-1.e.0.495c77bc oem 1.3146.0
ice1: Using 8 Tx and Rx queues
ice1: Reserving 8 MSI-X interrupts for iRDMA
ice1: Using MSI-X interrupts with 17 vectors
ice1: Using 1024 TX descriptors and 1024 RX descriptors
ice1: Ethernet address: f4:90:ea:00:9f:73
ice1: PCI Express Bus: Speed 8.0GT/s Width x8
ice1: Firmware LLDP agent disabled
ice1: link state changed to UP
ice1: Link is up, 25 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: FC-FEC/BASE-R, Autoneg: False, Flow Control: None
ice1: netmap queues/slots: TX 8/1024, RX 8/1024
pcib7: <ACPI PCI-PCI bridge> at device 7.1 on pci0
pci7: <ACPI PCI bus> on pcib7
pci7: <unknown> at device 0.0 (no driver attached)
pci7: <encrypt/decrypt> at device 0.2 (no driver attached)
xhci0: <XHCI (generic) USB 3.0 controller> mem 0x80200000-0x802fffff at device 0.3 on pci7
xhci0: 64 bytes context size, 64-bit DMA
usbus0: waiting for BIOS to give up control
ice1: Module is not present.
ice1: Possible Solution 1: Check that the module is inserted correctly.
ice1: Possible Solution 2: If the problem persists, use a cable/module that is found in the supported modules and cables list for this device.
ice1: link state changed to DOWN
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
pcib8: <ACPI PCI-PCI bridge> at device 8.1 on pci0
pci8: <ACPI PCI bus> on pcib8
pci8: <unknown> at device 0.0 (no driver attached)
pci8: <encrypt/decrypt> at device 0.1 (no driver attached)
hdac0: <AMD X370 HDA Controller> mem 0x80180000-0x80187fff at device 0.3 on pci8
ax0: <AMD 10 Gigabit Ethernet Driver> mem 0x80160000-0x8017ffff,0x80140000-0x8015ffff,0x80188000-0x80189fff at device 0.4 on pci8
ax0: Using 2048 TX descriptors and 2048 RX descriptors
ax0: Using 8 RX queues 8 TX queues
ax0: Using MSI-X interrupts with 12 vectors
ax0: Ethernet address: f4:90:ea:00:a2:0a
ax0: xgbe_config_sph_mode: SPH disabled in channel 0
ax0: xgbe_config_sph_mode: SPH disabled in channel 1
ax0: xgbe_config_sph_mode: SPH disabled in channel 2
ax0: xgbe_config_sph_mode: SPH disabled in channel 3
ax0: xgbe_config_sph_mode: SPH disabled in channel 4
ax0: xgbe_config_sph_mode: SPH disabled in channel 5
ax0: xgbe_config_sph_mode: SPH disabled in channel 6
ax0: xgbe_config_sph_mode: SPH disabled in channel 7
ax0: RSS Enabled
ax0: Receive checksum offload Enabled
ax0: VLAN filtering Enabled
ax0: VLAN Stripping Enabled
ax0: Checking GPIO expander validity
ax0: GPIO configuration valid
ax0: xgbe_phy_sfp_signals: port_sfp_inputs: 0x7
ax0: xgbe_phy_sfp_detect: mod absent
ax0: netmap queues/slots: TX 8/2048, RX 8/2048
ax1: <AMD 10 Gigabit Ethernet Driver> mem 0x80120000-0x8013ffff,0x80100000-0x8011ffff,0x8018a000-0x8018bfff at device 0.5 on pci8
ax1: Using 2048 TX descriptors and 2048 RX descriptors
ax1: Using 8 RX queues 8 TX queues
ax1: Using MSI-X interrupts with 12 vectors
ax1: Ethernet address: f4:90:ea:00:a2:0b
ax1: xgbe_config_sph_mode: SPH disabled in channel 0
ax1: xgbe_config_sph_mode: SPH disabled in channel 1
ax1: xgbe_config_sph_mode: SPH disabled in channel 2
ax1: xgbe_config_sph_mode: SPH disabled in channel 3
ax1: xgbe_config_sph_mode: SPH disabled in channel 4
ax1: xgbe_config_sph_mode: SPH disabled in channel 5
ax1: xgbe_config_sph_mode: SPH disabled in channel 6
ax1: xgbe_config_sph_mode: SPH disabled in channel 7
ax1: RSS Enabled
ax1: Receive checksum offload Enabled
ax1: VLAN filtering Enabled
ax1: VLAN Stripping Enabled
ax1: Checking GPIO expander validity
ax1: GPIO configuration valid
ax1: xgbe_phy_sfp_signals: port_sfp_inputs: 0x7
ax1: xgbe_phy_sfp_detect: mod absent
ax1: netmap queues/slots: TX 8/2048, RX 8/2048
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
uart2: <16x50 with 256 byte FIFO> iomem 0xfedc9000-0xfedc9fff,0xfedc7000-0xfedc7fff irq 3 on acpi0
uart2: console (115384,n,8,1)
hwpstate0: <Cool`n'Quiet 2.0> on cpu0
Timecounter "TSC-low" frequency 1247655967 Hz quality 1000
Timecounters tick every 1.000 msec
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
ugen0.1: <AMD XHCI root HUB> at usbus0
uhub0 on usbus0
uhub0: <AMD XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
nvd0: <TS1TMTE662T2> NVMe namespace
nvd0: 976762MB (2000409264 512 byte sectors)
Trying to mount root from zfs:zroot/ROOT/default []...
uhub0: 8 ports with 8 removable, self powered
ice1: link state changed to UP
ice1: Link is up, 25 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: FC-FEC/BASE-R, Autoneg: False, Flow Control: None
igb1: link state changed to UP
ice0: Module is not present.
ice0: Possible Solution 1: Check that the module is inserted correctly.
ice0: Possible Solution 2: If the problem persists, use a cable/module that is found in the supported modules and cables list for this device.
ice0: Link is up, 25 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: FC-FEC/BASE-R, Autoneg: False, Flow Control: None
intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0
smbus0: <System Management Bus> on intsmb0
driver bug: Unable to set devclass (class: ppc devname: (unknown))
ig4iic0: <Designware I2C Controller> iomem 0xfedc2000-0xfedc2fff irq 10 on acpi0
iicbus0: <Philips I2C bus (ACPI-hinted)> on ig4iic0
ig4iic1: <Designware I2C Controller> iomem 0xfedc3000-0xfedc3fff irq 11 on acpi0
iicbus1: <Philips I2C bus (ACPI-hinted)> on ig4iic1
ig4iic2: <Designware I2C Controller> iomem 0xfedc4000-0xfedc4fff irq 12 on acpi0
iicbus2: <Philips I2C bus (ACPI-hinted)> on ig4iic2
ig4iic3: <Designware I2C Controller> iomem 0xfedc5000-0xfedc5fff irq 6 on acpi0
iicbus3: <Philips I2C bus (ACPI-hinted)> on ig4iic3
ig4iic4: <Designware I2C Controller> iomem 0xfedc6000-0xfedc6fff irq 14 on acpi0
iicbus4: <Philips I2C bus (ACPI-hinted)> on ig4iic4
ig4iic5: <Designware I2C Controller> iomem 0xfedcb000-0xfedcbfff irq 15 on acpi0
iicbus5: <Philips I2C bus (ACPI-hinted)> on ig4iic5
lo0: link state changed to UP
amdsmn0: <AMD Family 17h System Management Network> on hostb0
amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb0
pflog0: permanently promiscuous mode enabled
lagg0: link state changed to UP
vlan0: changing name to 'vlan0.1'
vlan1: changing name to 'vlan0.100'
ice0: Failed to add VLAN filters:
ice0: - vlan 100, status -14
ice0: Failure adding VLAN 100 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 100, status -14
ice1: Failure adding VLAN 100 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan2: changing name to 'vlan0.120'
ice0: Failed to add VLAN filters:
ice0: - vlan 120, status -14
ice0: Failure adding VLAN 120 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 120, status -14
ice1: Failure adding VLAN 120 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan3: changing name to 'vlan0.121'
ice0: Failed to add VLAN filters:
ice0: - vlan 121, status -14
ice0: Failure adding VLAN 121 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 121, status -14
ice1: Failure adding VLAN 121 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan4: changing name to 'vlan0.140'
ice0: Failed to add VLAN filters:
ice0: - vlan 140, status -14
ice0: Failure adding VLAN 140 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 140, status -14
ice1: Failure adding VLAN 140 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan5: changing name to 'vlan0.2'
ice0: Failed to add VLAN filters:
ice0: - vlan 2, status -14
ice0: Failure adding VLAN 2 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 2, status -14
ice1: Failure adding VLAN 2 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan6: changing name to 'vlan0.250'
ice0: Failed to add VLAN filters:
ice0: - vlan 250, status -14
ice0: Failure adding VLAN 250 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 250, status -14
ice1: Failure adding VLAN 250 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan7: changing name to 'vlan0.3'
ice0: Failed to add VLAN filters:
ice0: - vlan 3, status -14
ice0: Failure adding VLAN 3 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 3, status -14
ice1: Failure adding VLAN 3 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
igb1: link state changed to DOWN
igb1: link state changed to UP
wg0: changing name to 'wg1'
wg1: link state changed to UP
tun1: changing name to 'ovpns1'
tun2: changing name to 'ovpns2'
tun3: changing name to 'ovpns3'
ovpns3: link state changed to UP
WARNING: attempt to domain_add(netgraph) after domainfinalize()
ovpns3: link state changed to DOWN
Trying to mount root from zfs:zroot/ROOT/default []...
lagg0: link state changed to DOWN
vlan0.1: link state changed to DOWN
vlan0.2: link state changed to DOWN
vlan0.100: link state changed to DOWN
vlan0.3: link state changed to DOWN
vlan0.140: link state changed to DOWN
vlan0.250: link state changed to DOWN
vlan0.121: link state changed to DOWN
vlan0.120: link state changed to DOWN
vlan0: changing name to 'vlan0.1'
ice0: Failed to add VLAN filters:
ice0: - vlan 1, status -14
ice0: Failure adding VLAN 1 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 1, status -14
ice1: Failure adding VLAN 1 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan1: changing name to 'vlan0.100'
ice0: Failed to add VLAN filters:
ice0: - vlan 100, status -14
ice0: Failure adding VLAN 100 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 100, status -14
ice1: Failure adding VLAN 100 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan2: changing name to 'vlan0.120'
ice0: Failed to add VLAN filters:
ice0: - vlan 120, status -14
ice0: Failure adding VLAN 120 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 120, status -14
ice1: Failure adding VLAN 120 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan3: changing name to 'vlan0.121'
ice0: Failed to add VLAN filters:
ice0: - vlan 121, status -14
ice0: Failure adding VLAN 121 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 121, status -14
ice1: Failure adding VLAN 121 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan4: changing name to 'vlan0.140'
ice0: Failed to add VLAN filters:
ice0: - vlan 140, status -14
ice0: Failure adding VLAN 140 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 140, status -14
ice1: Failure adding VLAN 140 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan5: changing name to 'vlan0.2'
ice0: Failed to add VLAN filters:
ice0: - vlan 2, status -14
ice0: Failure adding VLAN 2 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 2, status -14
ice1: Failure adding VLAN 2 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan6: changing name to 'vlan0.250'
ice0: Failed to add VLAN filters:
ice0: - vlan 250, status -14
ice0: Failure adding VLAN 250 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 250, status -14
ice1: Failure adding VLAN 250 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
vlan7: changing name to 'vlan0.3'
ice0: Failed to add VLAN filters:
ice0: - vlan 3, status -14
ice0: Failure adding VLAN 3 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ice1: Failed to add VLAN filters:
ice1: - vlan 3, status -14
ice1: Failure adding VLAN 3 to main VSI, err ICE_ERR_ALREADY_EXISTS aq_err OK
ovpns3: link state changed to UP
lagg0: link state changed to UP
vlan0.1: link state changed to UP
vlan0.2: link state changed to UP
vlan0.100: link state changed to UP
vlan0.3: link state changed to UP
vlan0.140: link state changed to UP
vlan0.250: link state changed to UP
vlan0.121: link state changed to UP
vlan0.120: link state changed to UP
igb1: link state changed to DOWN
igb1: link state changed to UP
igb1: link state changed to DOWN
igb1: link state changed to UP
ice0: promiscuous mode enabled
ice1: promiscuous mode enabled
lagg0: promiscuous mode enabled
vlan0.100: promiscuous mode enabled
ice0: promiscuous mode disabled
ice1: promiscuous mode disabled
lagg0: promiscuous mode disabled
vlan0.100: promiscuous mode disabled
Environment
Version 24.4.1
Architecture amd64
Commit 77b950d6f
Mirror https://opnsense-update.deciso.com/${SUBSCRIPTION}/FreeBSD:13:amd64/24.4
CPU AMD EPYC 3251 8-Core Processor (8 cores, 8 threads)
HW DEC4040
If 23.10.x works as expected but 24.1 doesn't, the first question is what the difference is in the generated ruleset (/tmp/rules.debug). Would it be possible to collect both on the exact same configuration? Common issues in these cases relate to reply-to and route-to rules.
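Once both copies of /tmp/rules.debug are available, the comparison can be narrowed to the policy-routing lines. A small Python sketch using the standard difflib module is shown below; the function name and the file labels in the diff header are hypothetical, and the inputs are simply the two files read as text.

```python
import difflib

# Keywords named in this thread as the usual suspects.
POLICY_KEYWORDS = ("route-to", "reply-to")


def policy_routing_diff(old_rules, new_rules):
    """Return added/removed ruleset lines that mention route-to or reply-to."""
    diff = difflib.unified_diff(
        old_rules.splitlines(),
        new_rules.splitlines(),
        fromfile="rules.debug (23.10)",
        tofile="rules.debug (24.4)",
        lineterm="",
        n=0,  # no context lines, only changes
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))  # skip the file header lines
        and any(keyword in line for keyword in POLICY_KEYWORDS)
    ]
```

Passing in the contents of the two saved files (for example via open(path).read()) returns only the changed lines that involve route-to or reply-to; an empty list would suggest looking elsewhere.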
@AdSchellevis Many thanks for the prompt reply!
firmware-->changelog history:
Version Date
24.4.1 (installed) 2024-06-20
24.4 2024-04-30
23.10.3 2024-03-28
Now I notice that the 23.x boot environment is no longer visible. I only see the base snapshot manually created by me back in 2022!
bectl list -a
BE/Dataset/Snapshot Active Mountpoint Space Created
default
zroot/ROOT/default NR / 8.44G 2022-10-25 01:28
Perhaps the reboot after 24.x cleared previous boot environments?
What would the sequence of steps to furnish you with the appropriate /tmp/rules.debug now be?
0. perform bectl create 24.4.1-buggy
1. perform fresh install of 23.10 via console (nano image) or opnsense-revert -kr 23.10
2. import current config
3. save /tmp/rules.debug
4. hopefully, if it's a working system at /1/, then follow normal upgrade system prompts via web GUI
5. save /tmp/rules.debug
6. compare and share here
A reinstall with 23.10 and update to the latest version in the 23.10 branch would be best indeed, then test if the issue is indeed not there and collect evidence.
If you want to exclude pf as the most likely cause of your issue before reinstalling, I sometimes temporarily disable pf using pfctl -d, then check local connectivity and enable it again (pfctl -e). When local communication works without pf enabled, it's either a policy-based routing rule (route-to/reply-to) or a NAT issue.
Thanks, @AdSchellevis. I'll be remote (to the FW in question) for the next few weeks, so a full-blown downgrade w/ console access might be a tall order until then.
The following didn't make a dent in the issue:
- Disable pf using pfctl -d
- curl failed
root@MorikCage:~ # curl -vi --connect-timeout 5 http://homeassistant.esco.ghaar
* Host homeassistant.esco.ghaar:80 was resolved.
* IPv6: (none)
* IPv4: 192.168.0.58
* Trying 192.168.0.58:80...
* ipv4 connect timeout after 4999ms, move on!
* Failed to connect to homeassistant.esco.ghaar port 80 after 5002 ms: Timeout was reached
* Closing connection
curl: (28) Failed to connect to homeassistant.esco.ghaar port 80 after 5002 ms: Timeout was reached
- Tried /2/ again after a few minutes. Same result. Although i suspect state tables were still active?
- Enabled pf via
pfctl -e
I could also attempt a step 2.5 with pfctl -F all to see whether that helps. But, I doubt it as my rules haven't changed between the two releases.
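The connect test used in these steps (a TCP connect with a timeout, as curl --connect-timeout does) can also be repeated programmatically to see how often it succeeds. The following is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it only opens and closes a TCP connection without sending any HTTP request.

```python
import socket
import time


def tcp_probe(host, port, timeout=5.0):
    """Try one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, None
    except OSError as exc:  # covers timeouts and refused connections
        return False, time.monotonic() - start, str(exc)


def probe_repeatedly(host, port, attempts=5, timeout=5.0, pause=1.0):
    """Run several probes in a row and collect their results."""
    results = []
    for attempt in range(attempts):
        results.append(tcp_probe(host, port, timeout))
        if attempt < attempts - 1:
            time.sleep(pause)
    return results
```

Running `probe_repeatedly("192.168.0.58", 80)` once with pf disabled and once with pf enabled would give a like-for-like comparison of the two states.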
Would there be any other files besides:
- /tmp/rules.debug
- dmesg >> /tmp/dmsg.out
from 23.10 which you'd recommend for collection to aid in the diagnosis? I ask so that when I do the downgrade I can collect the right information.
well, if it doesn't work when pf is disabled in full (state table doesn't matter in that case), the question is if there is a valid default gateway installed and the interface used has a valid netmask.
netstat -nr4
ifconfig
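FreeBSD's ifconfig prints IPv4 netmasks in hex (for example 0xfffffe00), which makes the "valid netmask" check easy to get wrong by eye. As a rough aid, the sketch below converts an address and hex netmask into a network and tests whether a target host is on-link. The helper names are hypothetical, and this only checks the arithmetic; it says nothing about what the kernel actually does with the packet.

```python
import ipaddress


def interface_network(address, hex_netmask):
    """Build the IPv4 network from an address and an ifconfig-style hex mask."""
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    return ipaddress.IPv4Interface(f"{address}/{mask}").network


def is_on_link(address, hex_netmask, target):
    """True if target falls inside the interface's directly connected network."""
    return ipaddress.IPv4Address(target) in interface_network(address, hex_netmask)
```

With the values reported in this thread, `interface_network("192.168.0.1", "0xfffffe00")` yields 192.168.0.0/23, and `is_on_link("192.168.0.1", "0xfffffe00", "192.168.0.58")` is true, so the target should be reached directly rather than via the default gateway.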
- LAN (from any host on any VLAN) <--> LAN (from any host on any VLAN), rules permitting, works as expected
- LAN (from any host on any VLAN) --> WAN (e.g. google, bing, my own external websites) work wonderfully as well.
- The reverse direction also works for HTTP (80-->443), HTTPS (443) and the Wireguard port. No other ports are exposed externally. Meaning WAN --> LAN (via NAT redirect) works as expected, which is why this problem is puzzling.
# netstat -nr4
Routing tables
Internet:
Destination Gateway Flags Netif Expire
default 76.214.40.1 UGS igb1
10.10.0.0/24 link#25 U ovpns3
10.10.0.1 link#25 UHS lo0
10.20.30.0/24 link#22 U wg1
10.20.30.1 link#22 UHS lo0
10.20.30.2 link#22 UHS wg1
10.20.30.3 link#22 UHS wg1
10.20.30.4 link#22 UHS wg1
76.214.40.0/22 link#2 U igb1
76.214.40.228 link#2 UHS lo0
127.0.0.1 link#10 UH lo0
192.168.0.0/23 link#21 U vlan0.3
192.168.0.1 link#21 UHS lo0
192.168.2.0/27 link#19 U vlan0.2
192.168.2.1 link#19 UHS lo0
192.168.98.0/24 link#13 U lagg0
192.168.98.1 link#13 UHS lo0
192.168.99.0/24 link#1 U igb0
192.168.99.1 link#1 UHS lo0
192.168.100.0/24 link#15 U vlan0.10
192.168.100.1 link#15 UHS lo0
192.168.120.0/24 link#16 U vlan0.12
192.168.120.1 link#16 UHS lo0
192.168.121.0/24 link#17 U vlan0.12
192.168.121.1 link#17 UHS lo0
192.168.140.0/24 link#18 U vlan0.14
192.168.140.1 link#18 UHS lo0
192.168.250.0/24 link#20 U vlan0.25
192.168.250.1 link#20 UHS lo0
# ping www.google.com
PING www.google.com (142.250.68.68): 56 data bytes
64 bytes from 142.250.68.68: icmp_seq=0 ttl=116 time=5.891 ms
64 bytes from 142.250.68.68: icmp_seq=1 ttl=116 time=7.310 ms
^C
--- www.google.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 5.891/6.601/7.310/0.709 ms
#ifconfig
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: LAN (lan)
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
inet 192.168.99.1 netmask 0xffffff00 broadcast 192.168.99.255
groups: FG_ALL_VLANs FG_CRITICAL_LAN
media: Ethernet autoselect
status: no carrier
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: Morik_WAN (wan)
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
inet 76.214.40.228 netmask 0xfffffc00 broadcast 76.214.43.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb2: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
media: Ethernet autoselect
status: no carrier
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb3: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
media: Ethernet autoselect
status: no carrier
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ice0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
media: Ethernet autoselect (25G-AUI <full-duplex>)
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ice1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
media: Ethernet autoselect (25G-AUI <full-duplex>)
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ax0: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
media: Ethernet autoselect
status: no carrier
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ax1: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
media: Ethernet autoselect
status: no carrier
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
groups: enc
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa
inet 127.0.0.1 netmask 0xff000000
groups: lo
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=20100<PROMISC,PPROMISC> metric 0 mtu 33160
groups: pflog
pfsync0: flags=0<> metric 0 mtu 1500
syncpeer: 0.0.0.0 maxupd: 128 defer: off
syncok: 1
groups: pfsync
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: main_LAGG (opt1)
options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
inet 192.168.98.1 netmask 0xffffff00 broadcast 192.168.98.255
laggproto lacp lagghash l2,l3,l4
laggport: ice0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: ice1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
groups: lagg FG_ALL_VLANs FG_CRITICAL_LAN
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
options=4000000<NOMAP>
groups: vlan
vlan: 1 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.100: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: Servers (opt6)
options=4000000<NOMAP>
inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
groups: vlan FG_ALL_VLANs FG_CRITICAL_LAN
vlan: 100 vlanproto: 802.1q vlanpcp: 7 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.120: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: Storage (opt3)
options=4000000<NOMAP>
ether f4:90:ea:00:9f:72
inet 192.168.120.1 netmask 0xffffff00 broadcast 192.168.120.255
groups: vlan FG_ALL_VLANs FG_CRITICAL_LAN
vlan: 120 vlanproto: 802.1q vlanpcp: 2 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.121: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: Storage_Backup (opt9)
options=4000000<NOMAP>
inet 192.168.121.1 netmask 0xffffff00 broadcast 192.168.121.255
groups: vlan FG_ALL_VLANs FG_CRITICAL_LAN
vlan: 121 vlanproto: 802.1q vlanpcp: 2 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.140: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: Supervisor (opt4)
options=4000000<NOMAP>
inet 192.168.140.1 netmask 0xffffff00 broadcast 192.168.140.255
groups: vlan FG_ALL_VLANs FG_CRITICAL_LAN
vlan: 140 vlanproto: 802.1q vlanpcp: 2 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: vCamsTraffic (opt2)
options=4000000<NOMAP>
inet 192.168.2.1 netmask 0xffffffe0 broadcast 192.168.2.31
groups: vlan FG_ALL_VLANs
vlan: 2 vlanproto: 802.1q vlanpcp: 1 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.250: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
description: IoT (opt5)
options=4000000<NOMAP>
inet 192.168.250.1 netmask 0xffffff00 broadcast 192.168.250.255
groups: vlan FG_ALL_VLANs
vlan: 250 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan0.3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: main (opt7)
options=4000000<NOMAP>
inet 192.168.0.1 netmask 0xfffffe00 broadcast 192.168.1.255
groups: vlan FG_ALL_VLANs FG_CRITICAL_LAN
vlan: 3 vlanproto: 802.1q vlanpcp: 2 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
wg1: flags=80c1<UP,RUNNING,NOARP,MULTICAST> metric 0 mtu 1420
description: i_wireguard (opt10)
options=80000<LINKSTATE>
inet 10.20.30.1 netmask 0xffffff00
groups: wg wireguard
nd6 options=9<PERFORMNUD,IFDISABLED>
ovpns1: flags=8010<POINTOPOINT,MULTICAST> metric 0 mtu 1500
options=80000<LINKSTATE>
groups: tun openvpn
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ovpns2: flags=8010<POINTOPOINT,MULTICAST> metric 0 mtu 1500
options=80000<LINKSTATE>
groups: tun openvpn
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ovpns3: flags=8043<UP,BROADCAST,RUNNING,MULTICAST> metric 0 mtu 1500
options=80000<LINKSTATE>
inet 10.10.0.1 netmask 0xffffff00 broadcast 10.10.0.255
groups: tun openvpn
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Opened by PID 65678
Pictorial view also attached for easier readability.
Just to be sure: do internet hosts also fail, or only access to 192.168.0.58?
You may remove sensitive data from the output by the way, for debugging we don't need mac addresses and exact netblocks.
@AdSchellevis, thank you. I'll remove MAC addresses and such shortly.
Yes, access to internet (NAT) isn't a problem. The issue is only for traffic which originates from OpnSense e.g. syslog to a syslog server on lan, or Telegraf metrics destined for influxdb on lan, CrowdSec http notification to a LAPI on lan, simple curl requests to anywhere on any lan etc.
I wouldn't suspect the firewall to be honest, my next step would be to capture the traffic on both ends (packet capture on the firewall and on the target for the traffic between both hosts).
These types of issues usually relate to wrong gateways on the client, in which case traffic does not return to the expected host.
Thank you for your continued guidance @AdSchellevis. Initially, I too suspected that I must have caused this issue myself (through network configuration changes). But a few aspects don't support that conclusion:
- Literally the only change was Opnsense 23.10-->24.4. Opnsense-originated traffic towards the VLANs was fine up to 23.x.
- Looking at the tcpdump captures at the Opnsense interface (opt6) from the initial message, one can see that opt6 (same goes for the other opt interfaces) does receive a SYN-ACK in response to its SYN. But the application logic (i.e. something above the TCP/IPv4 stack in opnsense) does not receive delivery of the SYN-ACKs. This causes the TCP/IPv4 stack on opnsense to repeat the SYN. Re-pasting for convenience:
Servers
vlan0.100 2024-06-28
07:37:50.442037 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x8070 (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292126707 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:50.442400 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe967 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838080763 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:51.442697 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x7c87 (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292127708 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:51.443231 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe57e (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838081764 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:52.462713 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xe182 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838082784 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:53.642675 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x73ef (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292129908 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:53.643161 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xdce6 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838083964 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:55.662758 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xd502 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838085984 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:37:57.842474 f4:90:ea:00:9f:72 00:50:56:82:d8:b4 ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum 0x6387 (correct), seq 445912424, win 65535, options [mss 8960,nop,wscale 12,sackOK,TS val 1292134108 ecr 0], length 0
Servers
vlan0.100 2024-06-28
07:37:57.842885 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xcc7e (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838088164 ecr 1292126707,nop,wscale 9], length 0
Servers
vlan0.100 2024-06-28
07:38:01.966765 00:50:56:82:d8:b4 f4:90:ea:00:9f:72 ethertype IPv4 (0x0800), length 74: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
192.168.100.21.8080 > 192.168.100.1.31315: Flags [S.], cksum 0xbc62 (correct), seq 3873949677, ack 445912425, win 43440, options [mss 1460,sackOK,TS val 3838092288 ecr 1292126707,nop,wscale 9], length 0
- Both curl and nc exhibit really odd behavior. The latter recognizes tcp/443 as udp/443.
- Intermittent allowance of connection establishment for FW-originated traffic from FW-->VLAN host(s) tells me it's something to do w/ pf and/or rules.
#nc -4znvw 10 192.168.0.58 443
Connection to 192.168.0.58 443 port [tcp/*] succeeded!
<!-- immediately following which another series of requests fail -->
...
#nc -4znvw 10 192.168.0.58 443
nc: connect to 192.168.0.58 port 443 (tcp) failed: Operation timed out
# nc -4znvw 10 192.168.0.58 443
nc: connect to 192.168.0.58 port 443 (tcp) failed: Operation timed out
- If gateways were incorrectly configured on client(s) (pretty much all of the 150+ hosts in the network) then reachability issues would've manifested one way or another. Packet drop stats on those hosts are 0 when other LAN hosts / clients are involved.
- To match /2/, attached is a tcpdump from the peer (192.168.100.21:8080). I picked a different host (than .100.6) in VLAN-100 on a non-standard port to illustrate the point further. As can be seen, the host keeps getting re-transmissions of SYN from the FW (caused by curl -vi --connect-timeout 5 http://192.168.100.21:8080). At the FW interface, the NIC receives the packet, but the FW doesn't respond with an ACK. from_host.zip
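The handshake pattern described in the points above (SYN sent, SYN-ACK received, no ACK ever following) can be tallied from a text capture instead of being read line by line. This is a hedged sketch: the regular expression assumes tcpdump's usual "src.port > dst.port: Flags [..]" text layout as seen in the pasted capture, and the function names are invented for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   192.168.100.1.31315 > 192.168.100.21.8080: Flags [S], cksum ...
FLOW_RE = re.compile(
    r"(\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+): Flags \[([^\]]+)\]"
)


def summarize_handshake(capture_text):
    """Count packets per (source, destination, TCP flags) tuple."""
    counts = Counter()
    for match in FLOW_RE.finditer(capture_text):
        src, sport, dst, dport, flags = match.groups()
        counts[(f"{src}:{sport}", f"{dst}:{dport}", flags)] += 1
    return counts


def handshake_stalled(counts):
    """True if SYN and SYN-ACK were seen but a bare ACK ('.') never was."""
    flags_seen = {flags for (_src, _dst, flags) in counts}
    return "S" in flags_seen and "S." in flags_seen and "." not in flags_seen
```

Feeding the same function the capture taken on the firewall and the one taken on the peer gives two tallies that can be compared directly; a stalled result on both sides would match the retransmission behavior described here.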
I'm a bit out of clues I'm afraid, I suspect there is a logical explanation, but tracking this exceeds my current community support time.
This issue has been automatically timed-out (after 180 days of inactivity).
For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.
If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.