sfxge driver causes a kernel panic during network startup

Open tusc opened this issue 2 years ago • 0 comments

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

  • [x] I have read the contributing guidelines at https://github.com/opnsense/src/blob/master/CONTRIBUTING.md
  • [x] I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/src/issues?q=is%3Aissue

Describe the bug

A SolarFlare SFN7122F 10 Gb dual-port card does not work on OPNsense 23.1 when installed in an HP 290-p0043w SFF computer (Celeron G4900 CPU) with 4 GB RAM. The problem is reproducible every time I assign network interfaces from the CLI: after I confirm my settings, the kernel panics once the network starts. The same card works fine in the same computer under Linux (Sophos UTM).

The card also works under OPNsense in a different machine, an HP T620 Plus thin client with an AMD GX-420CA APU SoC and 8 GB RAM.

To Reproduce

Steps to reproduce the behavior:

  1. From the CLI, select option 1 (assign interfaces).
  2. Confirm the choices.
  3. Wait for the network to start; the kernel panics.
  4. See the panic output under Screenshots.

Expected behavior

No panic.

Describe alternatives you considered

The SolarFlare card works fine in another computer with double the RAM (8 GB vs. 4 GB). Other than memory, the only difference is the CPU (Celeron G4900 vs. AMD GX-420CA APU SoC).

Screenshots

<118>Mon Jan 30 21:41:22 UTC 2023
<118>
<118>*** OPNsense.localdomain: OPNsense 23.1 ***
<118>
<118> LAN (re0)       -> v4: 192.168.1.1/24
<118>
<118> HTTPS: SHA256 9A D0 FE FA 39 E9 29 BB FE DF 6D 62 B6 0D FB 3E
<118>               CF 03 DE DE 6C A3 14 39 6B 72 30 33 70 BD 95 7A
sfxge0: <Solarflare SFC9100 family> port 0x4100-0x41ff mem 0xa1800000-0xa1ffffff,0xa2084000-0xa2087fff irq 16 at device 0.0 on pci1
sfxge0: Using MSI-X interrupts
<6>sfxge0: Ethernet address: 00:0f:53:3c:b4:a0
sfxge0: Solarflare Flareon Ultra 7000 Series 10G Adapter
sfxge1: <Solarflare SFC9100 family> port 0x4000-0x40ff mem 0xa1000000-0xa17fffff,0xa2080000-0xa2083fff irq 17 at device 0.1 on pci1
sfxge1: Using MSI-X interrupts
<6>sfxge1: Ethernet address: 00:0f:53:3c:b4:a1
sfxge1: Solarflare Flareon Ultra 7000 Series 10G Adapter
<6>sfxge0: link state changed to DOWN
<6>sfxge1: link state changed to DOWN
<6>sfxge1: link state changed to UP
<6>sfxge1: tso4 disabled due to -txcsum
<6>sfxge1: tso6 disabled due to -txcsum6
<6>sfxge0: tso4 disabled due to -txcsum
<6>sfxge0: tso6 disabled due to -txcsum6
<3>arp: 6a:d7:9a:3f:00:ce is using my IP address 192.168.1.1 on sfxge1!

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 1; apic id = 02
instruction pointer	= 0x20:0xffffffff82a3630c
stack pointer	        = 0x28:0xfffffe00aad81720
frame pointer	        = 0x28:0xfffffe00aad81760
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 62539 (dhcp6c)
trap number		= 18
panic: integer divide fault
cpuid = 1
time = 1675114925
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00aad81540
vpanic() at vpanic+0x17f/frame 0xfffffe00aad81590
panic() at panic+0x43/frame 0xfffffe00aad815f0
trap_fatal() at trap_fatal+0x385/frame 0xfffffe00aad81650
calltrap() at calltrap+0x8/frame 0xfffffe00aad81650
--- trap 0x12, rip = 0xffffffff82a3630c, rsp = 0xfffffe00aad81720, rbp = 0xfffffe00aad81760 ---
sfxge_if_transmit() at sfxge_if_transmit+0x4c/frame 0xfffffe00aad81760
ether_output_frame() at ether_output_frame+0xab/frame 0xfffffe00aad81790
ether_output() at ether_output+0x681/frame 0xfffffe00aad81820
ip6_output() at ip6_output+0x1ce5/frame 0xfffffe00aad81a60
udp6_send() at udp6_send+0x851/frame 0xfffffe00aad81c30
sosend_dgram() at sosend_dgram+0x347/frame 0xfffffe00aad81c90
sosend() at sosend+0x50/frame 0xfffffe00aad81cc0
kern_sendit() at kern_sendit+0x1fd/frame 0xfffffe00aad81d60
sendit() at sendit+0x1d7/frame 0xfffffe00aad81db0
sys_sendto() at sys_sendto+0x4d/frame 0xfffffe00aad81e00
amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe00aad81f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00aad81f30
--- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x80039057a, rsp = 0x7fffffffd868, rbp = 0x7fffffffde70 ---
KDB: enter: panic
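
For triage, the key facts in the panic output above can be pulled out mechanically. The following is a minimal sketch, not an OPNsense or FreeBSD tool: the function name `summarize_panic` and the abridged, whitespace-normalized `PANIC_LOG` excerpt are hypothetical and only for illustration. It assumes the frame printed directly after the `--- trap ...` marker is the function that was executing when the trap fired.

```python
import re
from datetime import datetime, timezone

# Abridged excerpt of the panic output from this report (whitespace normalized).
PANIC_LOG = """\
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 1; apic id = 02
current process = 62539 (dhcp6c)
time = 1675114925
--- trap 0x12, rip = 0xffffffff82a3630c, rsp = 0xfffffe00aad81720, rbp = 0xfffffe00aad81760 ---
sfxge_if_transmit() at sfxge_if_transmit+0x4c/frame 0xfffffe00aad81760
ether_output_frame() at ether_output_frame+0xab/frame 0xfffffe00aad81790
"""


def summarize_panic(log: str) -> dict:
    """Extract trap number, process, time, and faulting frame from a panic log."""
    summary = {}

    m = re.search(r"^Fatal trap (\d+): (.+?) while in kernel mode", log, re.M)
    if m:
        summary["trap"] = int(m.group(1))
        summary["reason"] = m.group(2)

    m = re.search(r"^current process\s*=\s*(\d+) \((.+?)\)", log, re.M)
    if m:
        summary["pid"] = int(m.group(1))
        summary["process"] = m.group(2)

    # "time" is seconds since the Unix epoch; convert it to UTC.
    m = re.search(r"^time = (\d+)", log, re.M)
    if m:
        ts = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
        summary["time_utc"] = ts.isoformat()

    # The first frame after the "--- trap" marker is taken as the faulting one.
    m = re.search(r"^--- trap .*?---\n(\w+)\(\) at \1\+(0x[0-9a-f]+)", log, re.M)
    if m:
        summary["faulting_function"] = m.group(1)
        summary["offset"] = m.group(2)

    return summary


if __name__ == "__main__":
    for key, value in summarize_panic(PANIC_LOG).items():
        print(f"{key}: {value}")
```

Run against the excerpt, this reports trap 18 (integer divide fault) in `sfxge_if_transmit` at offset `0x4c`, triggered while `dhcp6c` was sending, at 2023-01-30T21:42:05+00:00. That is consistent with the console timestamp shortly before the panic.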

Relevant log files: msgbuf.txt, ddb.txt

Environment

Software version used and hardware type if relevant, e.g.:

OPNsense 23.1 (amd64, OpenSSL). SolarFlare SFN7122F 10 Gb dual-port card in an HP 290-p0043w SFF computer (Celeron G4900 CPU) with 4 GB RAM.

tusc, Jan 31 '23 16:01