
kernel-alt works, main kernel doesn't -- jumbo frames on Broadcom NetXtreme II BCM5709

Open axctal opened this issue 4 years ago • 12 comments

Version : XCP-ng 8.2.0

Hardware (one of the ports):

01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
        Subsystem: Dell PowerEdge R710 BCM5709 Gigabit Ethernet
        Kernel driver in use: bnx2
        Kernel modules: bnx2

Three physical interfaces are bonded as LACP with MTU = 9000. Various interfaces ("networks") are created on top of the bond, some with MTU = 1500 and some with 9000.

With the main kernel:
-- All networks can only send out packets not exceeding MTU=1500
-- ping test with 'no defrag' and jumbo size: FAIL
-- All relevant objects intended for jumbo frames (VIFs, bond, and even the physical interfaces eth#) report the correct MTU=9000 when queried with the 'xe ...' commands
-- However, 'ifconfig ethX' reports MTU=1500
-- Attempt to set MTU on a physical interface errors out: SIOCSIFMTU: Invalid argument

-- system was fully updated ( 'yum upgrade' ) and rebooted [ 2021 / 04 / 01 ] before re-testing --> problem remained
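
For anyone wanting to repeat the 'no defrag' jumbo ping test, here is a minimal sketch. It assumes Linux iputils `ping` (where `-M do` prohibits fragmentation) and IPv4; the target address 192.0.2.10 is a placeholder, not a host from this report. The payload size is the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.

```shell
#!/bin/sh
# ICMP payload that exactly fills one IPv4 packet of the given MTU:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Print (do not run) a ping command with fragmentation prohibited (-M do).
# $1 = MTU, $2 = target host (placeholder address used below).
df_ping_cmd() {
    echo "ping -M do -c 3 -s $(icmp_payload "$1") $2"
}

# Commands to try for a jumbo path and a standard path.
df_ping_cmd 9000 192.0.2.10
df_ping_cmd 1500 192.0.2.10
```

If the printed 9000-byte command fails while the 1500-byte one succeeds, some hop on the path (here, presumably the physical interface) is not passing jumbo frames.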

Alt kernel :

-- did 'yum install kernel-alt' and rebooted into it
-- upon booting (with alt kernel), 'ifconfig ethX' now reports MTU=9000 (no extra action was needed)
-- ping test with 'no defrag' and jumbo size: SUCCESS

uname -a
Linux r710-01 4.19.142 #1 SMP Tue Nov 3 11:27:36 CET 2020 x86_64 x86_64 x86_64 GNU/Linux

axctal avatar Apr 02 '21 05:04 axctal

Thanks for the report @axctal

I have the feeling it's related to the bnx2 driver version differing between the two kernels. @stormi what do you think?

olivierlambert avatar Apr 02 '21 08:04 olivierlambert

The alt kernel will use built-in drivers whereas the main kernel uses out-of-tree vendor drivers, which sometimes lag behind in terms of bug fixing, or have specific bugs. But they're the ones officially supported by the vendors :(.

Could you try the main kernel + our alternate qlogic-netxtreme2-alt driver package? This would allow us to verify whether the issue is fixed in recent vendor drivers.

stormi avatar Apr 02 '21 15:04 stormi

  1. Rebooted back into main kernel ... interfaces eth1..3, which should have MTU=9000, are now 1500 just as before, i.e. the problem is back

  2. Installed alt driver: yum install qlogic-netxtreme2-alt ... success

  3. Rebooted ... interfaces eth1..3 are still MTU=1500

ifconfig eth1 mtu 9000
SIOCSIFMTU: Invalid argument

Conclusion: the problem persists with main kernel + qlogic-netxtreme2-alt

  4. Removed 'qlogic-netxtreme2-alt' and rebooted back to alt-kernel: eth1..3 do have MTU=9000

axctal avatar Apr 04 '21 03:04 axctal

Just providing another data point. I'm in the same boat -- same NIC (Broadcom NetXtreme II BCM5709S rev 20) and following an upgrade from XS 7 to XCP-NG 8.2 (kernel 4.19.0+1) my storage NICs that were set to MTU 9000 were down. I see "SIOCSIFMTU: Invalid argument" when trying to set the MTU to 9000.

I first tried the alt-driver with no luck. I then tried the alt-kernel (4.19.142) and it works perfectly.

@stormi I'm happy to test alternate configurations if/when they become available.

JamuelStarkey avatar May 02 '21 23:05 JamuelStarkey

We also have a pool exhibiting this. Using kernel-alt fixed it for us as well.

Linux jib 4.19.154 #1 SMP Fri Apr 2 11:10:13 CEST 2021 x86_64 x86_64 x86_64 GNU/Linux

niklasha avatar Feb 10 '22 21:02 niklasha

Given what I've seen of vendor drivers from qlogic/broadcom, I'm not entirely surprised that the built-in drivers from kernel.org fix bugs that the official drivers from the vendor don't.

You may try this: boot the main kernel, remove the qlogic-netxtreme2-alt RPM if installed, then do something dirty: rename /usr/lib/modules/4.19.0+1/updates/bnx2.ko to something else, run depmod -a and then regenerate the initrd with dracut -f /boot/initrd-4.19.0+1.img 4.19.0+1. This will make you use the built-in driver rather than the vendor driver.

If this solves your issue, we'll see how to do this in a cleaner way.

(anyway, do you really need jumbo frames? I think I always hear @Fohdeesha saying it's an optimization from the past that brings little to no benefits nowadays...)

stormi avatar Feb 10 '22 23:02 stormi

We'll see if we can get to try your proposal today.

W.r.t. jumbo frames, that might be so depending on the age of the equipment. We still run somewhat aged equipment here. Initial tests of mine show that per-packet overhead is significantly more important than per-byte overhead. A test with just straight flood pings (yes, I know this is not akin to e.g. iSCSI traffic, since it is symmetric, serialized back-to-back communication) shows a bandwidth increase of 360% with jumbo frames, all other things being equal, telling me that the interrupt overhead combined with per-packet overhead is still significant. I wouldn't go back to 1500 MTU without hard evidence that that overhead has become close to zero. Maybe in a couple of years when our machine park is newer. I would be very cautious with such advice unless it is shown that no kind of traffic benefits from jumbo frames. Just seeing flood pings lose severely when using a small MTU is enough for me to not even try dropping jumbo frames in my hw config. YMMV
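
To put the per-packet argument in numbers, here is a rough back-of-the-envelope sketch. It does not reproduce the 360% measurement; it only counts how many full-size frames per second a saturated link carries at each MTU. It assumes untagged Ethernet with 38 bytes of per-frame wire overhead (14 header + 4 FCS + 8 preamble/SFD + 12 inter-frame gap) and a 1 Gbit/s link like the NICs in this thread.

```shell
#!/bin/sh
# Bytes on the wire per frame beyond the MTU-sized payload (untagged Ethernet):
# 14 (Ethernet header) + 4 (FCS) + 8 (preamble/SFD) + 12 (inter-frame gap).
WIRE_OVERHEAD=38

# Full-size frames per second on a saturated link.
# $1 = MTU in bytes, $2 = link speed in bit/s. Integer division, rounds down.
frames_per_second() {
    echo $(( $2 / (8 * ($1 + WIRE_OVERHEAD)) ))
}

frames_per_second 1500 1000000000
frames_per_second 9000 1000000000
```

At line rate this comes to roughly 81,000 frames per second at MTU 1500 versus roughly 14,000 at MTU 9000, so any fixed per-packet cost (interrupts, header processing) is paid about six times less often with jumbo frames.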

niklasha avatar Feb 11 '22 06:02 niklasha

OK, got to try your proposed solution on the last host in the pool we were upgrading. Unfortunately, it did not help. We will be running with the alt-kernel in that pool for now. We can likely test other solutions if need be, if only we get some time to plan it. This was during a maintenance slot, so it was fine now.

niklasha avatar Feb 11 '22 09:02 niklasha

Hardware: Dell R410, NetXtreme II BCM5716 Gigabit Ethernet

Version: XCP-ng version 8.2.0

I've run into the same issue.

Linux xcp-ng 4.19.0+1 #1 SMP Thu Jan 13 12:55:45 CET 2022 x86_64 x86_64 x86_64 GNU/Linux
------------------------------------------------------------------------
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
        Subsystem: Dell PowerEdge R410 BCM5716 Gigabit Ethernet
        Flags: bus master, fast devsel, latency 0, IRQ 36
        Memory at da000000 (64-bit, non-prefetchable) [size=32M]
        Capabilities: [48] Power Management version 3
        Capabilities: [50] Vital Product Data
        Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
        Capabilities: [a0] MSI-X: Enable+ Count=9 Masked-
        Capabilities: [ac] Express Endpoint, MSI 00
        Capabilities: [100] Device Serial Number a4-ba-db-ff-fe-09-97-e4
        Capabilities: [110] Advanced Error Reporting
        Capabilities: [150] Power Budgeting <?>
        Capabilities: [160] Virtual Channel
        Kernel driver in use: bnx2
        Kernel modules: bnx2
------------------------------------------------------------------------
modinfo bnx2
filename:       /lib/modules/4.19.0+1/updates/bnx2.ko
version:        2.2.5w
license:        GPL
description:    QLogic BCM5706/5708/5709/5716 Driver
author:         Michael Chan <[email protected]>
srcversion:     9039DABA84095A38F3E42C3
alias:          pci:v000014E4d0000163Csv*sd*bc*sc*i*
alias:          pci:v000014E4d0000163Bsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000163Asv*sd*bc*sc*i*
alias:          pci:v000014E4d00001639sv*sd*bc*sc*i*
alias:          pci:v000014E4d000016ACsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016AAsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*
alias:          pci:v000014E4d0000164Csv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Asv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*
alias:          pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*
depends:        
retpoline:      Y
name:           bnx2
vermagic:       4.19.0+1 SMP mod_unload modversions 
parm:           disable_msi:Disable Message Signaled Interrupt (MSI) (int)
parm:           stop_on_tx_timeout:For debugging purposes, prevent a chip  reset when a tx timeout occurs (int)

On the other box with same hardware it works like a charm (Ubuntu 20.04 LTS):

Linux r410 5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
----------------------------------------------------------------------------------------------
kacper@r410:~$ sudo lshw -c network
  *-network:0
       description: Ethernet interface
       product: NetXtreme II BCM5716 Gigabit Ethernet
       vendor: Broadcom Inc. and subsidiaries
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: eno1
       version: 20
       serial: 78:2b:cb:36:64:20
       size: 1Gbit/s
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm vpd msi msix pciexpress bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=bnx2 driverversion=5.13.0-30-generic duplex=full firmware=6.2.12 bc 5.2.3 NCSI 2.0.11 latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s
       resources: irq:28 memory:da000000-dbffffff
--------------------------------------------------------------------------------------------
kacper@r410:~$ modinfo bnx2
filename:       /lib/modules/5.13.0-30-generic/kernel/drivers/net/ethernet/broadcom/bnx2.ko                        
firmware:       bnx2/bnx2-rv2p-09ax-6.0.17.fw                                                                      
firmware:       bnx2/bnx2-rv2p-09-6.0.17.fw                                                                        
firmware:       bnx2/bnx2-mips-09-6.2.1b.fw
firmware:       bnx2/bnx2-rv2p-06-6.0.15.fw
firmware:       bnx2/bnx2-mips-06-6.2.3.fw
license:        GPL
description:    QLogic BCM5706/5708/5709/5716 Driver
author:         Michael Chan <[email protected]>
srcversion:     574D45BFC01A88EB7EF0DFC
alias:          pci:v000014E4d0000163Csv*sd*bc*sc*i*
alias:          pci:v000014E4d0000163Bsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000163Asv*sd*bc*sc*i*
alias:          pci:v000014E4d00001639sv*sd*bc*sc*i*
alias:          pci:v000014E4d000016ACsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016AAsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*
alias:          pci:v000014E4d0000164Csv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Asv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*
alias:          pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*
depends:        
retpoline:      Y
intree:         Y
name:           bnx2
vermagic:       5.13.0-30-generic SMP mod_unload modversions 
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        31:47:00:F2:EF:16:00:04:61:34:F3:14:4B:CA:58:1D:A2:73:04:B5
sig_hashalgo:   sha512
signature:      7E:80:EA:0F:EA:F9:0F:24:ED:75:F2:93:20:9B:F3:0D:40:C1:16:3E:
                3C:95:00:60:15:E9:53:16:90:02:C6:D2:AB:46:0B:62:D1:2C:E3:1E:
                29:4D:B4:00:C4:DD:F8:DE:51:98:D0:12:2C:CD:99:58:66:47:DD:C1:
                25:BB:A6:49:B1:39:D5:E9:89:EB:FB:81:EF:52:E7:0C:59:31:3E:84:
                5D:5F:6A:52:B6:D7:97:AD:63:49:BA:4C:15:63:1F:DE:F8:C2:CF:AE:
                24:6B:6B:90:9B:9B:DE:E8:1A:75:92:ED:1E:85:38:71:BF:F9:BA:3B:
                69:DA:E6:25:E6:49:0F:25:19:CC:7E:62:75:6D:2C:6E:DD:16:4E:B1:
                06:C1:79:CF:4D:9C:6C:A2:8F:81:52:C2:3B:B3:68:6E:25:16:87:7B:
                59:BC:34:A8:E4:1B:94:35:70:06:CB:94:0D:DB:42:C1:A6:C4:C0:1A:
                85:B5:B6:87:60:AE:9D:BE:94:E6:9E:20:CE:57:B2:B2:46:0A:25:65:
                2C:08:92:AE:6A:87:1D:B0:AC:F4:05:4F:42:B7:B5:B6:93:D9:B5:46:
                6B:0D:28:D6:59:10:EB:03:FC:17:A8:6A:1A:E7:0C:2A:14:36:90:5B:
                CB:C5:F7:10:23:21:08:82:81:B1:90:59:B7:C3:A1:DB:AF:91:93:79:
                D6:F3:E3:2C:AF:12:71:C3:9A:74:BD:D2:95:2A:21:2C:B5:E5:EF:58:
                59:70:55:9C:92:8A:B0:97:91:B8:72:02:F4:36:17:27:F4:AD:D6:39:
                12:8C:3B:50:27:30:A0:29:C5:50:A0:EF:27:62:4F:78:1B:5E:B9:5F:
                D7:B4:65:12:4B:AF:41:25:32:64:D6:D9:1C:B4:18:05:4C:91:ED:D2:
                5E:73:16:A7:DC:CB:FC:BC:E3:7A:40:71:9B:97:A5:C3:05:0B:55:6C:
                B4:1D:91:13:7C:E9:5B:02:63:56:A6:40:70:C2:69:16:19:BE:A9:22:
                7D:5A:B9:2B:A5:C8:45:0E:61:5E:05:26:0A:BA:42:0F:83:DD:41:3E:
                A6:8E:DF:F2:AF:A0:04:C2:3A:6A:8B:3A:A1:EE:00:AC:03:B2:2C:2A:
                05:CA:03:31:84:24:E4:D8:F0:C8:EB:F6:DF:77:77:ED:24:61:D6:FD:
                49:49:91:36:C0:5C:65:91:0E:27:44:0F:67:21:C1:D8:9A:FE:92:51:
                C3:1A:0D:9E:AA:9D:94:EA:4B:D4:ED:2D:41:63:11:35:96:D1:48:72:
                BD:2A:DB:56:AB:38:3E:4F:FB:24:C2:41:60:BD:67:D2:E7:81:24:0F:
                EB:4B:32:3F:0C:FF:29:2F:3A:5A:64:52
parm:           disable_msi:Disable Message Signaled Interrupt (MSI) (int)
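
The key difference between the two modinfo outputs above is the filename: the XCP-ng module is loaded from updates/ (the vendor driver), the Ubuntu one from kernel/ (in-tree). A small sketch for checking which one a host would load follows; the helper name `module_origin` is made up for illustration, and the classification assumes the usual /lib/modules layout.

```shell
#!/bin/sh
# Classify a module path as reported by `modinfo -F filename <module>`.
# Assumes the usual layout: vendor/out-of-tree builds under updates/,
# modules shipped with the kernel under kernel/.
module_origin() {
    case "$1" in
        */updates/*) echo "out-of-tree (updates)" ;;
        */kernel/*)  echo "in-tree" ;;
        *)           echo "unknown" ;;
    esac
}

# Paths copied from the two modinfo outputs in this comment.
module_origin /lib/modules/4.19.0+1/updates/bnx2.ko
module_origin /lib/modules/5.13.0-30-generic/kernel/drivers/net/ethernet/broadcom/bnx2.ko

# On a live host (not run here): module_origin "$(modinfo -F filename bnx2)"
```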

DoctorKM avatar Feb 20 '22 22:02 DoctorKM

Removing the "updated" vendor drivers on the standard 8.2/8.3 install and using the kernel drivers resolved the issue for me.

# yum remove qlogic-netxtreme2-alt
# rpm -e --nodeps qlogic-netxtreme2
# rpm -e --nodeps qlogic-netxtreme2-4.19.0+1-modules

andrew64k avatar Sep 21 '23 04:09 andrew64k

It's not the first time we've had issues with this specific vendor driver. Thanks for this data point.

However, I would like to avoid such comments being seen as an invitation for other users to rpm -e --nodeps stuff in dom0. Could you use a configuration file in /etc/modprobe.d instead to either blacklist the bad driver, or alias bnx2 to the absolute path of the built-in module?
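
One possible shape for such a configuration file, as an untested sketch rather than a confirmed fix: modprobe.d has an `install` directive that runs a command in place of the normal module load, so it can point at a fixed .ko file. The in-tree path below is an assumption (it mirrors the layout in the Ubuntu modinfo output earlier in this thread), and the file name bnx2-builtin.conf is made up. bnx2 has no module dependencies per the modinfo output, so a plain insmod should suffice. If the driver is loaded from the initrd, the initrd would presumably need regenerating as well.

```shell
#!/bin/sh
# Print a modprobe.d snippet that makes "modprobe bnx2" insmod a fixed file.
# $1 = kernel release, e.g. the output of `uname -r`.
# ASSUMPTION: the built-in module lives at the usual in-tree location below;
# verify that the file exists on your host before using this.
bnx2_override_conf() {
    echo "install bnx2 /sbin/insmod /lib/modules/$1/kernel/drivers/net/ethernet/broadcom/bnx2.ko \$CMDLINE_OPTS"
}

# Show the line for the main XCP-ng 8.2 kernel mentioned in this thread.
bnx2_override_conf "4.19.0+1"

# To apply (as root, not run here; the file name is hypothetical):
#   bnx2_override_conf "$(uname -r)" > /etc/modprobe.d/bnx2-builtin.conf
```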

stormi avatar Sep 21 '23 08:09 stormi

I have an HP DL360 G7 with the same problem.

I used @stormi's instructions with the current XenServer:


mv /usr/lib/modules/4.19.0+1/updates/bnx2.ko /usr/lib/modules/4.19.0+1/updates/bnx2.ko.old
depmod -a
dracut -f /boot/initrd-4.19.0+1.img 4.19.0+1
reboot

After this I can use MTU 9000

Test012345678949 avatar Jan 11 '24 22:01 Test012345678949