NetworkDriver.CreateNetwork: SRIOV is disabled [ ibo1 ]

Open hasechris opened this issue 4 years ago • 2 comments

Hi,

I wanted to test this repo, but I keep getting this error when creating the SR-IOV Docker network:

root@naproxen:~# docker network create -d sriov --subnet=10.111.111.0/24 -o netdevice=ibo1 -o mode=sriov mynet
Error response from daemon: NetworkDriver.CreateNetwork: SRIOV is disabled [ ibo1 ].
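
To see what the kernel itself reports for the device, the standard PCI SR-IOV attributes in sysfs (sriov_totalvfs and sriov_numvfs) can be read directly. The sketch below assumes the plugin's "SRIOV is disabled" check is related to these counters, which is not confirmed here. The function name sriov_status and its optional sysfs-root argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Print the SR-IOV VF counters the kernel exposes for a network device.
# Usage: sriov_status <netdev> [sysfs_root]
# sysfs_root defaults to /sys; it is a parameter only so the function can be
# tried against a fake directory tree.
sriov_status() {
    dev="$1"
    root="${2:-/sys}"
    base="$root/class/net/$dev/device"

    # Without this attribute the PCI core exposes no SR-IOV capability
    # for the device behind this interface.
    if [ ! -r "$base/sriov_totalvfs" ]; then
        echo "$dev: no sriov_totalvfs attribute (no SR-IOV capability exposed)"
        return 1
    fi

    total=$(cat "$base/sriov_totalvfs")
    # Fall back to 0 if the attribute for currently enabled VFs is missing.
    current=$(cat "$base/sriov_numvfs" 2>/dev/null || echo 0)
    echo "$dev: total_vfs=$total num_vfs=$current"
}
```

On the host from this report it would be called as: sriov_status ibo1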

Here is my host config:

root@naproxen:~# mlxconfig -d /dev/mst/mt4103_pci_cr0 query

Device #1:
----------

Device type:    ConnectX3Pro    
Device:         /dev/mst/mt4103_pci_cr0

Configurations:                              Next Boot
         SRIOV_EN                            True(1)         
         NUM_OF_VFS                          15              
         WOL_MAGIC_EN_P2                     True(1)         
         LINK_TYPE_P1                        IB(1)           
         LINK_TYPE_P2                        IB(1)           
         LOG_BAR_SIZE                        5               
         BOOT_PKEY_P1                        0               
         BOOT_PKEY_P2                        0               
         BOOT_OPTION_ROM_EN_P1               True(1)         
         BOOT_VLAN_EN_P1                     False(0)        
         BOOT_RETRY_CNT_P1                   0               
         LEGACY_BOOT_PROTOCOL_P1             PXE(1)          
         BOOT_VLAN_P1                        1               
         BOOT_OPTION_ROM_EN_P2               True(1)         
         BOOT_VLAN_EN_P2                     False(0)        
         BOOT_RETRY_CNT_P2                   0               
         LEGACY_BOOT_PROTOCOL_P2             PXE(1)          
         BOOT_VLAN_P2                        1               
root@naproxen:~# 
root@naproxen:~# 
root@naproxen:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens2f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 00:15:17:8e:a1:54 brd ff:ff:ff:ff:ff:ff
3: ens2f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:15:17:8e:a1:55 brd ff:ff:ff:ff:ff:ff
    inet 10.42.22.71/22 brd 10.42.23.255 scope global dynamic ens2f1
       valid_lft 83690sec preferred_lft 83690sec
    inet6 fe80::215:17ff:fe8e:a155/64 scope link 
       valid_lft forever preferred_lft forever
6: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet 10.254.0.20/24 scope global wg0
       valid_lft forever preferred_lft forever
7: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:91:1e:aa:54 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
10: ibo1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP group default qlen 256
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:50:65:f3:ff:ff:89:05:71 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet6 fe80::5265:f3ff:ff89:571/64 scope link 
       valid_lft forever preferred_lft forever
11: ibo1d1: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN group default qlen 256
    link/infiniband 80:00:02:09:fe:80:00:00:00:00:00:00:50:65:f3:ff:ff:89:05:72 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
root@naproxen:~# ibstat
CA 'mlx4_0'
	CA type: MT4103
	Number of ports: 2
	Firmware version: 2.35.5100
	Hardware version: 0
	Node GUID: 0x5065f3ffff890570
	System image GUID: 0x5065f3ffff890573
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 10
		Base lid: 3
		LMC: 0
		SM lid: 2
		Capability mask: 0x02514868
		Port GUID: 0x5065f3ffff890571
		Link layer: InfiniBand
	Port 2:
		State: Down
		Physical state: Polling
		Rate: 10
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x02514868
		Port GUID: 0x5065f3ffff890572
		Link layer: InfiniBand
root@naproxen:~# ibstatus
Infiniband device 'mlx4_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:5065:f3ff:ff89:0571
	base lid:	 0x3
	sm lid:		 0x2
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 10 Gb/sec (4X SDR)
	link_layer:	 InfiniBand

Infiniband device 'mlx4_0' port 2 status:
	default gid:	 fe80:0000:0000:0000:5065:f3ff:ff89:0572
	base lid:	 0x0
	sm lid:		 0x0
	state:		 1: DOWN
	phys state:	 2: Polling
	rate:		 10 Gb/sec (4X SDR)
	link_layer:	 InfiniBand

root@naproxen:~# 

mlx4_core modprobe:

Jul 02 02:37:56 naproxen kernel: mlx4_core: Mellanox ConnectX core driver v4.0-0
Jul 02 02:37:56 naproxen kernel: mlx4_core: Initializing 0000:03:00.0
Jul 02 02:38:01 naproxen kernel: mlx4_core 0000:03:00.0: DMFS high rate steer mode is: disabled performance optimized steering
Jul 02 02:38:01 naproxen kernel: mlx4_core 0000:03:00.0: 63.008 Gb/s available PCIe bandwidth (8 GT/s x8 link)
Jul 02 02:38:01 naproxen kernel: <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
Jul 02 02:38:01 naproxen kernel: <mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0
Jul 02 02:38:01 naproxen kernel: <mlx4_ib> mlx4_ib_add: counter index 1 for port 2 allocated 0
Jul 02 02:38:01 naproxen systemd-udevd[9272]: Using default interface naming scheme 'v240'.
Jul 02 02:38:01 naproxen systemd-udevd[9267]: Using default interface naming scheme 'v240'.
Jul 02 02:38:01 naproxen systemd-udevd[9272]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul 02 02:38:01 naproxen kernel: mlx4_core 0000:03:00.0 ibo1: renamed from ib0
Jul 02 02:38:01 naproxen systemd-udevd[9267]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul 02 02:38:01 naproxen kernel: mlx4_core 0000:03:00.0 ibo1d1: renamed from ib1

OS (with Proxmox 6.2 on top):

root@naproxen:~# cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
root@naproxen:~# 

hasechris, Jul 02 '20 00:07

@paravmellanox

hasechris, Jul 02 '20 00:07

I found the solution:

/etc/modprobe.d/mlx4.conf

options mlx4_core debug_level=1 port_type_array=1,1 num_vfs=2 probe_vf=1 log_num_mgm_entry_size=-1

The key part is "probe_vf=1": the host OS needs to see the virtual functions in order to use them for Docker.
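
Options in /etc/modprobe.d only take effect the next time mlx4_core is loaded. After that, one way to confirm that the host actually sees the virtual functions is to look at the virtfn* links under the physical function in sysfs and check whether each one has a driver bound. This is a minimal sketch: it assumes probe_vf controls how many VFs the host driver binds to, and the function name list_vfs and its optional sysfs-root argument are hypothetical.

```shell
#!/bin/sh
# List the virtual functions of a physical function and show whether each one
# is bound to a driver on the host (i.e. was probed).
# Usage: list_vfs <netdev> [sysfs_root]
list_vfs() {
    dev="$1"
    root="${2:-/sys}"
    base="$root/class/net/$dev/device"
    found=0

    for vf in "$base"/virtfn*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$vf" ] || continue
        found=1
        # A probed PCI function has a "driver" link in its sysfs directory.
        if [ -e "$vf/driver" ]; then
            state="bound"
        else
            state="unbound"
        fi
        echo "$(basename "$vf"): $state"
    done

    [ "$found" -eq 1 ] || { echo "$dev: no virtual functions visible"; return 1; }
}
```

For the setup in this issue the call would be: list_vfs ibo1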

hasechris, Jul 02 '20 19:07