k8s-rdma-sriov-dev-plugin

Driver doesn't support SRIOV configuration via sysfs

Open LucaPrete opened this issue 6 years ago • 7 comments

We have a Mellanox ConnectX-3 dual-port NIC.

We're following this guide: https://community.mellanox.com/s/article/reference-deployment-guide-for-k8s-cluster-with-mellanox-rdma-device-plugin-and-multus-cni-plugin-with-two-network-interfaces--flannel-and-mellanox-sr-iov---draft-x

Everything runs smoothly until I activate the device plugin. The plugin installs fine, but when I look at the logs I see:

/sys/class/net/eth2/device/sriov_numvfs: Function not implemented

I manually ran what I think the plugin does on each physical node running k8s. For example:

echo 8 | sudo tee /sys/class/net/eth2/device/sriov_numvfs

I always get the same error:

/sys/class/net/eth2/device/sriov_numvfs: Function not implemented

Also, if I do

dmesg | grep -i mlx

I see this error:

mlx4_core 0000:03:00.0: Driver doesn't support SRIOV configuration via sysfs.

As an experiment, I've also tried to activate the VFs through the mlx4_core driver configuration (as described, for example, here: https://community.mellanox.com/s/article/howto-configure-sr-iov-for-connectx-3-with-kvm--ethernet-x). In this case the VFs come up and everything works fine, but unfortunately this doesn't seem to be compatible with the SR-IOV device plugin.

This is really blocking us. Any suggestions would be highly appreciated!

Thanks.

LucaPrete avatar Jan 12 '19 19:01 LucaPrete
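
For anyone hitting the same error, the sysfs state described above can be inspected before writing to sriov_numvfs. The following is only a minimal sketch: the function name `sriov_status` and the `SYSFS_ROOT` override are hypothetical helpers introduced here for illustration, not part of the plugin. It reports the bound driver and the `sriov_totalvfs` / `sriov_numvfs` values when those files exist.

```shell
#!/bin/sh
# Print the driver and SR-IOV sysfs counters for one network interface.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) so the function
# can be exercised against a fake directory tree.
sriov_status() {
    iface="$1"
    dev="${SYSFS_ROOT:-/sys}/class/net/$iface/device"

    if [ ! -d "$dev" ]; then
        echo "$iface: no device directory at $dev"
        return 1
    fi

    # The "driver" entry is a symlink whose last path component is the driver name.
    driver="unknown"
    if [ -L "$dev/driver" ]; then
        driver=$(basename "$(readlink "$dev/driver")")
    fi

    # These files only exist when the PCI device advertises SR-IOV.
    total="n/a"
    current="n/a"
    [ -r "$dev/sriov_totalvfs" ] && total=$(cat "$dev/sriov_totalvfs")
    [ -r "$dev/sriov_numvfs" ] && current=$(cat "$dev/sriov_numvfs")

    echo "$iface: driver=$driver totalvfs=$total numvfs=$current"
}
```

Running `sriov_status eth2` on a node prints a single summary line. Reading these files is harmless; it is the write to sriov_numvfs that mlx4_core rejects in this thread.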

Hi @LucaPrete ,

ConnectX-3 is not supported by the plugin. I recommend upgrading to ConnectX-4 or ConnectX-5. They bring a lot of features that are useful for RDMA and NIC workloads.

paravmellanox avatar Jan 12 '19 23:01 paravmellanox

Thank you @paravmellanox! Unfortunately, all our servers run the ConnectX-3. I'm wondering if the plugin could be modified so that it doesn't configure the VFs at the system level, but just uses the ones already configured through the driver. What do you think? If we achieve this, what other limitations do you see?

LucaPrete avatar Jan 14 '19 16:01 LucaPrete
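
To make the idea above concrete: VFs that already exist show up as `virtfnN` symlinks under the PF's device directory in sysfs, so they can be discovered without touching sriov_numvfs. The sketch below is an illustration only, not how the plugin works today; `list_vfs` and `SYSFS_ROOT` are hypothetical names introduced here. It prints the VF index, PCI address, and netdev name (or `-` if the VF has no netdev bound on the host).

```shell
#!/bin/sh
# List VFs that already exist under a PF by walking its virtfnN symlinks.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) for dry runs.
list_vfs() {
    pf="$1"
    dev="${SYSFS_ROOT:-/sys}/class/net/$pf/device"

    for link in "$dev"/virtfn*; do
        # Skip the unexpanded pattern when no VFs are present.
        [ -L "$link" ] || continue

        idx=${link##*virtfn}                       # numeric suffix of virtfnN
        pci=$(basename "$(readlink "$link")")      # PCI address of the VF

        # A VF bound to a network driver on the host exposes a net/ directory.
        netdev=""
        if [ -d "$link/net" ]; then
            netdev=$(ls "$link/net" | head -n 1)
        fi

        echo "$idx $pci ${netdev:--}"
    done
}
```

Calling `list_vfs eth2` after the driver has created the VFs gives one line per VF, which is roughly the inventory a modified plugin would need in order to advertise existing VFs instead of creating them.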

Hi @LucaPrete, what functionality do you plan to run on ConnectX-3? RDMA, Ethernet, DPDK, or a subset of these?

paravmellanox avatar Jan 14 '19 17:01 paravmellanox

@paravmellanox I've seen the term RDMA in other places as well, but I'm not very familiar with it. I have to do some homework here :) For now, what we need is a PoC where k8s containers can have a second NIC backed by SR-IOV. That NIC is then connected to a custom fabric. DPDK support would be nice as a next step, but it's not mandatory for the first one.

LucaPrete avatar Jan 14 '19 17:01 LucaPrete

@LucaPrete, ConnectX-3 is pretty old now. Since you need DPDK support as a next step, can you please talk to customer support? We don't have a strong plan to support DPDK mode. Most users have upgraded to ConnectX-4/5, so...

paravmellanox avatar Jan 14 '19 17:01 paravmellanox

I understand. We'll definitely plan to move to another NIC. I was wondering if, in the meantime, we could still use the ConnectX-3 without DPDK support (for the PoC).

LucaPrete avatar Jan 14 '19 17:01 LucaPrete

@LucaPrete, for a PoC it is fine to use ConnectX-3. It requires some work, though, so some hacks will be needed for the PoC.

paravmellanox avatar Jan 14 '19 17:01 paravmellanox
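
For the PoC route, the driver-level VF configuration mentioned earlier in the thread is normally expressed as mlx4_core module options. The following is a hedged sketch, not a recommendation from the maintainers: `mlx4_options_line` is a hypothetical helper, and the example values (8 VFs, both ports in Ethernet mode, no VFs probed on the host) are assumptions to adjust for your setup. It also assumes SR-IOV is already enabled in the NIC firmware and the BIOS.

```shell
#!/bin/sh
# Build a modprobe "options" line for mlx4_core.
#   $1  number of VFs to create (num_vfs)
#   $2  per-port link type list (port_type_array; 1 = InfiniBand, 2 = Ethernet)
#   $3  number of VFs the host itself should probe (probe_vf, default 0)
mlx4_options_line() {
    num_vfs="$1"
    port_types="$2"
    probe_vf="${3:-0}"
    echo "options mlx4_core num_vfs=$num_vfs port_type_array=$port_types probe_vf=$probe_vf"
}
```

One way to use it would be `mlx4_options_line 8 2,2 | sudo tee /etc/modprobe.d/mlx4_core.conf`, followed by reloading the driver or rebooting the node. The file name under /etc/modprobe.d is a common convention rather than a requirement.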