k8s-rdma-shared-dev-plugin

Unable to find the Infiniband/RoCE device when using the Ubuntu 18.04 image

Open thincal opened this issue 1 year ago • 6 comments

I created a container from the Ubuntu 18.04 image using the rdma-shared device plugin. Running ib_write_bw inside the container reports the error below, but with Ubuntu 20.04/22.04 it works well. What is the reason behind this issue? Any information is appreciated.

Did not detect devices 
If device exists, check if driver is up
Unable to find the Infiniband/RoCE device

thincal avatar Apr 25 '24 08:04 thincal

@adrianchiris Hi, do you have any info on this issue? Thanks.

thincal avatar Apr 30 '24 14:04 thincal

I believe it's related to the perftest version being used in the workload container vs. the RDMA API exposed by the kernel running on the node.

What is the OS of the k8s worker node?

I don't think it's related to the rdma shared device plugin.

adrianchiris avatar May 01 '24 13:05 adrianchiris

> What is the OS of the k8s worker node?

Ubuntu 22.04.2 LTS

thincal avatar May 02 '24 05:05 thincal

> I believe it's related to the perftest version being used in the workload container vs. the RDMA API exposed by the kernel running on the node.

So that is the reason.

adrianchiris avatar Jun 24 '24 06:06 adrianchiris
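
For anyone hitting the same error, a first step is to check what the container can actually see before blaming the userspace libraries. The following is a minimal sketch, not part of the plugin or perftest: the function name `visible_rdma_devices` is hypothetical, and it assumes the usual Linux locations `/sys/class/infiniband` and `/dev/infiniband`. It only lists what is visible. It does not diagnose a perftest or rdma-core version mismatch by itself.

```python
from pathlib import Path


def visible_rdma_devices(sysfs_root="/sys/class/infiniband",
                         dev_root="/dev/infiniband"):
    """List RDMA device names and uverbs nodes visible to this process.

    The default paths are the conventional Linux locations (an assumption
    about the container's mounts); pass other paths to inspect elsewhere.
    """
    sysfs = Path(sysfs_root)
    dev = Path(dev_root)
    # Each entry under the sysfs class directory is one RDMA device.
    ib_devices = sorted(p.name for p in sysfs.iterdir()) if sysfs.is_dir() else []
    # uverbsN character devices are what userspace verbs libraries open.
    uverbs_nodes = sorted(p.name for p in dev.glob("uverbs*")) if dev.is_dir() else []
    return {"ib_devices": ib_devices, "uverbs_nodes": uverbs_nodes}


if __name__ == "__main__":
    report = visible_rdma_devices()
    print("RDMA devices in sysfs:", report["ib_devices"] or "none")
    print("uverbs nodes in /dev: ", report["uverbs_nodes"] or "none")
```

If both lists are populated in the Ubuntu 18.04 container just as they are in the 20.04/22.04 containers, the device plugin has done its part, and the difference is more likely in the userspace stack (perftest and the RDMA libraries in the image), as suggested above.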