ib-kubernetes
How should I use UFM?
Should I run it in Docker on every worker node? And should ib-kubernetes run on every node and communicate with the UFM Docker container?
I'm confused because the deployment YAML is just a Deployment, and UFM runs in Docker rather than in Kubernetes, even though this repo runs on Kubernetes.
Any update?
Given "docker exec -it ufm bash", UFM must be a Docker image. Where is the UFM image located? Is the UFM plugin required to use IB? Does the NOOP plugin give us IB access for inter-node communication?
I'm confused too. Can anyone tell us how to configure the secret named ib-kubernetes-ufm-secret? I just get the following error:
[root@dev ~]# k logs ib-kubernetes-86984cfc5-zbt5r
2023-08-08T08:25:56Z INF Starting InfiniBand Daemon
2023-08-08T08:25:56Z INF creating guid pool, guidRangeStart 02:00:00:00:00:00:00:00, guidRangeEnd 02:FF:FF:FF:FF:FF:FF:FF
2023-08-08T08:25:56Z INF loading plugin from path /plugins/ufm.so, symbolName Initialize
2023-08-08T08:25:56Z INF Initializing ufm plugin
2023-08-08T08:25:56Z ERR failed to create daemon: missing one or more required fileds for ufm ["username", "password", "address"]
How should I configure UFM?
Hey,
It's in the README of this project. Please see: https://github.com/Mellanox/ib-kubernetes#plugin-configuration
You need to have UFM deployed on some node. This project does not cover how to deploy UFM; it has its own documentation, as it's a licensed product.
Regarding the NOOP plugin, it's used when interaction with an IB subnet manager is not required but you still want GUID allocation done by ib-kubernetes.
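For anyone hitting the "missing one or more required fileds" error above, here is a minimal sketch of what the secret might look like. The key names (UFM_USERNAME, UFM_PASSWORD, UFM_ADDRESS) are as I recall them from the README's plugin configuration section, so please verify them there. The namespace and all values are placeholders, not real settings.

```yaml
# Sketch only: key names should be checked against the ib-kubernetes README.
apiVersion: v1
kind: Secret
metadata:
  name: ib-kubernetes-ufm-secret
  # Assumption: use the namespace where the ib-kubernetes Deployment runs.
  namespace: kube-system
type: Opaque
stringData:
  UFM_USERNAME: "admin"        # placeholder UFM user
  UFM_PASSWORD: "change-me"    # placeholder UFM password
  UFM_ADDRESS: "192.0.2.10"    # placeholder address of the node running UFM
```

After applying it, the ib-kubernetes pod most likely needs a restart so that the plugin picks up the new values.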
@adrianchiris Hey, bro. If I use the NOOP plugin, are there any other deployment steps? Also, when using the NOOP plugin to configure an IB virtual function, is there anything to pay attention to?
You need to set ibKubernetesEnabled in your secondary network attachment definition if you are using ib-sriov-cni.
See: https://github.com/k8snetworkplumbingwg/ib-sriov-cni#configuration-reference
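As a rough illustration, a network attachment definition with that flag set could look like the sketch below. The name, namespace and resource name are assumptions chosen to match the test pod in this thread, and IPAM and pkey settings are left out, so check the remaining fields against the ib-sriov-cni configuration reference.

```yaml
# Sketch only: names are assumed from this thread, not taken from a working cluster.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ib
  namespace: ib
  annotations:
    # Must match the resource the pod requests from the SR-IOV device plugin.
    k8s.v1.cni.cncf.io/resourceName: mellanox.com/sriov_ib_rdma
spec:
  # The CNI config is a JSON document embedded as a string.
  config: |-
    {
      "cniVersion": "0.3.1",
      "name": "ib",
      "type": "ib-sriov",
      "ibKubernetesEnabled": true
    }
```

With the flag enabled, ib-sriov-cni expects ib-kubernetes to supply the GUID for the pod, so the ib-kubernetes daemon has to be running in the cluster.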
@adrianchiris OK, I got it. But how should I configure the annotations on the pod? This is my test pod:
apiVersion: v1
kind: Pod
metadata:
  name: sriov-ib
  namespace: ib
  annotations:
    k8s.v1.cni.cncf.io/networks: ib@ib
    mellanox.infiniband.app: ib
spec:
  nodeName: sg-dcu002.hogpu.cc
  containers:
    - name: cni
      image: mellanox/rping-test
      imagePullPolicy: IfNotPresent
      command:
        - sleep
        - 3600s
      resources:
        requests:
          mellanox.com/sriov_ib_rdma: '1'
        limits:
          mellanox.com/sriov_ib_rdma: '1'
But I get this error:
infiniBand SRI-OV CNI failed to configure VF "failed to add node guid 02:11:22:33:44:55:02:02: operation not supported"
Where should I configure the annotation, and what content should it have?