ib-kubernetes

How should I use UFM?

Open JuHyung-Son opened this issue 2 years ago • 7 comments

Should I run it in Docker on every worker node? And should ib-kubernetes run on every node and communicate with the UFM Docker container?

I'm confused because the deployment YAML is just a Deployment, and UFM runs in Docker rather than in Kubernetes, even though this project runs on Kubernetes.

JuHyung-Son, Jul 13 '22 13:07

Any update?

gavahi, Mar 06 '23 20:03

Judging by "docker exec -it ufm bash", UFM must be a Docker image. Where is the UFM image located? Is the UFM plugin required in order to use IB? Does the NOOP plugin give us IB access for inter-node communication?

gavahi, Mar 06 '23 21:03

I'm confused too. Can anyone tell us how to configure the secret named ib-kubernetes-ufm-secret? I just get the following error:

[root@dev ~]# k logs ib-kubernetes-86984cfc5-zbt5r
2023-08-08T08:25:56Z INF Starting InfiniBand Daemon
2023-08-08T08:25:56Z INF creating guid pool, guidRangeStart 02:00:00:00:00:00:00:00, guidRangeEnd 02:FF:FF:FF:FF:FF:FF:FF
2023-08-08T08:25:56Z INF loading plugin from path /plugins/ufm.so, symbolName Initialize
2023-08-08T08:25:56Z INF Initializing ufm plugin
2023-08-08T08:25:56Z ERR failed to create daemon: missing one or more required fileds for ufm ["username", "password", "address"]

How should I configure UFM?

foursunZero, Aug 08 '23 08:08

Hey,

It's in the README of this project. Please see: https://github.com/Mellanox/ib-kubernetes#plugin-configuration

You need to have UFM deployed on some node. This project does not cover how to deploy UFM; it has its own documentation, as it's a licensed product.
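For the ib-kubernetes-ufm-secret asked about above, a minimal sketch might look like the following. The error in the log names three required fields (username, password, address). The key names UFM_USERNAME, UFM_PASSWORD and UFM_ADDRESS are how I recall them from the plugin configuration section of the README, so treat them as an assumption and check the link. The namespace and all values are placeholders.

```yaml
# Sketch only: key names are recalled from the README, values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: ib-kubernetes-ufm-secret
  namespace: kube-system        # assumption: use the namespace the ib-kubernetes Deployment runs in
type: Opaque
stringData:
  UFM_USERNAME: "admin"         # placeholder UFM user
  UFM_PASSWORD: "changeme"      # placeholder UFM password
  UFM_ADDRESS: "192.0.2.10"     # placeholder address of the node where UFM is deployed
```

The secret only reaches the plugin if the ib-kubernetes Deployment references it, so it is worth confirming that in the deployment YAML you applied.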

Regarding the NOOP plugin, it's used when interaction with an IB subnet manager is not required but you still want GUID allocation done by ib-kubernetes.
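The plugin is selected through the daemon's configuration. As a rough sketch, assuming the ConfigMap name ib-kubernetes-config and the DAEMON_SM_PLUGIN and GUID_POOL_RANGE_* keys as I recall them from the README (verify all of them there), and reusing the GUID range printed in the log above:

```yaml
# Sketch only: ConfigMap name and keys are assumptions recalled from the README.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ib-kubernetes-config
  namespace: kube-system        # assumption: same namespace as the Deployment
data:
  DAEMON_SM_PLUGIN: "noop"      # "ufm" would load /plugins/ufm.so instead
  GUID_POOL_RANGE_START: "02:00:00:00:00:00:00:00"
  GUID_POOL_RANGE_END: "02:FF:FF:FF:FF:FF:FF:FF"
```

With the NOOP plugin there should be no need for the UFM secret, since nothing talks to a subnet manager.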

adrianchiris, Aug 08 '23 10:08

@adrianchiris Hey, bro. If I use the NOOP plugin, are there any other deployment steps? Also, when using the NOOP plugin to configure an IB virtual function, is there anything I should pay attention to?

foursunZero, Aug 08 '23 11:08

You need to set ibKubernetesEnabled in your secondary network attachment definition if you are using ib-sriov-cni. See: https://github.com/k8snetworkplumbingwg/ib-sriov-cni#configuration-reference
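As an illustration, a network attachment definition with that flag set could look roughly like this. The name, namespace, resource name and IPAM subnet are placeholders to be matched to your own device plugin and pod annotation. Only the "type" and "ibKubernetesEnabled" fields are the point here, and the configuration reference linked above remains the authority on the full field list.

```yaml
# Sketch only: names, resource and subnet are placeholders.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ib                      # placeholder network name
  namespace: ib                 # placeholder namespace
  annotations:
    # placeholder: must match the resource your SR-IOV device plugin advertises
    k8s.v1.cni.cncf.io/resourceName: mellanox.com/sriov_ib_rdma
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "ib-sriov",
    "name": "ib",
    "ibKubernetesEnabled": true,
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.100.0/24"
    }
  }'
```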

adrianchiris, Aug 08 '23 12:08

@adrianchiris OK, I got it. But how should I configure the annotations on the pod? This is my test pod:

apiVersion: v1
kind: Pod
metadata:
  name: sriov-ib
  namespace: ib
  annotations:
    k8s.v1.cni.cncf.io/networks: ib@ib
    mellanox.infiniband.app: ib
spec:
  nodeName: sg-dcu002.hogpu.cc
  containers:
    - name: cni
      image: mellanox/rping-test
      imagePullPolicy: IfNotPresent
      command:
        - sleep
        - 3600s
      resources:
        requests:
          mellanox.com/sriov_ib_rdma: '1'
        limits:
          mellanox.com/sriov_ib_rdma: '1'

But I get this error:

infiniBand SRI-OV CNI failed to configure VF "failed to add node guid 02:11:22:33:44:55:02:02: operation not supported"

Where should I configure the annotation, and what should it contain?

foursunZero, Aug 09 '23 03:08