converged-edge-experience-kits
Cannot program the FPGA
Hi
Using this API to program the binary in the FPGA:
kubectl rsu program -f <signed_RTL_image> -n
The documentation doesn't specify the arguments clearly, so I assume signed_RTL_image is the name of the .bin file. As for the hostname, the command doesn't seem to work with the node name the K8s controller sees, so I have to use the IP address. The PCI bus ID is what I get from "lspci | grep acc". When I run this, the fpga-opae.. container stays in a Pending state and complains about a mismatch in the node selector. Is there a corresponding .yml I can modify to remove this filtering?
[root@corningopenness opt]# kubectl describe pods fpga-opae-10.12.87.80-0b30-xbrx2
Name: fpga-opae-10.12.87.80-0b30-xbrx2
Namespace: default
Priority: 0
Node:
IPs:
image-dir:
Type: HostPath (bare host directory volume)
Path: /temp/vran_images
HostPathType:
default-token-fzdsz:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fzdsz
Optional: false
QoS Class: BestEffort
Node-Selectors: kubernetes.io/hostname=10.12.87.80
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
Warning FailedScheduling 58m default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector.
Warning FailedScheduling 58m default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector.
The node selector values do not match because you ran the program command with the IP address. Please try running it with the node hostname. This will be the same as the one in /etc/hostname on the node.
As for the kubectl rsu command, please run kubectl rsu discover before the program command, and copy the signed or unsigned image name and the device ID from its output. Then paste those values into the kubectl rsu program command.
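To illustrate why the IP address fails the node selector: the pod asks for kubernetes.io/hostname=10.12.87.80, and the scheduler compares that string against each node's label. A minimal Python sketch, assuming the nodes carry their hostnames in that label; the sample JSON is a hypothetical, trimmed stand-in for `kubectl get nodes -o json`, not output taken from this cluster.

```python
import json

HOSTNAME_LABEL = "kubernetes.io/hostname"

def nodes_matching_selector(nodes_json, selector_value):
    """Return the names of nodes whose hostname label equals selector_value."""
    nodes = json.loads(nodes_json)["items"]
    return [
        node["metadata"]["name"]
        for node in nodes
        if node["metadata"].get("labels", {}).get(HOSTNAME_LABEL) == selector_value
    ]

# Hypothetical sample data (assumed labels), shaped like `kubectl get nodes -o json`.
SAMPLE_NODES = json.dumps({"items": [
    {"metadata": {"name": "corningopenness",
                  "labels": {HOSTNAME_LABEL: "corningopenness"}}},
    {"metadata": {"name": "opennesswkn-1",
                  "labels": {HOSTNAME_LABEL: "opennesswkn-1"}}},
]})

# Selecting by IP matches nothing; selecting by hostname matches one node.
print(nodes_matching_selector(SAMPLE_NODES, "10.12.87.80"))
print(nodes_matching_selector(SAMPLE_NODES, "opennesswkn-1"))
```

On a real cluster the same check could be fed from the actual `kubectl get nodes -o json` output instead of the sample string.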
Ok, sure. Can you tell me which one is the device here, 8086:0b30 or 54:00.0?
[root@corningopenness ravi]# kubectl rsu discover -n 10.12.87.80
Available RTL images:
[email protected]'s password:
Mar 10 44M 20ww27.5-2x2x25G-5GLDPC-v1.6.1-3.0.0-unsigned.bin
FPGA devices installed:
[email protected]'s password:
54:00.0 Processing accelerators [1200]: Intel Corporation Device [8086:0b30]
Subsystem: Intel Corporation Device [8086:0000]
Kernel driver in use: intel-fpga-pci
Kernel modules: intel_fpga_pci
I was able to download the image to the FPGA and configure the VFs in the FPGA, but I cannot see it in the available resources. Any idea what the problem might be? The documentation says the resources should map to the ConfigMap.yml for the device plugin, but where is that correlated with the bb_config helm chart provisioning?
[root@corningopenness helm-charts]# kubectl get node opennesswkn-1 -o json | jq '.status.allocatable'
{
  "cpu": "46",
  "devices.kubevirt.io/kvm": "110",
  "devices.kubevirt.io/tun": "110",
  "devices.kubevirt.io/vhost-net": "110",
  "ephemeral-storage": "96589578081",
  "hugepages-1Gi": "20Gi",
  "intel.com/intel_sriov_netdevice": "12",
  "memory": "110455600Ki",
  "pods": "110"
}
[root@corningopenness helm-charts]# kubectl logs intel-fpga-cfg-opennesswkn-1-jwlpn
ERROR: Section (FLR) or name (flr_time_out) is not valid.
FEC FPGA RTL v3.0
UL.DL Weights = 3.3
UL.DL Load Balance = 128.128
Queue-PF/VF Mapping Table = READY
Ring Descriptor Size = 256 bytes
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| PF | VF0 | VF1 | VF2 | VF3 | VF4 | VF5 | VF6 | VF7 |
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
UL-Q'00 | | X | | | | | | | |
UL-Q'01 | | X | | | | | | | |
UL-Q'02 | | X | | | | | | | |
UL-Q'03 | | X | | | | | | | |
UL-Q'04 | | X | | | | | | | |
UL-Q'05 | | X | | | | | | | |
UL-Q'06 | | X | | | | | | | |
UL-Q'07 | | X | | | | | | | |
UL-Q'08 | | X | | | | | | | |
UL-Q'09 | | X | | | | | | | |
UL-Q'10 | | X | | | | | | | |
UL-Q'11 | | X | | | | | | | |
UL-Q'12 | | X | | | | | | | |
UL-Q'13 | | X | | | | | | | |
UL-Q'14 | | X | | | | | | | |
UL-Q'15 | | X | | | | | | | |
UL-Q'16 | | | X | | | | | | |
UL-Q'17 | | | X | | | | | | |
UL-Q'18 | | | X | | |
Also posted this on GitHub; mailing it too, hoping for a faster response. I was able to download the image to the FPGA and, I guess, configure the VFs in the FPGA, but I cannot see it in the available resources. Any idea what the problem might be? The documentation says the resources should map to the ConfigMap.yml for the device plugin, but where is that correlated with the bb_config helm chart provisioning?
@.*** helm-charts]# kubectl get node opennesswkn-1 -o json | jq '.status.allocatable'
{
  "cpu": "46",
  "devices.kubevirt.io/kvm": "110",
  "devices.kubevirt.io/tun": "110",
  "devices.kubevirt.io/vhost-net": "110",
  "ephemeral-storage": "96589578081",
  "hugepages-1Gi": "20Gi",
  "intel.com/intel_sriov_netdevice": "12",
  "memory": "110455600Ki",
  "pods": "110"
}
@.*** helm-charts]# kubectl logs intel-fpga-cfg-opennesswkn-1-jtg5f
ERROR: Section (FLR) or name (flr_time_out) is not valid.
FEC FPGA RTL v3.0
UL.DL Weights = 3.3
UL.DL Load Balance = 128.128
Queue-PF/VF Mapping Table = READY
Ring Descriptor Size = 256 bytes
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| PF | VF0 | VF1 | VF2 | VF3 | VF4 | VF5 | VF6 | VF7 |
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
UL-Q'00 | | X | | | | | | | |
UL-Q'01 | | X | | | | | | | |
UL-Q'02 | | X | | | | | | | |
UL-Q'03 | | X | | | | | | | |
UL-Q'04 | | X | | | | | | | |
UL-Q'05 | | X | | | | | | | |
UL-Q'06 | | X | | | | | | | |
UL-Q'07 | | X | | | | | | | |
UL-Q'08 | | X | | | | | | | |
UL-Q'09 | | X | | | | | | | |
UL-Q'10 | | X | | | | | | | |
UL-Q'11 | | X | | | | | | | |
UL-Q'12 | | X | | | | | | | |
UL-Q'13 | | X | | | | | | | |
UL-Q'14 | | X | | | | | | | |
UL-Q'15 | | X | | | | | | | |
UL-Q'16 | | | X | | | | | | |
UL-Q'17 | | | X | | | | | | |
UL-Q'18 | | | X | | | | | | |
UL-Q'19 | | | X | | | | | | |
UL-Q'20 | | | X | | | | | | |
UL-Q'21 | | | X | | | | | | |
UL-Q'22 | | | X | | | | | | |
UL-Q'23 | | | X | | | | | | |
UL-Q'24 | | | X | | | | | | |
UL-Q'25 | | | X | | | | | | |
UL-Q'26 | | | X | | | | | | |
UL-Q'27 | | | X | | | | | | |
UL-Q'28 | | | X | | | | | | |
UL-Q'29 | | | X | | | | | | |
UL-Q'30 | | | X | | | | | | |
UL-Q'31 | | | X | | | | | | |
DL-Q'32 | | X | | | | | | | |
DL-Q'33 | | X | | | | | | | |
DL-Q'34 | | X | | | | | | | |
DL-Q'35 | | X | | | | | | | |
DL-Q'36 | | X | | | | | | | |
DL-Q'37 | | X | | | | | | | |
DL-Q'38 | | X | | | | | | | |
DL-Q'39 | | X | | | | | | | |
DL-Q'40 | | X | | | | | | | |
DL-Q'41 | | X | | | | | | | |
DL-Q'42 | | X | | | | | | | |
DL-Q'43 | | X | | | | | | | |
DL-Q'44 | | X | | | | | | | |
DL-Q'45 | | X | | | | | | | |
DL-Q'46 | | X | | | | | | | |
DL-Q'47 | | X | | | | | | | |
DL-Q'48 | | | X | | | | | | |
DL-Q'49 | | | X | | | | | | |
DL-Q'50 | | | X | | | | | | |
DL-Q'51 | | | X | | | | | | |
DL-Q'52 | | | X | | | | | | |
DL-Q'53 | | | X | | | | | | |
DL-Q'54 | | | X | | | | | | |
DL-Q'55 | | | X | | | | | | |
DL-Q'56 | | | X | | | | | | |
DL-Q'57 | | | X | | | | | | |
DL-Q'58 | | | X | | | | | | |
DL-Q'59 | | | X | | | | | | |
DL-Q'60 | | | X | | | | | | |
DL-Q'61 | | | X | | | | | | |
DL-Q'62 | | | X | | | | | | |
DL-Q'63 | | | X | | | | | | |
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
Mode of operation = VF-mode
FPGA_5GNR PF [0000:56:00.0] configuration complete!
Run the kubectl rsu discover command with the hostname.
Also, the device ID here is 54:00.0. The FPGA card will show up in the list of allocatable resources only once it is configured properly.
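For anyone else unsure which field is which: in the lspci-style line printed by kubectl rsu discover, the leading 54:00.0 is the PCI bus address (the device ID meant here), while the bracketed 8086:0b30 is the vendor:device pair. A small Python sketch that separates the two; the function name is made up for illustration and the regular expressions assume the `lspci -nn` line layout shown earlier in this thread.

```python
import re

# Line as printed by `kubectl rsu discover` earlier in this thread.
LSPCI_LINE = ("54:00.0 Processing accelerators [1200]: "
              "Intel Corporation Device [8086:0b30]")

def split_pci_identifiers(line):
    """Return (bus_address, vendor_device) from an `lspci -nn`-style line."""
    # Bus address: optional 4-hex-digit domain, then bus:device.function.
    addr = re.match(
        r"((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s", line)
    # Vendor:device pair: first bracketed "xxxx:xxxx" (the class code has no colon).
    ids = re.search(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]", line)
    if not addr or not ids:
        raise ValueError("not an lspci -nn style line")
    return addr.group(1), ids.group(1) + ":" + ids.group(2)

bus_address, vendor_device = split_pci_identifiers(LSPCI_LINE)
print("bus address:  ", bus_address)
print("vendor:device:", vendor_device)
```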
I got this working by reinstalling the worker node after programming the FPGA.
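To confirm whether the FPGA resource is being advertised, one option is to inspect the node's allocatable map. A minimal Python sketch using the allocatable output posted earlier in this thread (from before the fix) as sample data. The resource name intel.com/intel_fec_5g is an assumption on my part; the actual name is whatever the device plugin ConfigMap in your deployment defines.

```python
import json

# Allocatable map as posted earlier in this thread, before the fix.
ALLOCATABLE = json.loads("""
{
  "cpu": "46",
  "devices.kubevirt.io/kvm": "110",
  "devices.kubevirt.io/tun": "110",
  "devices.kubevirt.io/vhost-net": "110",
  "ephemeral-storage": "96589578081",
  "hugepages-1Gi": "20Gi",
  "intel.com/intel_sriov_netdevice": "12",
  "memory": "110455600Ki",
  "pods": "110"
}
""")

# ASSUMPTION: example resource name; check your device plugin ConfigMap.
FPGA_RESOURCE = "intel.com/intel_fec_5g"

def vendor_resources(allocatable, prefix="intel.com/"):
    """Return only the extended resources under the given vendor prefix."""
    return {k: v for k, v in allocatable.items() if k.startswith(prefix)}

def has_resource(allocatable, name):
    """True if the named extended resource is advertised with a count above zero."""
    return int(allocatable.get(name, "0")) > 0

print(vendor_resources(ALLOCATABLE))
print(has_resource(ALLOCATABLE, FPGA_RESOURCE))
```

With the pre-fix data only the SR-IOV netdevice resource is listed; once the device plugin picks up the configured VFs, the FPGA resource should appear next to it.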