OpenNESS (20.09.01) interfaceservice config for the OpenVINO samples
Hello,
I have built a network edge deployment (OpenNESS 20.09.01) cluster between two QEMU/KVM machines and have successfully tested the producer/consumer apps. The edge node (worker) is provisioned with two network interfaces (eth0 and eth1), with the idea of dedicating one to management/control and the second to data. The worker connects to the master via eth0, and eth1 is on a different subnet.
To route traffic from an external machine to the OpenVINO apps, the documentation suggests bridging the second (data) network interface on the edge node to the OVS, but I am unable to do so.
Specifically, a listing of interfaces on the worker node returns two kernel interfaces (as expected: eth0 and eth1 for control and data, respectively), but the output does not include the MAC addresses of the interfaces.
[root@controller ~]# kubectl interfaceservice get node01
Kernel interfaces:
0000:00:04.0 | | detached
0000:00:05.0 | | detached
Further, attaching the interface fails as follows:
[root@controller ~]# kubectl interfaceservice attach node01 0000:00:05.0
Error when executing command: [attach] err: rpc error: code = Unknown desc = ovs-vsctl: port name must not be empty string
: exit status 1
Is anything misconfigured on my Edge node? Please advise.
If it helps, below is a listing of all PCI devices on the edge/worker node, which shows the two Ethernet controllers.
[root@node01 ~]# lspci -Dmm
0000:00:00.0 "Host bridge" "Intel Corporation" "440FX - 82441FX PMC [Natoma]" -r02 "Red Hat, Inc." "Qemu virtual ma
chine"
0000:00:01.0 "ISA bridge" "Intel Corporation" "82371AB/EB/MB PIIX4 ISA" -r03 "" ""
0000:00:01.3 "Bridge" "Intel Corporation" "82371AB/EB/MB PIIX4 ACPI" -r03 "" ""
0000:00:03.0 "Non-VGA unclassified device" "Red Hat, Inc." "Virtio SCSI" "Red Hat, Inc." "Device 0008"
0000:00:04.0 "Ethernet controller" "Red Hat, Inc." "Virtio network device" "Red Hat, Inc." "Device 0001"
0000:00:05.0 "Ethernet controller" "Red Hat, Inc." "Virtio network device" "Red Hat, Inc." "Device 0001"
0000:00:06.0 "Unclassified device [00ff]" "Red Hat, Inc." "Virtio RNG" "Red Hat, Inc." "Device 0004"
Also, below is the status of all pods as seen from the master:
kube-system coredns-66bff467f8-q5nmx 1/1 Running 1 15h
kube-system descheduler-cronjob-1605500040-n7kdq 0/1 Completed 0 13h
kube-system descheduler-cronjob-1605500160-9jwbc 0/1 Completed 0 13h
kube-system descheduler-cronjob-1605500280-c2bkc 0/1 Completed 0 13h
kube-system etcd-controller 1/1 Running 1 15h
kube-system kube-apiserver-controller 1/1 Running 1 15h
kube-system kube-controller-manager-controller 1/1 Running 1 15h
kube-system kube-multus-ds-amd64-fl9hr 1/1 Running 1 15h
kube-system kube-multus-ds-amd64-xdxq5 1/1 Running 1 13h
kube-system kube-ovn-cni-f9bbd 1/1 Running 2 15h
kube-system kube-ovn-cni-lcbh5 1/1 Running 1 14h
kube-system kube-ovn-controller-75775847c8-8njqv 1/1 Running 1 15h
kube-system kube-ovn-controller-75775847c8-c7dlf 1/1 Running 1 15h
kube-system kube-proxy-hhv5b 1/1 Running 1 15h
kube-system kube-proxy-rwr8m 1/1 Running 1 14h
kube-system kube-scheduler-controller 1/1 Running 2 14h
kube-system ovn-central-7585cd4b5c-6qcgf 1/1 Running 1 15h
kube-system ovs-ovn-cmg8q 1/1 Running 1 15h
kube-system ovs-ovn-jlv9n 1/1 Running 1 14h
kubevirt virt-api-f94f8b959-tsxkt 1/1 Running 1 13h
kubevirt virt-api-f94f8b959-vwx2v 1/1 Running 1 13h
kubevirt virt-controller-64766f7cbf-hs6vq 1/1 Running 1 13h
kubevirt virt-controller-64766f7cbf-sjf4n 1/1 Running 1 13h
kubevirt virt-handler-hqqgw 1/1 Running 1 13h
kubevirt virt-operator-79c97797-k5k9x 1/1 Running 1 15h
kubevirt virt-operator-79c97797-xf7pv 1/1 Running 1 15h
openness docker-registry-deployment-54d5bb5c-dzvkl 1/1 Running 1 15h
openness eaa-5c87c49c9-bvlfq 1/1 Running 1 13h
openness edgedns-gh8dt 1/1 Running 1 13h
openness interfaceservice-dxb5n 1/1 Running 1 13h
openness nfd-release-node-feature-discovery-master-6cf7cf5f69-m9gbw 1/1 Running 1 14h
openness nfd-release-node-feature-discovery-worker-r6gxx 1/1 Running 1 14h
telemetry cadvisor-j9lcm 2/2 Running 2 14h
telemetry collectd-nd2r5 2/2 Running 2 14h
telemetry custom-metrics-apiserver-54699b845f-h7td4 1/1 Running 1 14h
telemetry grafana-6b79c984b-vjhx7 2/2 Running 2 14h
telemetry otel-collector-7d5b75bbdf-wpnmv 2/2 Running 2 14h
telemetry prometheus-node-exporter-njspv 1/1 Running 3 14h
telemetry prometheus-server-776b5f44f-v2fgh 3/3 Running 3 14h
telemetry telemetry-aware-scheduling-68467c4ccd-q2q7h 2/2 Running 2 14h
telemetry telemetry-collector-certs-dklhx 0/1 Completed 0 14h
telemetry telemetry-node-certs-xh9bv 1/1 Running 1 14h
Hi @kmaloor,
The interfaceservice reads the MAC addresses of the interfaces on the edge node from the system itself via the Go net package. Looking at the OVS documentation for ovs-vsctl, it also looks for the interface ID and MAC address of an interface when adding a new interface to an OVS bridge.
Could you please check whether both interfaces on the edge node have MAC addresses associated with them?
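For reference, the MAC lookup on the node side is essentially equivalent to the following minimal sketch using the standard Go net package (illustrative only, not the actual interfaceservice code):

package main

import (
	"fmt"
	"net"
)

func main() {
	ifaces, err := net.Interfaces()
	if err != nil {
		panic(err)
	}
	for _, iface := range ifaces {
		// HardwareAddr is the interface MAC; it is empty for interfaces
		// without one (e.g. loopback), which would leave the field blank
		// in the interfaceservice output.
		fmt.Printf("%s -> %s\n", iface.Name, iface.HardwareAddr)
	}
}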
Hi @cjnolan
Thank you for the response. It appears that the two NICs have valid MAC addresses assigned:
[root@node01 ~]# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1460
inet 10.128.0.14 netmask 255.255.255.255 broadcast 10.128.0.14
inet6 fe80::4001:aff:fe80:e prefixlen 64 scopeid 0x20<link>
ether 42:01:0a:80:00:0e txqueuelen 1000 (Ethernet)
RX packets 1561 bytes 1176344 (1.1 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1568 bytes 259782 (253.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@node01 ~]# ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1460
inet 10.1.128.2 netmask 255.255.255.255 broadcast 10.1.128.2
inet6 fe80::4001:aff:fe01:8002 prefixlen 64 scopeid 0x20<link>
ether 42:01:0a:01:80:02 txqueuelen 1000 (Ethernet)
RX packets 6 bytes 1310 (1.2 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 12 bytes 1512 (1.4 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@node01 ~]# cat /sys/class/net/eth0/address
42:01:0a:80:00:0e
[root@node01 ~]# cat /sys/class/net/eth1/address
42:01:0a:01:80:02
[root@node01 ~]#
It appears that the framework queries the machine's network interfaces through here: https://github.com/open-ness/edgenode/blob/f4d4f9afe37ba8d1f9d44e1e35fe57de2ddb4656/pkg/ela/helpers/network_interfaces.go#L103
Do you think something may be going wrong in this block? https://github.com/open-ness/edgenode/blob/f4d4f9afe37ba8d1f9d44e1e35fe57de2ddb4656/pkg/ela/helpers/network_interfaces.go#L116
Hi @cjnolan
Just a quick follow-up. By manually reading the uevent, as done here: https://github.com/open-ness/edgenode/blob/f4d4f9afe37ba8d1f9d44e1e35fe57de2ddb4656/pkg/ela/helpers/network_interfaces.go#L117, from inside the interfaceservice container, I see the following contents:
[root@controller ~]# kubectl exec -it -n openness interfaceservice-dxb5n /bin/sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
sh-4.2$ cat /var/host_net_devices/eth1/device/uevent
DRIVER=virtio_net
MODALIAS=virtio:d00000001v00001AF4
It appears that the interfaceservice code wants to extract the PCI address corresponding to each network interface (by interface name) so as to collate all of the metadata (including the MAC address) into devs []NetworkDevice. As the uevent excludes this information, the subsequent logic fails.
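To illustrate the failure mode (this is only a rough sketch of the kind of lookup involved, not the actual interfaceservice code), extracting a PCI address from the uevent amounts to searching for a PCI_SLOT_NAME entry, which a PCI-backed NIC exposes but the virtio uevent above does not:

package main

import (
	"fmt"
	"os"
	"strings"
)

// pciAddressFromUevent looks for a PCI_SLOT_NAME entry in the interface's
// device uevent. On bare metal the device directory is the PCI device and
// carries this key; with virtio the device directory is the virtio bus
// device (see the output above), so the result comes back empty.
func pciAddressFromUevent(iface string) (string, error) {
	content, err := os.ReadFile("/sys/class/net/" + iface + "/device/uevent")
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(string(content), "\n") {
		if strings.HasPrefix(line, "PCI_SLOT_NAME=") {
			return strings.TrimPrefix(line, "PCI_SLOT_NAME="), nil
		}
	}
	return "", nil
}

func main() {
	addr, err := pciAddressFromUevent("eth1")
	fmt.Println(addr, err)
}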
As such, I'm wondering if the interfaceservice code needs to use an alternative method to collect the PCI address. As an example, using the output of ethtool:
[root@node01 ~]# ethtool -i eth1
driver: virtio_net
version: 1.0.0
firmware-version:
expansion-rom-version:
bus-info: 0000:00:05.0
supports-statistics: no
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
If the above output were placed into the content variable in https://github.com/open-ness/edgenode/blob/f4d4f9afe37ba8d1f9d44e1e35fe57de2ddb4656/pkg/ela/helpers/network_interfaces.go#L119 then I reckon that the rest of the logic would work as expected.
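Alternatively, rather than shelling out to ethtool, the bus address could perhaps be recovered by resolving the /sys/class/net/<iface>/device symlink and scanning the resolved path for a PCI-address component; for a virtio NIC the symlink resolves under the PCI device directory (e.g. .../0000:00:05.0/virtio1). A rough sketch under that assumption (again, not the existing interfaceservice code):

package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// pciAddrRe matches a PCI address such as 0000:00:05.0.
var pciAddrRe = regexp.MustCompile(`^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]$`)

// pciAddressFromSysfs resolves /sys/class/net/<iface>/device and scans the
// resolved path for a PCI-address component. For a virtio NIC the PCI
// address is the parent directory of the virtioN node; for a bare-metal
// NIC it is the last path component.
func pciAddressFromSysfs(iface string) (string, error) {
	resolved, err := filepath.EvalSymlinks("/sys/class/net/" + iface + "/device")
	if err != nil {
		return "", err
	}
	parts := strings.Split(resolved, string(filepath.Separator))
	for i := len(parts) - 1; i >= 0; i-- {
		if pciAddrRe.MatchString(parts[i]) {
			return parts[i], nil
		}
	}
	return "", fmt.Errorf("no PCI address found in %s", resolved)
}

func main() {
	addr, err := pciAddressFromSysfs("eth1")
	fmt.Println(addr, err)
}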
Hi @kmaloor
The cause of the failure with the interfaceservice is that the uevent is not populated with the PCI addresses of the network interfaces. This appears to be because the edge node is deployed on a virtual machine with two virtual network interfaces. This configuration of the edge node on a virtual machine was not tested with OpenNESS; however, the interfaceservice has been tested and validated with the edge node on bare metal.
Hi @cjnolan
Thank you for your response. Yes, my edge node has two (virtio) Ethernet controllers with distinct PCI addresses:
[root@node01 ~]# lshw -c network
*-network:0
description: Ethernet controller
product: Virtio network device
vendor: Red Hat, Inc.
physical id: 4
bus info: pci@0000:00:04.0
version: 00
width: 32 bits
clock: 33MHz
capabilities: msix bus_master cap_list
configuration: driver=virtio-pci latency=0
resources: irq:10 ioport:c040(size=64) memory:c0001000-c000107f
*-virtio1
description: Ethernet interface
physical id: 0
bus info: virtio@1
logical name: eth0
serial: 42:01:0a:80:00:0e
capabilities: ethernet physical
configuration: broadcast=yes driver=virtio_net driverversion=1.0.0 ip=10.128.0.14 link=yes multicast=yes
*-network:1
description: Ethernet controller
product: Virtio network device
vendor: Red Hat, Inc.
physical id: 5
bus info: pci@0000:00:05.0
version: 00
width: 32 bits
clock: 33MHz
capabilities: msix bus_master cap_list
configuration: driver=virtio-pci latency=0
resources: irq:10 ioport:c000(size=64) memory:c0000000-c000007f
*-virtio2
description: Ethernet interface
physical id: 0
bus info: virtio@2
logical name: eth1
serial: 42:01:0a:01:80:02
capabilities: ethernet physical
configuration: broadcast=yes driver=virtio_net driverversion=1.0.0 ip=10.1.128.2 link=yes multicast=yes
...
I am currently testing an OpenNESS edge deployment cluster on Google Cloud. I understand that the interfaceservice may work as expected when run on bare metal.
I was wondering whether there is any scope for altering the code so that it works in both environments? There could be use cases where an OpenNESS cluster is deployed over virtualized infrastructure at the edge.
Hi @kmaloor Yes, we can look to alter the interfaceservice to work with deployments on virtualized infrastructure as part of a future OpenNESS release.