k8s_ros_sample
K8S ROS2 multicluster communication / UDP multicast
I'm working on a project that aims to share ROS2 on multiple Kubernetes clusters.
The idea is that every robot is an independent Kubernetes cluster, and using Liqo and other tools from the Fluidos project, we can offload ROS2 robotic loads to another robot.
I have a full simulation of Robotnik RB-Theron with Gazebo in ROS2, including all manifests, web RViz, and Gazebo implementations. Feel free to try. robotnik-logistic-use-case
The problem I have is that I'm using Calico and Flannel as CNI, and if I deploy everything on the same Kubernetes node, everything works fine. But it doesn't work when there are multiple K8s nodes; I think it's a UDP multicast problem. I will have more problems with a multicluster offload.
As @fujitatomoya pointed out, WeaveNet is one of the few CNIs that allows UDP multicast, but the open-source branch is not maintained anymore.
I contacted the Zenoh team, and we tried to use the Zenoh router to use a centralized topology on that simulation. It works with simple publishers and nodes, but the /tf of the simulation and other topics do not work properly at the moment.
Has anyone tried any other options that do not require multicast with Gazebo/Webots/real-robot?
@ggari-robotnik thanks for posting the issue.
I have a full simulation of Robotnik RB-Theron with Gazebo in ROS2, including all manifests, web RViz, and Gazebo implementations. Feel free to try. robotnik-logistic-use-case
This is really helpful for users who want to instantiate the RViz and Gazebo containers with k8s 👍 I also want to add a "How to start Gazebo" tutorial to this doc, since that is probably one of the most common use cases: https://github.com/fujitatomoya/ros_k8s/blob/master/docs/ROS2_Deployment_Intermediate.md#ros-2-gui-display-access
The problem I have is that I'm using Calico and Flannel as CNI, and if I deploy everything on the same Kubernetes node, everything works fine. But it doesn't work when there are multiple K8s nodes; I think it's a UDP multicast problem. I will have more problems with a multicluster offload.
Yes, this is true.
See https://github.com/fujitatomoya/ros_k8s/blob/master/docs/Setup_Kubernetes_Cluster.md#container-network-interface-cni; this is a well-known issue for ROS 2 and DDS multicast discovery.
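To check whether UDP multicast actually crosses nodes with a given CNI, a small probe can help before bringing ROS 2 into the picture. The following is only a minimal sketch: `239.255.0.1` is the default DDS/RTPS discovery multicast group, while the port `17400`, the payload, and the function names are arbitrary choices for this test and are not part of ROS 2 or any CNI.

```python
import socket
import struct

GROUP = "239.255.0.1"  # default DDS/RTPS discovery multicast group
PORT = 17400           # arbitrary test port (assumption), chosen to avoid the DDS ports


def membership_request(group: str) -> bytes:
    """Build the ip_mreq structure: group address + INADDR_ANY as local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def receive_probe(group: str = GROUP, port: int = PORT, timeout: float = 5.0):
    """Join the group and wait for one datagram; return (data, sender) or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
        sock.settimeout(timeout)
        return sock.recvfrom(1024)
    except socket.timeout:
        return None
    finally:
        sock.close()


def send_probe(message: bytes = b"ros2-multicast-probe", group: str = GROUP,
               port: int = PORT, ttl: int = 4) -> None:
    """Send one datagram to the multicast group with the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(message, (group, port))
    finally:
        sock.close()
```

Call `receive_probe()` in a pod on one node and `send_probe()` in a pod on another node. If the receiver returns `None` across nodes but gets the payload when both pods share a node, that matches the behavior described above.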
WeaveNet is one of the few CNIs that allows UDP multicast, but the open-source branch is not maintained anymore.
There have been many discussions about this E.O.L. over time, but unfortunately it has now happened.
My hope and motivation is to support https://github.com/cilium/cilium/issues/13239, given all the technical advantages Cilium provides as a CNI (eBPF, Tetragon, service mesh, and security policy).
I contacted the Zenoh team, and we tried to use the Zenoh router to use a centralized topology on that simulation.
I will probably add a tutorial covering https://docs.ros.org/en/rolling/Tutorials/Advanced/Discovery-Server/Discovery-Server.html. This is a common configuration for ROS 2, especially when trying to avoid discovery multicast packets or when the physical environment does not support multicast.
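As a rough illustration of the Discovery Server approach: with Fast DDS as the RMW, ROS 2 nodes are pointed at the server through the `ROS_DISCOVERY_SERVER` environment variable instead of relying on multicast. The sketch below only builds that environment for a child process. The helper name and the `discovery-server` hostname (imagined here as a Kubernetes Service in front of the server pod) are hypothetical; `11811` is the default Fast DDS discovery server port.

```python
import os


def discovery_server_env(host: str, port: int = 11811, base_env=None) -> dict:
    """Return a copy of the environment with ROS_DISCOVERY_SERVER set to host:port.

    host is hypothetical here, e.g. a Kubernetes Service name that resolves
    to the pod running the Fast DDS discovery server.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ROS_DISCOVERY_SERVER"] = f"{host}:{port}"
    return env


# Example (not executed here): launch a node that uses the server for discovery.
# import subprocess
# subprocess.run(["ros2", "run", "demo_nodes_cpp", "talker"],
#                env=discovery_server_env("discovery-server"))
```

In a Kubernetes manifest the same variable would normally be set directly in the container's `env` section; the Python form is just a compact way to show what each node needs.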
@ggari-robotnik good news: Cilium supports multicast now. See the verified doc here: https://github.com/fujitatomoya/ros_k8s/blob/master/docs/Setup_Kubernetes_Cluster.md#enable-multicast-wip
I think we can now close this issue; feel free to reopen if anything is missing.