ci: add test for rootful docker
I am finding in testing that networking between hosts does not work when we are running in rootful mode. I was testing this because using nvidia devices does work with rootful, but once I got to the step of needing pods to communicate, there was no communication.
I am not sure about the error, but this test should reproduce it in CI. Note that to enable this we use the docker-rootful template provided by lima (@AkihiroSuda you have thought of all things)! The main changes here are to add this test to the matrix, and ensure that in the different install scripts, we largely do nothing if the container runtime is docker-rootful.
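A minimal sketch of the kind of guard described above, assuming the runtime is passed to the install scripts through an environment variable. The variable name `CONTAINER_ENGINE` and the function name `needs_rootless_setup` are assumptions for illustration and may not match what the scripts in this PR actually use.

```shell
#!/bin/sh
# Hypothetical guard for an install script: skip rootless-only setup when
# the runtime is docker-rootful. CONTAINER_ENGINE is an assumed variable name.
needs_rootless_setup() {
    case "${CONTAINER_ENGINE:-docker}" in
        docker-rootful) return 1 ;;  # rootful: nothing rootless-specific to do
        *) return 0 ;;               # any other runtime: run the setup
    esac
}

if needs_rootless_setup; then
    echo "running rootless setup for ${CONTAINER_ENGINE:-docker}"
else
    echo "skipping rootless setup for docker-rootful"
fi
```

Keeping the check in one function means each install script can call it early and return before touching any rootless configuration.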
Related to #365 but does not fix it, only demonstrates it.
Note that I've seen two variants of this error - either an operation timeout (the result here):
Or that the address is not reachable / bad (what I've seen in production and my researchapps testing CI):
Thanks, I confirmed that this issue happens on my local machines too, but I haven't identified the cause.
Tested with Docker v28 and v27.5.1, on Ubuntu 24.04.1 (ARM64).
I think it was working in the past?
ICMP and DNS still seem to work, but TCP across the nodes seems broken?
VXLAN packets are apparently sent and received on each of the VMs, though. (Run tcpdump udp).
Apparently, the receiver VM seems to refuse to route the VXLAN packets to the usernetes-node-1 container where kubelet, flannel, etc. are running.
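For anyone reproducing the capture, a narrower filter than plain `udp` may help. This is a sketch that only prints the tcpdump invocation. It assumes flannel's default Linux VXLAN port (8472/udp), which this thread does not confirm for the setup under test; `FLANNEL_PORT` and `vxlan_capture_cmd` are illustrative names.

```shell
#!/bin/sh
# Sketch: print a tcpdump command line that captures flannel VXLAN traffic.
# Assumption: flannel uses its default Linux VXLAN port, 8472/udp.
FLANNEL_PORT="${FLANNEL_PORT:-8472}"

vxlan_capture_cmd() {
    iface="${1:-any}"  # interface to capture on; "any" captures on all
    echo "tcpdump -n -i ${iface} udp port ${FLANNEL_PORT}"
}

vxlan_capture_cmd eth0
```

Running the printed command as root on both VMs, and again inside the node container, should show where the encapsulated packets stop arriving.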
Found a workaround: execute ethtool --offload eth0 tx-checksum-ip-generic off in the usernetes-node-1 container
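One way to apply the workaround from the host is via `docker exec`, sketched below. The container name comes from this thread; `NODE_CONTAINER` and `DRY_RUN` are illustrative variables, and the script defaults to printing the command rather than running it.

```shell
#!/bin/sh
# Sketch: run the ethtool workaround inside the node container from the host.
# NODE_CONTAINER and DRY_RUN are illustrative; DRY_RUN=1 (default) only prints.
NODE_CONTAINER="${NODE_CONTAINER:-usernetes-node-1}"

disable_tx_checksum_offload() {
    # Build the command once so the dry-run output matches what would run.
    set -- docker exec "$NODE_CONTAINER" \
        ethtool --offload eth0 tx-checksum-ip-generic off
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_tx_checksum_offload
```

Set `DRY_RUN=0` to actually run it. The setting is presumably lost when the container is recreated, so it would need to be reapplied after each restart of the node.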
Any eyes needed here from the Moby networking folks? (I know they're pretty busy currently, but if it's useful I can try asking them if they have time to spare to give it eyes)
@AkihiroSuda do you remember the last time you tested with it working? In recent memory we had updates to flannel, the underlying kind node (Kubernetes version), and (for me) at some point last year the additional make sync-external-ip step was added. If we can reproduce a previously working version, comparing against it could be a good debugging strategy.
oh wow, this is really interesting!
Not sure if this is expected, but this looks to be a warning in the failed nerdctl setup:
Warning: [WARNING] buildkitd has access to images in "buildkit" namespace by default. If you want to give buildkitd access to the images in "default" namespace, run this command with CONTAINERD_NAMESPACE=default
The ethtool --offload eth0 tx-checksum-ip-generic off rule can probably be appended here:
https://github.com/rootless-containers/usernetes/blob/b259da818f84fe33fe9ea32c71c9ea7317d467cc/Dockerfile.d/etc_udev_rules.d_90-flannel.rules#L1-L5
It is still unclear why this is needed only for rootful, though.
> Any eyes needed here from the Moby networking folks? (I know they're pretty busy currently, but if it's useful I can try asking them if they have time to spare to give it eyes)
Thanks, that would be appreciated.
> Warning: [WARNING] buildkitd has access to images in "buildkit" namespace by default. If you want to give buildkitd access to the images in "default" namespace, run this command with CONTAINERD_NAMESPACE=default
Irrelevant to the topic. Should be fixed though.
@vsoch Do you plan to continue this?
I would like to - from this comment: https://github.com/rootless-containers/usernetes/pull/366#issuecomment-2681540363 I thought we were waiting for feedback from the Moby networking folks. Is the next step to try adding that line ethtool --offload eth0 tx-checksum-ip-generic off to the flannel rules?
> Is the next step to try adding that line ethtool --offload eth0 tx-checksum-ip-generic off to the flannel rules?
Yes (when running in rootful), and let's call it a day
/cc @robmry @akerouanton
Sounds good - I'll make some time in the next few days. It's after 1am here so I need to be off to sleep, but this is on my todo. Thanks for the ping @AkihiroSuda.
Access from outside a host to container addresses inside bridge networks got blocked in moby 28.0, is that the issue? https://www.docker.com/blog/docker-engine-28-hardening-container-networking-by-default/
If running dockerd with env var DOCKER_INSECURE_NO_IPTABLES_RAW=1 makes it work - that's the issue. Either way, I'd like to know more about what the network looks like - is it direct routing between container addresses, or do you have an overlay network in there?
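One way to test this on a systemd-managed Docker install is a drop-in for the service unit, sketched below. It assumes dockerd runs as `docker.service`; `DROPIN_DIR` and the file name `no-iptables-raw.conf` are illustrative, and the script writes to a scratch directory unless `DROPIN_DIR` is set.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that sets DOCKER_INSECURE_NO_IPTABLES_RAW=1
# for dockerd. Assumes Docker is managed by systemd as docker.service.
# DROPIN_DIR is illustrative and defaults to a temporary directory.
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)}"

write_dropin() {
    mkdir -p "$DROPIN_DIR"
    printf '[Service]\nEnvironment="DOCKER_INSECURE_NO_IPTABLES_RAW=1"\n' \
        > "$DROPIN_DIR/no-iptables-raw.conf"
}

write_dropin
cat "$DROPIN_DIR/no-iptables-raw.conf"
```

To apply it for real, run it as root with `DROPIN_DIR=/etc/systemd/system/docker.service.d`, then run `systemctl daemon-reload` and `systemctl restart docker`. Remove the drop-in after the experiment, since the variable disables a hardening measure.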
@AkihiroSuda I tried both approaches suggested above, but the issue persists. I left both commits / changes for feedback. Let me know what I should try next.
@AkihiroSuda do you have another suggestion for what to try here? We'd like to try rootful soon - we have some overhead running rootless and want to test if running with rootful removes it (and then we could deduce it's something about user space).