kubespray
Cilium doesn't work on Ubuntu 22.04
Environment:
- Cloud provider or hardware configuration: Bare metal
- OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
  Linux 5.15.0-40-generic x86_64
  PRETTY_NAME="Ubuntu 22.04 LTS"
  NAME="Ubuntu"
  VERSION_ID="22.04"
  VERSION="22.04 LTS (Jammy Jellyfish)"
  VERSION_CODENAME=jammy
  ID=ubuntu
  ID_LIKE=debian
  HOME_URL="https://www.ubuntu.com/"
  SUPPORT_URL="https://help.ubuntu.com/"
  BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
  PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
  UBUNTU_CODENAME=jammy
- Version of Ansible (ansible --version): 2.10.15
- Version of Python (python --version): 3.9.13
- Kubespray version (commit) (git rev-parse --short HEAD): 58e9324
- Network plugin used: Cilium
In Ubuntu 22.04 we should disable rp_filter to make Cilium work.
https://github.com/cilium/cilium/issues/18131#issuecomment-988160016
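To see what a node is actually running with, something like the following may help. It is only a sketch: the `show_rp_filter` helper is a hypothetical name, and it assumes the standard procfs layout under /proc/sys/net/ipv4/conf. The kernel documents the values as 0 (no source validation), 1 (strict) and 2 (loose).

```shell
# Print "<interface> <rp_filter value>" for every interface directory.
# The root directory is a parameter (hypothetical helper) so it can be
# pointed at a copy of the tree instead of the live /proc.
show_rp_filter() {
    root="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$root"/*/rp_filter; do
        # Skip the unexpanded glob or unreadable entries.
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$iface" "$(cat "$f")"
    done
}
```

Running `show_rp_filter` with no argument on an affected node should list `all`, `default` and each interface, which makes it easy to spot Cilium interfaces that are not at 0.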
@maxpain Thanks for submitting this issue. Could you provide the actual error message from Kubespray so we know which task failed on Ubuntu 22.04?
Probably related to how Ubuntu has the setting in two different places: https://github.com/cilium/cilium/issues/20125#issuecomment-1185176384
Cilium just will not work with default Ubuntu settings (and one issue mentioned RHEL 9). If you just dump a config in the usual place to override the values, it's not enough.
This is actually fixed in the latest Cilium pre-release v1.12.0-rc3
This doesn't affect me and I don't use Kubespray at all, just aiming to be a friendly neighborhood nerd here.
This is actually https://github.com/cilium/cilium/pull/20072 in the latest Cilium pre-release v1.12.0-rc3
I see nothing in the cilium repo that patches /usr/lib/sysctl.d, which is necessary to "fix" this on Ubuntu 22.04 or newer if the desired state is to override the defaults.
@protosam see https://github.com/cilium/cilium/pull/20072/files#diff-1cadee1ea10bb25d793baf555b85040a00ff0bc7f049a2542f0c2590ab4e7f0fR39-R45
Not sure if this is enough to "fix" the problem. What I see is as follows:
- sysctlConfig is a variable containing what will become file contents.
- Namely, whatever the value of overwritesPath is, that file will have the contents of sysctlConfig.
- By default it is a single file that will resolve to become /etc/sysctl.d/99-zzz-override_cilium.conf, because path.Join(*sysctlD, *ciliumOverwrites).
So /etc/sysctl.d/99-zzz-override_cilium.conf ends up with the following contents:
# Disable rp_filter on Cilium interfaces since it may cause mangled packets to be dropped
net.ipv4.conf.lxc*.rp_filter = 0
net.ipv4.conf.cilium_*.rp_filter = 0
# The kernel uses max(conf.all, conf.{dev}) as its value, so we need to set .all. to 0 as well.
# Otherwise it will overrule the device specific settings.
net.ipv4.conf.all.rp_filter = 0
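The max(conf.all, conf.{dev}) rule in the comment above can be illustrated with a tiny sketch. The `effective_rp_filter` function is a hypothetical helper, not anything Cilium or the kernel ships; it just mirrors the documented rule.

```shell
# Effective rp_filter for one interface: the larger of the "all" value
# and the per-device value (hypothetical helper for illustration only).
effective_rp_filter() {
    all_value=$1
    dev_value=$2
    if [ "$all_value" -gt "$dev_value" ]; then
        echo "$all_value"
    else
        echo "$dev_value"
    fi
}

effective_rp_filter 2 0   # "all" is 2, lxc device is 0: prints 2
effective_rp_filter 0 0   # both are 0: prints 0
```

This is why setting only the lxc* and cilium_* entries to 0 is not sufficient while conf.all stays at 1 or 2.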
In my testing, the contents of /usr/lib/sysctl.d/50-default.conf seem to win out over any modifications made in /etc/sysctl.d/*.conf.
Thanks, Canonical:
root@localhost:~# grep rp_fi /usr/lib/sysctl.d/50-default.conf
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.*.rp_filter = 2
#-net.ipv4.conf.all.rp_filter # Ubuntu uses /etc/sysctl.d/10-network-security.conf
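For anyone who wants to repeat this check across both directories, here is a rough sketch. `list_rp_filter_entries` is a hypothetical helper; it assumes the sysctl.d(5) behaviour of processing files in lexicographic order of their file names regardless of directory, and it does not model same-named files in /etc replacing those in /usr/lib, nor /run/sysctl.d.

```shell
# List active (non-comment) rp_filter lines from the given sysctl.d
# directories, ordered by file name rather than by directory.
list_rp_filter_entries() {
    for dir in "$@"; do
        for f in "$dir"/*.conf; do
            [ -f "$f" ] || continue
            # Emit "basename<TAB>full path" so sorting is by file name first.
            printf '%s\t%s\n' "$(basename "$f")" "$f"
        done
    done | LC_ALL=C sort | while IFS="$(printf '\t')" read -r name path; do
        # Drop comment lines (starting with # or ;), keep rp_filter settings.
        grep -v '^[[:space:]]*[#;]' "$path" | grep 'rp_filter' |
            while IFS= read -r line; do
                printf '%s: %s\n' "$path" "$line"
            done
    done
}
```

Calling `list_rp_filter_entries /usr/lib/sysctl.d /etc/sysctl.d` shows every file that touches rp_filter in the order the names sort, which helps when comparing against the runtime values.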
In my own solutions, I go so far as to purge 50-default.conf of rp_filter entries.
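One way such a purge could look is sketched below. This is an assumption about the approach, not the exact commands used here; `strip_rp_filter` is a hypothetical helper that only writes the filtered file to stdout and never edits anything in place.

```shell
# Print the given sysctl config with every line mentioning rp_filter removed
# (hypothetical helper; this includes commented-out rp_filter lines).
strip_rp_filter() {
    sed '/rp_filter/d' "$1"
}

# Hypothetical usage: write a filtered copy, review it, then decide
# whether to replace the original.
#   strip_rp_filter /usr/lib/sysctl.d/50-default.conf > /tmp/50-default.conf
```

Since files under /usr/lib are normally owned by a package, a package upgrade may restore the original entries, so this would likely need to be reapplied.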
This has been fixed in Cilium >= 1.9.18, >= 1.10.13 and >= 1.11.7. Thank you.
It looks like /roles/network_plugin/cilium/templates/cilium/ds.yml.j2 is out of sync with https://github.com/cilium/cilium/blob/v1.11.7/install/kubernetes/cilium/templates/cilium-agent/daemonset.yaml. As a result, the apply-sysctl-overwrites init container is missing, and the sysctl fix that is part of Cilium v1.11.7 is not applied at all.
There are other discrepancies between the two files that are not relevant to this issue.
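A quick way to confirm whether a deployed cluster has the init container is sketched below. The `has_init_container` helper is hypothetical, and the kube-system namespace and the DaemonSet name cilium in the usage comment are assumptions about the deployment, so adjust them as needed.

```shell
# Read a space-separated list of container names on stdin and succeed
# only if the wanted name is among them (hypothetical helper).
has_init_container() {
    wanted=$1
    tr ' ' '\n' | grep -qx -- "$wanted"
}

# Hypothetical usage against a live cluster (namespace and DaemonSet
# name are assumptions):
#   kubectl -n kube-system get daemonset cilium \
#     -o jsonpath='{.spec.template.spec.initContainers[*].name}' \
#     | has_init_container apply-sysctl-overwrites \
#     && echo present || echo missing
```

If this reports the container as missing on a cluster running Cilium v1.11.7 or later, the rendered template is the likely culprit rather than Cilium itself.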
You're correct. Cilium manifests change rapidly over a short period. So I'll update Cilium to v1.12 with #9187 (hopefully without breaking backward compatibility) and will also try to look into these.