piraeus-operator
satellite DaemonSet fails to create any of the required Pods due to missing Service Account
This is a fresh install of Piraeus Operator 2.8.0 on MicroK8s 1.31.5 running on Ubuntu 24.04.
Apart from the satellite DaemonSet, all Pods are coming up as expected:
piraeus-datastore linstor-controller-78b969876c-qtf58 1/1 Running 0 3m10s 10.1.19.29 k8s-13
The events show that the DaemonSets fail because of a missing Service Account:
2m25s (x16 over 5m9s) Warning FailedCreate DaemonSet/linstor-satellite.k8s-03 Error creating: pods "linstor-satellite.k8s-03-" is forbidden: error looking up service account piraeus-datastore/satellite: serviceaccount "satellite" not found
2m24s (x16 over 5m8s) Warning FailedCreate DaemonSet/linstor-satellite.k8s-11 Error creating: pods "linstor-satellite.k8s-11-" is forbidden: error looking up service account piraeus-datastore/satellite: serviceaccount "satellite" not found
2m24s (x16 over 5m8s) Warning FailedCreate DaemonSet/linstor-satellite.k8s-12 Error creating: pods "linstor-satellite.k8s-12-" is forbidden: error looking up service account piraeus-datastore/satellite: serviceaccount "satellite" not found
2m22s (x16 over 5m7s) Warning FailedCreate DaemonSet/linstor-satellite.k8s-13 Error creating: pods "linstor-satellite.k8s-13-" is forbidden: error looking up service account piraeus-datastore/satellite: serviceaccount "satellite" not found
2m21s (x16 over 5m6s) Warning FailedCreate DaemonSet/linstor-satellite.k8s-01 Error creating: pods "linstor-satellite.k8s-01-" is forbidden: error looking up service account piraeus-datastore/satellite: serviceaccount "satellite" not found
2m21s (x16 over 5m5s) Warning FailedCreate DaemonSet/linstor-satellite.k8s-02 Error creating: pods "linstor-satellite.k8s-02-" is forbidden: error looking up service account piraeus-datastore/satellite: serviceaccount "satellite" not found
Looking at the created Service Accounts in the associated namespace shows that it indeed was not created:
kubectl get serviceaccount -n piraeus-datastore
NAME                                  SECRETS   AGE
default                               0         4h1m
linstor-controller                    0         5m56s
linstor-csi-controller                0         5m55s
piraeus-operator-controller-manager   0         4h1m
piraeus-operator-gencert              0         4h1m
Deleting the Operator and deploying it again does not resolve the issue; the Service Account is still not created.
It looks like the Operator could not complete the full reconciliation of the LinstorCluster resource. Can you check the .status of the LinstorCluster resource?
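A minimal sketch of one way to look at that status, assuming `kubectl` is pointed at the affected cluster. The function name `show_linstorcluster_status` is hypothetical; `linstorclusters.piraeus.io` is the resource type installed by Piraeus Operator v2. The `.status` section appears near the end of each printed object.

```shell
#!/bin/sh
# Hypothetical helper: dump every LinstorCluster resource as YAML so that
# its .status section can be inspected for reconciliation errors.
# The resource is cluster-scoped, so no namespace flag is passed.
show_linstorcluster_status() {
    kubectl get linstorclusters.piraeus.io -o yaml
}

# Usage (on a machine with access to the cluster):
#   show_linstorcluster_status | less
```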
Hello @WanzenBug, and thank you so much for your prompt reply. Apologies for the delay in responding; I ran into some unrelated issues that made me put this on the back burner for a while.
I traced the failure to a copy/paste issue with the required patches for the LinstorCluster definition when running on MicroK8s. After correcting the typo and resolving a separate issue with one of the nodes not having the correct "kubelet" link inside "/var/lib", things are looking much better.
On the Ubuntu 24.04.2 LTS x64-based nodes, all pods have come up successfully.
However, on the Debian Bookworm arm64-based nodes, the linstor-satellite pods fail to come up because the drbd-module-loader container fails during what appears to be the "make" of the required kernel modules:
Running kubectl logs -n piraeus-datastore -c drbd-module-loader linstor-satellite.k8s-11-6ltw2 outputs the following:
Need a git checkout to regenerate drbd/.drbd_git_revision
make[1]: Entering directory '/tmp/pkg/drbd-9.2.12/drbd'
Calling toplevel makefile of kernel source tree, which I believe is in
KDIR=/lib/modules/6.6.62+rpt-rpi-v8/build
make -C /lib/modules/6.6.62+rpt-rpi-v8/build "PRE_CFLAGS=" M=/tmp/pkg/drbd-9.2.12/drbd obj-m=dummy-for-compat.o dummy-for-compat-h.o
/usr/src/linux-headers-6.6.62+rpt-common-rpi/Makefile:1032: /usr/src/linux-headers-6.6.62+rpt-common-rpi/scripts/Makefile.extrawarn: No such file or directory
make[2]: *** No rule to make target '/usr/src/linux-headers-6.6.62+rpt-common-rpi/scripts/Makefile.extrawarn'. Stop.
make[1]: Leaving directory '/tmp/pkg/drbd-9.2.12/drbd'
make[1]: *** [Makefile:236: compat.h] Error 2
make: *** [Makefile:131: module] Error 2
Could not find the expexted *.ko, see stderr for more details
However, checking the arm64-based nodes shows that the file reported as missing, /usr/src/linux-headers-6.6.62+rpt-common-rpi/scripts/Makefile.extrawarn, is present on the host filesystem.
Any idea on what might be going on here that can point me in the right direction?
The issue is probably that we only mount /usr/src from the host. I think on Debian systems some scripts are moved into a separate directory and only symlinked into /usr/src. What's the output of:
readlink -f /usr/src/linux-headers-6.6.62+rpt-common-rpi/scripts/Makefile.extrawarn
That seems to be the case indeed. This is the output:
@k8s-11:~$ readlink -f /usr/src/linux-headers-6.6.62+rpt-common-rpi/scripts/Makefile.extrawarn
/usr/lib/linux-kbuild-6.6.62+rpt/scripts/Makefile.extrawarn
And these are the contents of /usr/src:
@k8s-11:~$ ls -las /usr/src/
total 24
4 drwxr-xr-x 6 root root 4096 Feb 28 15:51 .
4 drwxr-xr-x 11 root root 4096 Mar 15 2024 ..
4 drwxr-xr-x 4 root root 4096 Nov 28 23:05 linux-headers-6.6.62+rpt-common-rpi
4 drwxr-xr-x 4 root root 4096 Nov 28 23:05 linux-headers-6.6.62+rpt-rpi-v8
4 drwxr-xr-x 4 root root 4096 Feb 28 15:49 linux-headers-6.6.74+rpt-common-rpi
4 drwxr-xr-x 4 root root 4096 Feb 28 15:49 linux-headers-6.6.74+rpt-rpi-v8
0 lrwxrwxrwx 1 root root 30 Nov 25 15:28 linux-kbuild-6.6.62+rpt -> ../lib/linux-kbuild-6.6.62+rpt
0 lrwxrwxrwx 1 root root 30 Jan 27 17:19 linux-kbuild-6.6.74+rpt -> ../lib/linux-kbuild-6.6.74+rpt
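The output above suggests that the kernel header tree relies on symlinks that resolve into /usr/lib, which a container that only mounts /usr/src cannot follow. A minimal sketch for listing such links on a node, assuming GNU `readlink -f` and `find` are available; the function name `find_escaping_links` is hypothetical and not part of the operator:

```shell
#!/bin/sh
# Hypothetical helper: print every symlink below <search_dir> whose fully
# resolved target lies outside <mounted_root>. Such links dangle inside a
# container that only has <mounted_root> mounted from the host.
find_escaping_links() {
    search_dir=$1
    root=$(readlink -f "$2") || return 1
    find "$search_dir" -type l | while IFS= read -r link; do
        # Skip links that cannot be resolved at all on the host.
        target=$(readlink -f "$link") || continue
        case "$target" in
            "$root"|"$root"/*) ;;  # target stays inside the mounted root
            *) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}

# Usage on the affected node, with the paths from this thread:
#   find_escaping_links /usr/src/linux-headers-6.6.62+rpt-common-rpi /usr/src
```

Any line printed by the helper names a path that the module build could fail to open from inside the container, even though it exists on the host.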
🤔 I'm wondering if it would be simpler to install the drbd-dkms package directly on the host.
Directly from the http://packages.linbit.com/public/ repo?
Yeah. You might need to pretend to be "proxmox-8" because the public bookworm repos do not include the dkms package.
Thanks for the help, that indeed worked as expected.
I'm wondering whether there are any plans to fix the compilation issue caused by how Bookworm organizes /usr/src, or whether installing drbd-dkms on the host will be the way to get this working on Bookworm for now.
The issue is notably not with the normal upstream bookworm (which works just fine), but with the Raspbian variant, which has a custom kernel. For the normal bookworm image, we already install all the linux-kbuild-* packages, so we have the necessary files in /usr/src. But that does not work in this case.
Ideas on how to address this are welcome 😄