Retina agent pods crash looping after node upgrade (AKS)
Describe the bug After a node upgrade to AKSUbuntu-2004gen2fipscontainerd-202503.02.0 my retina agents are crash looping.
To Reproduce Steps to reproduce the behavior:
- Create a private AKS cluster (v1.30.4) with Azure monitor enabled
- Ensure your system node pool uses the AKSUbuntu-2004gen2fipscontainerd-202503.02.0 node image version
- Observe the retina agent pods crash looping.
Expected behavior The retina service to run as expected.
Screenshots
Platform (please complete the following information):
- OS: AKSUbuntu-2004gen2fipscontainerd-202503.02.0
- Kubernetes Version: [e.g. 1.30.4]
- Host: AKS
- Retina Version: v0.0.25
Additional context Add any other context about the problem here.
this might be a kernel issue, the kprobe is trying attach to a non-existence kernel func i think?
May I know when will fix this issue, we met the same issue for AKSUbuntu-2004gen2fipscontainerd-202503.13.0"
@matthew-duval Is this arm64 or amd64? Can you provide the kernel version too?
Amd64. Here is screenshot of the node system info: