cloudnative-pg

[Bug]: Operator restarting due to DetectAvailableArchitectures()

Open sxd opened this issue 1 year ago • 0 comments

Is there an existing issue already for this bug?

  • [X] I have searched for an existing issue, and could not find anything. I believe this is a new bug.

I have read the troubleshooting guide

  • [X] I have read the troubleshooting guide and I think this is a new bug.

I am running a supported version of CloudNativePG

  • [X] I have read the troubleshooting guide and I think this is a new bug.

Contact Details

No response

Version

1.23.0

What version of Kubernetes are you using?

1.30 (unsupported)

What is your Kubernetes environment?

Self-managed: kind (evaluation)

How did you install the operator?

YAML manifest

What happened?

This issue seems to be present in general, but so far I’ve only been able to reproduce it on certain cloud providers (mainly EKS, but also AKS).

What’s happening is that utils.DetectAvailableArchitectures() slows down RunController() enough that the 10-second ReadinessProbePeriod is no longer respected, so the pod gets killed by the kubelet and the operator restarts.

DetectAvailableArchitectures() should be calculating each architecture’s sha256 hash asynchronously, so it shouldn’t block the startup of the manager. More investigation is needed to understand whether the function is not working properly or whether we are just hitting the timeout.
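
For reference, the non-blocking behaviour expected above can be sketched roughly as follows. This is only an illustration of the pattern, not the CloudNativePG implementation: `archHash`, `startHashing` and `detectArchitectures` are hypothetical names, and in-memory byte slices stand in for the per-architecture binaries.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// archHash holds the sha256 of one (hypothetical) per-architecture binary.
// The hash is computed in the background; done is closed once it is ready.
type archHash struct {
	done chan struct{}
	hash string
}

// startHashing launches the hash computation in a goroutine and returns
// immediately, so the caller is not blocked while the digest is calculated.
func startHashing(content []byte) *archHash {
	h := &archHash{done: make(chan struct{})}
	go func() {
		defer close(h.done)
		sum := sha256.Sum256(content)
		h.hash = hex.EncodeToString(sum[:])
	}()
	return h
}

// Wait blocks until the hash is available and then returns it.
func (h *archHash) Wait() string {
	<-h.done
	return h.hash
}

// detectArchitectures is an assumed stand-in for the detection step: it
// starts one background hash per architecture and returns without waiting.
func detectArchitectures(binaries map[string][]byte) map[string]*archHash {
	pending := make(map[string]*archHash, len(binaries))
	for arch, content := range binaries {
		pending[arch] = startHashing(content)
	}
	return pending
}

func main() {
	// Placeholder contents; a real operator would read binaries from disk.
	binaries := map[string][]byte{
		"amd64": []byte("placeholder amd64 binary"),
		"arm64": []byte("placeholder arm64 binary"),
	}

	pending := detectArchitectures(binaries)
	fmt.Println("startup continues while hashes are computed")

	// Only code that actually needs a hash waits for it.
	archs := make([]string, 0, len(pending))
	for arch := range pending {
		archs = append(archs, arch)
	}
	sort.Strings(archs)
	for _, arch := range archs {
		fmt.Printf("%s: %s\n", arch, pending[arch].Wait())
	}
}
```

With this shape, the startup path only pays the cost of launching the goroutines. If the restart still happens with such a structure in place, the slowdown is more likely to come from something else on the startup path, or from the probe timing itself.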

Cluster resource

No response

Relevant log output

No response

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

sxd • May 02 '24 12:05