cloud-sql-proxy-operator
Randomly failing to update existing deployments, failing to create the csql container
## Expected Behavior
The operator should be able to perform rolling updates without breaking, and it should be self-sufficient in (re)creating the csql container whenever the selector matches or the container fails to be created.
## Actual Behavior
Updates fail and the operator never recovers by itself.
## Steps to Reproduce the Problem
- Create a Deployment with a `needs-proxy: "1"` label
- Create an `AuthProxyWorkload` manifest to match the `kind: Deployment` and `selector.matchLabels."needs-proxy" = "1"`
- The first `kubectl apply` usually works, but updating deployments fails half of the time. The error below precedes this behavior, and the operator never recovers by itself; the entire pod has to be deleted:
```json
{
  "textPayload": "2024/09/25 20:04:39 http: TLS handshake error from 192.168.1.3:43058: EOF",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "namespace_name": "cloud-sql-proxy-operator-system",
      "container_name": "manager",
      "pod_name": "cloud-sql-proxy-operator-controller-manager-..."
    }
  },
  "timestamp": "2024-09-25T20:04:39.822373239Z",
  "severity": "ERROR",
  "labels": {
    "k8s-pod/pod-template-hash": "6946569c9b",
    "k8s-pod/control-plane": "controller-manager"
  },
  "logName": "projects/.../logs/stderr",
  "receiveTimestamp": "2024-09-25T20:04:42.868056671Z"
}
```
The operator then proceeds to create the actual Deployment container, but since there is no SQL proxy listening on localhost, the newly created pod ends up in an infinite crash loop because it requires the DB connection.
## Specifications
- Version: 1.5.1
- Platform: GKE
Side note: it's very hard to read this operator's logs on GCP. Everything is written to stderr with ERROR severity, and the unstructured payloads are very confusing.