Error while scaling a deployment
I guess this has been seen before, but I couldn't find any related issue. I'm not sure whether this is actually a bug or just a "wrong" log level for this kind of message.
How to reproduce
- Create a deployment with number of replicas > 1
- Wait until the PDB gets created through the pdb-controller
- Scale down the deployment to replicas=1
- The pdb-controller logs an error as shown below
- The PDB for the deployment gets deleted as expected
In that case the pdb-controller logs the following entries, ending with an error (on GKE 1.27.11 and Docker Desktop Kubernetes 1.28.2):
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:01:13Z" level=info action=added namespace=web pdb=nginx-deployment-pdb-controller selector="&LabelSelector{MatchLabels:map[string]string{app: nginx-test,},MatchExpressions:[]LabelSelectorRequirement{},}"
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:02:13Z" level=info action=removed namespace=web pdb=nginx-deployment-pdb-controller selector="&LabelSelector{MatchLabels:map[string]string{app: nginx-test,},MatchExpressions:[]LabelSelectorRequirement{},}"
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:02:13Z" level=error msg="Failed to update PDB: Operation cannot be fulfilled
Sample deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-test
  template:
    metadata:
      labels:
        app: nginx-test
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80
To report a new bug you should open an issue that summarizes the bug and set the label to "bug".
It seems I can't add labels, sorry.
@abuechler Is maybe part of the last log line missing? I assume the error is that it can't modify the PDB because it was modified by another process just before.
No, there are no more log lines afterwards. The logs look exactly the same on GKE and on Docker Desktop (with no other components running).
Ok, I also see this sort of error in our setup. It looks like this:
Failed to update PDB: Operation cannot be fulfilled on poddisruptionbudgets.policy \"foo\": StorageError: invalid object, Code: 4, Key: /registry-foo1/poddisruptionbudgets/default/foo, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 9263b1a8-a10e-4e6f-93ad-fbd42ddbcf18, UID in object meta:
This could be because it tries to update the PDB after it was already deleted. That seems like a small bug, but everything should still work as you describe.
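For illustration only, here is a rough client-go sketch of that idea (this is not the actual pdb-controller code; updatePDB and all wiring here are hypothetical): treat the expected races, where the PDB was modified or already deleted between read and write, as benign instead of logging them at error level.

package main

import (
	"context"
	"log"

	policyv1 "k8s.io/api/policy/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// updatePDB tries to update an existing PDB and downgrades the two expected
// races (the PDB was changed or removed concurrently) to informational logs.
func updatePDB(ctx context.Context, client kubernetes.Interface, pdb *policyv1.PodDisruptionBudget) error {
	_, err := client.PolicyV1().PodDisruptionBudgets(pdb.Namespace).Update(ctx, pdb, metav1.UpdateOptions{})
	switch {
	case err == nil:
		return nil
	case apierrors.IsNotFound(err) || apierrors.IsConflict(err):
		// The PDB was removed (e.g. after scaling down to replicas=1) or
		// modified between read and write; the next reconciliation will
		// converge, so this is not worth an error-level log.
		log.Printf("skipping PDB update for %s/%s: %v", pdb.Namespace, pdb.Name, err)
		return nil
	default:
		return err
	}
}

func main() {
	// Building a real clientset is omitted; this sketch only illustrates the
	// error-classification idea discussed above.
}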
@mikkeloscar Thank you for maintaining this project and reacting so quickly! 🙏
Is there any update on this? We're seeing the same error in our production logs. Is this something that can be safely ignored, or does it need to be addressed?
@michohl I merged your PR, thanks for that one! Please report back whether the log entries are reduced or gone. As far as I understand from @mikkeloscar's response, this is only a logging problem and nothing serious.
If we can close the issue, that would of course be great 😊
@szuecs I can confirm that the change I submitted resolves the spurious errors (agreed, they weren't actually harmful) in all of our own clusters. I don't run the upstream code exactly, though; we run a very slightly tweaked fork.
So I would defer the final confirmation to @abuechler, just in case some quirk in our fork makes the change work better. I can't imagine a scenario where that would be true, but I would hate to tell you it's good without testing the exact code in question.