
Error while scaling a deployment

Open abuechler opened this issue 1 year ago • 8 comments

I guess this has been seen before, but I couldn't find any related issue. I'm not sure whether this is actually a bug or just the wrong log level for this kind of message.

How to reproduce

  1. Create a deployment with number of replicas > 1
  2. Wait until the PDB gets created through the pdb-controller
  3. Scale down the deployment to replicas=1
  4. The pdb-controller logs an error as shown below
  5. The PDB for the deployment gets deleted as expected

In that case the pdb-controller logs the following error (on GKE 1.27.11 and Docker Desktop Kubernetes 1.28.2):

pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:01:13Z" level=info action=added namespace=web pdb=nginx-deployment-pdb-controller selector="&LabelSelector{MatchLabels:map[string]string{app: nginx-test,},MatchExpressions:[]LabelSelectorRequirement{},}"
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:02:13Z" level=info action=removed namespace=web pdb=nginx-deployment-pdb-controller selector="&LabelSelector{MatchLabels:map[string]string{app: nginx-test,},MatchExpressions:[]LabelSelectorRequirement{},}"
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:02:13Z" level=error msg="Failed to update PDB: Operation cannot be fulfilled 

Sample deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-test
  template:
    metadata:
      labels:
        app: nginx-test
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80

abuechler avatar Apr 19 '24 13:04 abuechler

> To report a new bug you should open an issue that summarizes the bug and set the label to "bug".

It seems I can't add labels, sorry.

abuechler avatar Apr 19 '24 13:04 abuechler

@abuechler Is maybe part of the last log line missing? I assume the error is that it can't modify the PDB because it was modified by another resource just before.

mikkeloscar avatar Apr 19 '24 14:04 mikkeloscar

> @abuechler Is maybe part of the last log line missing? I assume the error is that it can't modify the PDB because it was modified by another resource just before.

No, there are no further log lines afterwards. The logs look exactly the same on GKE and on Docker Desktop (with no other components running).

abuechler avatar Apr 19 '24 14:04 abuechler

Ok, I also see this sort of error in our setup. It looks like this:

Failed to update PDB: Operation cannot be fulfilled on poddisruptionbudgets.policy \"foo\": StorageError: invalid object, Code: 4, Key: /registry-foo1/poddisruptionbudgets/default/foo, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 9263b1a8-a10e-4e6f-93ad-fbd42ddbcf18, UID in object meta: 

This is probably the controller trying to update the PDB after it was already deleted. That's a small bug, but as you describe, everything still works as expected.

mikkeloscar avatar Apr 19 '24 15:04 mikkeloscar

@mikkeloscar Thank you for maintaining this project and reacting so quickly! 🙏

abuechler avatar Apr 19 '24 15:04 abuechler

Is there any update on this? We're seeing the same error in our production logs. Is this something that can be safely ignored, or does it need to be addressed?

michohl avatar Aug 05 '24 19:08 michohl

@michohl I merged your PR, thanks for that one! Please report back whether the log messages are reduced or gone. As far as I understand from @mikkeloscar's response, this is only a logging problem and nothing serious.

If we can close the issue, that would of course be great 😊

szuecs avatar Aug 15 '24 10:08 szuecs

@szuecs I can confirm that the change I submitted resolves the spurious error messages (agreed, they weren't actually harmful) in all of our own clusters. I don't run the upstream code exactly as-is, though; we run a very slightly tweaked fork.

So I would defer the final confirmation to @abuechler, just in case some quirk in our fork makes the change work better there. I can't imagine a scenario where that would be true, but I'd hate to tell you it's good without the exact upstream code being tested.

michohl avatar Aug 18 '24 13:08 michohl