gatekeeper
Safe to evict emptyDir local storage to unblock cluster downscaling.
Describe the solution you'd like: Pods that use local storage should carry the annotation cluster-autoscaler.kubernetes.io/safe-to-evict: "true", because emptyDir volumes block cluster downscaling.
Anything else you would like to add: Pods with local storage volumes, for example:

volumes:
  - emptyDir: {}
    name: tmp-volume
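For illustration, a minimal pod sketch (the pod name, container name, and image are hypothetical) combining the requested annotation with an emptyDir volume like the one above:

apiVersion: v1
kind: Pod
metadata:
  name: emptydir-example        # hypothetical name, for illustration only
  annotations:
    # tells the cluster autoscaler this pod may be evicted despite its local storage
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
  containers:
    - name: app                           # hypothetical container
      image: registry.k8s.io/pause:3.9    # placeholder image
      volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
  volumes:
    - emptyDir: {}
      name: tmp-volume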
Environment: PROD
- Gatekeeper version: 3.8
- Kubernetes version: (use kubectl version): Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.14-gke.700", GitCommit:"1781919224b267c523fd76047cebf7b14c6aa1d9", GitTreeState:"clean", BuildDate:"2022-06-28T09:30:29Z", GoVersion:"go1.16.15b7", Compiler:"gc", Platform:"linux/amd64"}
hey @ZiaUrRehman-GBI thanks for opening this issue. I'm going to spend some time looking into this and follow up here when I know more.
hey @ZiaUrRehman-GBI I had a look but I couldn't repro :/ . I see kubectl scale work as expected on the g8r pods as defined in the latest config under deploy/.
$ kubectl scale deployment/gatekeeper-controller-manager --replicas 10 -n gatekeeper-system
...
$ kubectl scale deployment/gatekeeper-controller-manager --replicas 1 -n gatekeeper-system
...
$ kubectl scale deployment/gatekeeper-audit --replicas 10 -n gatekeeper-system
...
$ kubectl scale deployment/gatekeeper-audit --replicas 1 -n gatekeeper-system
...
Let me ask you a couple of questions.
Apologies in advance if you already communicated more details in another channel. Please bear with me.
Tell us more about your environment
Pods
- How many pods are you running? How many as audit vs controller-manager?
- How many pods are you trying to scale down/up? Are you using the plain k8s autoscaler or relying on a cloud provider to use it?
- How many pods do you see not scaled?
- For those pods that don't get scaled, is there anything interesting in the logs?
Volumes
- At a high level, what's on those volumes around the time of the scaling?
Hey @acpana, maybe I didn't convey it properly or you didn't get me. I don't mean scaling the OPA Gatekeeper deployments. I was talking about GKE scale-down being blocked because the OPA pods use local storage. On the GKE side they fixed this issue after the 1.22.x release, so feel free to close it. But on other providers like AWS and AKS this problem still exists, so users have to either add this annotation or set skip-nodes-with-local-storage.
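For reference, a sketch of the autoscaler-side alternative mentioned above, assuming the upstream cluster-autoscaler flag of that name (the surrounding deployment layout varies by provider):

# fragment of a cluster-autoscaler container spec (sketch)
command:
  - ./cluster-autoscaler
  # allow scale-down of nodes whose pods only use emptyDir/hostPath local storage
  - --skip-nodes-with-local-storage=false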
+1 @ZiaUrRehman-GBI Thanks for reporting the issue. Would you like to open a PR to add the annotation for the audit pod?
Sure, I will open one. 👍
@ZiaUrRehman-GBI we already have a podAnnotations value in the chart, would that work? If so, sounds like we might want to document this in https://open-policy-agent.github.io/gatekeeper/website/docs/vendor-specific?
Thanks @sozercan! You can search for podAnnotations in the chart readme: https://github.com/open-policy-agent/gatekeeper/tree/master/charts/gatekeeper
Thanks, a doc will help a lot.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.